What do you get with Print?

Instant access to your digital eBook copy whilst your Print order is Shipped

Paperback book shipped to your preferred address

Download this book in EPUB and PDF formats

Access this title in our online reader with advanced features

DRM FREE - Read whenever, wherever and however you want

AI Assistant (beta) to help accelerate your learning

Building Big Data Pipelines with Apache Beam

Chapter 1: Introduction to Data Processing with Apache Beam

Data. Big data. Real-time data. Data streams. Many buzzwords to describe many things, and yet they have many common properties. Mind-blowing applications can be developed from the successful application of (theoretically) simple logic – take data and produce knowledge. However, a simple-sounding task can turn out to be difficult when the amount of data needed to produce knowledge is huge (and still growing). Given the vast volumes of data produced by humanity every day, which tools should we choose to turn our simple logic into scalable solutions? That is, solutions that protect our investment in creating the data extraction logic, even in the presence of new requirements arising or changing on a daily basis, and new data processing technologies being created? This book focuses on why Apache Beam might be a good solution to these challenges, and it will guide you through the Beam learning process.

In this chapter, we will cover the following topics:

Why Apache Beam?
Writing your first pipeline
Running a pipeline against streaming data
Exploring the key properties of Unbounded data
Measuring the event time progress inside data streams
Assigning data to windows
Unifying batch and streaming data processing

Why Apache Beam?

There are two basic questions we might ask when considering a new technology to learn and apply in practice:

What problem am I struggling with that the new technology can help me solve?
What would the costs associated with the technology be?

Every sound technology has a well-defined selling point – that is, something that justifies its existence in the presence of competing technologies. In the case of Beam, this selling point could be reduced to a single word: portability. Beam is portable on several layers:

Beam's pipelines are portable between multiple runners (that is, a technology that executes the distributed computation described by a pipeline's author).
Beam's data processing model is portable between various programming languages.
Beam's data processing logic is portable between bounded and unbounded data.

Each of these points deserves a few words of explanation. By runner portability, we mean the possibility to run existing pipelines written in one of the supported programming languages (for instance, Java, Python, Go, Scala, or even SQL) against a data processing engine that can be chosen at runtime. A typical example of a runner would be Apache Flink, Apache Spark, or Google Cloud Dataflow. However, Beam is by no means limited to these; new runners are created as new technologies arise, and it's very likely that many more will be developed.

When we say Beam's data processing model is portable between various programming languages, we mean it has the ability to provide support for multiple SDKs, regardless of the language or technology used by the runner. This way, we can code Beam pipelines in the Go language, and then run these against the Apache Flink Runner, written in Java.

Last but not least, the core of Apache Beam's model is designed so that it is portable between bounded and unbounded data. Bounded data is what was historically called batch processing, while unbounded data refers to real-time processing (that is, an application crunching live data as it arrives in the system and producing a low-latency output).

Putting these pieces together, we can describe Beam as a tool that lets you deal with your big data architecture with the following vision:

Choose your preferred language, write your data processing pipeline, run this pipeline using a runner of your choice, and do all of this for both batch and real-time data at the same time.

Because everything comes at a price, you should expect to pay for flexibility like this – this price would be a somewhat bigger overhead in terms of CPU and/or memory usage. The Beam community works hard to make this overhead as small as possible, but the chances are that it will never be zero.

If all of this sounds compelling to you, then we are ready to start a journey exploring Apache Beam!

Writing your first pipeline

Let's jump right into writing our first pipeline. The first part of this book will focus on Beam's Java SDK. We assume that you are familiar with programming in Java and building a project using Apache Maven (or any similar tool). The following code can be found in the com.packtpub.beam.chapter1.FirstPipeline class in the chapter1 module in the GitHub repository. We would like you to go through all of the code, but we will highlight the most important parts here:

We need some (demo) input for our pipeline. We will read this input from the resource called lorem.txt. The code is standard Java, as follows:

ClassLoader loader = FirstPipeline.class.getClassLoader();
String file = loader.getResource("lorem.txt").getFile();
List<String> lines = Files.readAllLines(
    Paths.get(file), StandardCharsets.UTF_8);

Next, we need to create a Pipeline object, which is a container for a Directed Acyclic Graph (DAG) that represents the data transformations needed to produce output from input data:
```
Pipeline pipeline = Pipeline.create();
```
Important note
There are multiple ways to create a pipeline, and this is the simplest. We will see different approaches to pipelines in Chapter 2, Implementing, Testing, and Deploying Basic Pipelines.
After we create a pipeline, we can start filling it with data. In Beam, data is represented by a PCollection object. Each PCollection object (that is, parallel collection) can be imagined as a line (an edge) connecting two vertices (PTransforms, or parallel transforms) in the pipeline's DAG.
Therefore, the following code creates the first node in the pipeline. The node is a transform that takes raw input from the list and creates a new PCollection:
```
PCollection<String> input = pipeline.apply(Create.of(lines));
```
Our DAG will then look like the following diagram:
Figure 1.1 – A pipeline containing a single PTransform
Each PTransform can have one main output and possibly multiple side output PCollections. Each PCollection has to be consumed by another PTransform or it might be excluded from the execution. As we can see, our main output (PCollection of PTransform, called Create) is not presently consumed by any PTransform. We connect PTransform to a PCollection by applying this PTransform on the PCollection. We do that by using the following code:
```
PCollection<String> words = input.apply(Tokenize.of());
```
This creates a new PTransform (Tokenize) and connects it to our input PCollection, as shown in the following figure:
Figure 1.2 – A pipeline with two PTransforms
We'll skip the details of how the Tokenize PTransform is implemented for now (we will return to that in Chapter 5, Using SQL for Pipeline Implementation, which describes how to structure code in general). Currently, all we have to remember is that the Tokenize PTransform takes input lines of text and splits each line into words, which produces a new PCollection that contains all of the words from all the lines of the input PCollection.
We finish the pipeline by adding two more PTransforms. One will produce the well-known word count example, so popular in every big data textbook. And the last one will simply print the output PCollection to standard output:
```
PCollection<KV<String, Long>> result =
    words.apply(Count.perElement());
result.apply(PrintElements.of());
```
Details of both the Count PTransform (which is Beam's built-in PTransform) and PrintElements (which is a user-defined PTransform) will be discussed later. For now, if we focus on the pipeline construction process, we can see that our pipeline looks as follows:
Figure 1.3 – The final word count pipeline
After we define this pipeline, we should run it. This is done with the following line:
```
pipeline.run().waitUntilFinish();
```
This causes the pipeline to be passed to a runner (configured in the pipeline; if omitted, it defaults to a runner available on Classpath). The standard default runner is the DirectRunner, which executes the pipeline in the local Java Virtual Machine (JVM) only. This runner is mostly only suitable for testing, as we will see in the next chapter.
We can run this pipeline by executing the following command in the code examples for the chapter1 module, which will yield the expected output on standard output:
```
chapter1$ ../mvnw exec:java \
    -Dexec.mainClass=com.packtpub.beam.chapter1.FirstPipeline
```
Important note
The ordering of output is not defined and is likely to vary over multiple runs. This is to be expected and is due to the fact that the pipeline underneath is executed in multiple threads.

A very useful feature is that the application of PTransform to PCollection can be chained, so the preceding code can be simplified to the following:

ClassLoader loader = ...
FirstPipeline.class.getClassLoader();
String file =
   loader.getResource("lorem.txt").getFile();
List<String> lines = Files.readAllLines(
    Paths.get(file), StandardCharsets.UTF_8);
Pipeline pipeline = Pipeline.create();
pipeline.apply(Create.of(lines))
    .apply(Tokenize.of())
    .apply(Count.perElement())
    .apply(PrintElements.of());
pipeline.run().waitUntilFinish();

When used with care, this style greatly improves the readability of the code.

Now that we have written our first pipeline, let's see how to port it from a bounded data source to a streaming source!

Running our pipeline against streaming data

Let's discuss how we can change this code to enable it to run against a streaming data source. We first have to define what we mean by a data stream. A data stream is a continuous flow of data without any prior information about the cardinality of the dataset. The dataset can be either finite or infinite, but we do not know which in advance. Because of this property, the streaming data is often called unbounded data, because, as opposed to bounded data, no prior bounds regarding the cardinality of the dataset can be made.

The absence of bounds is one property that makes the processing of data streams trickier (the other is that bounded data sets can be viewed as static, while unbounded data is, by definition, changing over time). We'll investigate these properties later in this chapter, and we'll see how we can leverage them to define a Beam unified model for data processing.

For now, let's imagine our pipeline is given a source, which gives one line of text at a time but does not give any signal of how many more elements there are going to be. How do we need to change our data processing logic to extract information from such a source?

We'll update our pipeline to use a streaming source. To do this, we need to change the way we created our input PCollection of lines coming from a List via Create PTransform to a streaming input. Beam has a utility for this called TestStream, which works as follows.
Create a TestStream (a utility that emulates an unbounded data source). The TestStream needs a Coder (details of which will be skipped for now and will be discussed in Chapter 2, Implementing, Testing, and Deploying Basic Pipelines):
```
TestStream.Builder<String> streamBuilder =
    TestStream.create(StringUtf8Coder.of());
```

Next, we fill the TestStream with data. Note that we need a timestamp for each record so that the TestStream can emulate a real stream, which should have timestamps assigned for every input element:

Instant now = Instant.now();
// add all lines with timestamps to the TestStream
List<TimestampedValue<String>> timestamped =
    IntStream.range(0, lines.size())
        .mapToObj(i -> TimestampedValue.of(
           lines.get(i), now.plus(i)))
        .collect(Collectors.toList());
for (TimestampedValue<String> value : timestamped) {
  streamBuilder = streamBuilder.addElements(value);
}

Then, we will apply this to the pipeline:
```
// create the unbounded PCollection from TestStream
PCollection<String> input =
    pipeline.apply(streamBuilder.advanceWatermarkToInfinity());
```
We encourage you to investigate the complete source code of the com.packtpub.beam.chapter1.MissingWindowPipeline class to make sure everything is properly understood in the preceding example.
Next, we run the class with the following command:
```
chapter1$ ../mvnw exec:java \
    -Dexec.mainClass=\
        com.packtpub.beam.chapter1.MissingWindowPipeline
```
This will result in the following exception:
```
java.lang.IllegalStateException: GroupByKey cannot be applied to non-bounded PCollection in the GlobalWindow without a trigger. Use a Window.into or Window.triggering transform prior to GroupByKey.
```
This is because we need a way to identify the (at least partial) completeness of the data. That is to say, the data needs (explicit or implicit) markers that define a condition that (when met) triggers a completion of a computation and outputs data from a PTransform. The computation can then continue from the values already computed or be reset to the initial state.
There are multiple ways to define such a condition. One of them is to define time-constrained intervals called windows. A time-constrained window might be defined as data arriving within a specific time interval – for example, between 1 P.M. and 2 P.M.
As the exception suggests, we need to define a window to be applied to the input data stream in order to complete the definition of the pipeline. The definition of a Window is somewhat complex, and we will dive into all its parameters later in this book. But for now, we'll define the following Window:
```
PCollection<String> windowed =
    words.apply(
        Window.<String>into(new GlobalWindows())
            .discardingFiredPanes()
            .triggering(AfterWatermark.pastEndOfWindow()));
```
This code applies Window.into PTransform by using GlobalWindows, which is a specific Window that contains whole data (which means that it can be viewed as a Window containing the whole history and future of the universe).
The complete code can be viewed in the com.packtpub.beam.chapter1.FirstStreamingPipeline class.
As usual, we can run this code using the following command:
```
chapter1$ ../mvnw exec:java \
    -Dexec.mainClass=\
        com.packtpub.beam.chapter1.FirstStreamingPipeline
```
This results in the same outcome as in the first example and with the same caveat – the order of output is not defined and will vary over multiple runs of the same code against the same data. The values will be absolutely deterministic, though.

Once we have successfully run our first streaming pipeline, let's dive into what exactly this streaming data is, and what to expect when we try to process it!

Exploring the key properties of unbounded data

In the previous section, we successfully ran our sample pipeline against simulated unbounded data. We have seen that only a slight modification had to be made for the pipeline to produce output in the streaming case. Let's now dive a little deeper into understanding why this modification was necessary and how to code our pipelines to be portable from the beginning.

First of all, we need to define a notion of time. In our everyday life, time is a common thing we don't think that much about. We know what time it is at the moment, and we react to events that happen (more or less) instantly. We can plan for the future, but we cannot change the past.

When it comes to data processing, things change significantly. Let's imagine a smart home application that reads data from various sensors and acts based on the values it receives. Such an application is depicted in the following diagram:

Figure 1.4 – A simple sensor data processing application

The application reads a stream of incoming sensor data, reads the state associated with each device and/or the other settings related to the data being processed, (possibly) updates the state, and (possibly) outputs some resulting events or commands (for example, turn on a light if some condition is met).

Now, let's imagine we want to make modifications to the application logic. We add some new smart features, and we would like to know how the logic would behave if it had been fed with some historical events that we stored for a purpose like this. We cannot simply exchange the logic and push historical data through it because that would result in incorrect modifications of the state – the state might have been changed from the time we recorded our historical data. We see that we cannot mix two times – the time at which we process our data and the time at which the data originated. We usually call these two times the processing time and the event time. The first one is the time we see on our clock when an event arrives, while the other is the time at which the event occurred. A beautiful demonstration of these two times is depicted in the following table:

Figure 1.5 – Star Wars episodes' processing and event times

For those who are not familiar with the Star Wars saga, the processing time here represents the order in which the movies were released, while the event time represents the order of the episodes in the chronology of the story. By defining the event time and the processing time, we are able to explain another weird aspect of the streaming world – each data stream is inevitably unordered in terms of its event time. What do we mean by this? And why should this be inevitable?

The out-of-orderness of a data stream is shown in the following diagram:

Figure 1.6 – Unordered data stream

The circles represent data points (events), the x-axis is processing time, and the y-axis is event time. The upper-left half of the square should be empty under normal circumstances because that area represents events coming from the future – events with a higher event time than the current processing time. The rest of the data points represent events that arrive with a lower or higher delay from the time they occurred (event time). The vast majority of the delay is caused by technical reasons, such as queueing in the network stack, buffering, machine clocks being out of sync, or even outages of some parts of a distributed system. But there are also physical reasons why this happens – a vastly delayed data point is what you see if you look at the sky at night. The light coming from the stars we see with our naked eye is delayed by as much as a thousand years. Because even our physical reality works like this, the out-of-orderness is to be expected and has to be considered 'normal' in any stream processing.

So, we defined what the event and processing times are. We have a clock for measuring processing time. But what about event time? How do we measure that? Let's find out!

Measuring event time progress inside data streams

As we have shown, data streams are naturally unordered in terms of event time. Nevertheless, we need a way of measuring the completeness of our computation. Here is where another essential principle appears – watermarks.

A watermark is a (heuristic) algorithm that gives us an estimation of how far we have got in the event time domain. A perfect watermark gives the highest possible time (T) that guarantees no further data arrives with an event time < T. Let's demonstrate this with the following diagram:

Figure 1.7 – Watermark and late data

We can see from Figure 1.7 that the watermark is a boundary that moves along with the data points, ideally leaving all of the data on its left side. All data points lying on the right side (with a processing time on the x-axis) are called late data. Such data typically requires special handling that will be described in Chapter 2, Implementing, Testing, and Deploying Basic Pipelines.

There are many ways to implement watermarks. They are typically generated at the source and propagated through the pipeline. We will discuss the details of the implementation of some watermarks in later chapters dedicated to I/O connectors. Typically, users do not have to generate watermarks themselves, although it is very useful to have a very good understanding of the concept.

States and triggers

Each computation on a data stream that takes into account more than a single isolated event needs a state. The state holds (accumulates) values derived from the so-far processed stream elements. Let's imagine we want to calculate the current number of elements in a stream. We would do that as follows:

Figure 1.8 – Counting elements in a stream

The computational logic is straightforward – take the incoming element, read the current number of elements from the state, increment that number by one, store the new count into the state, and emit the current value to the output stream.

As simple as this sounds, the overall picture becomes quite complex when we consider that what we are building is an application that is supposed to run for a very long time (theoretically, forever). Such a long-running application will necessarily face disruptions caused by failing hardware, software, or necessary upgrades of the application itself. Therefore, each (sane) stream processing application has to be fault-tolerant by design.

Ensuring fault-tolerance puts specific requirements on the state and on the stream itself. Specifically, we must ensure the following:

We must keep the state in secure, fault-tolerant storage.
We must retain the ability to restore both the state and the stream to a defined position.

Both of these requirements dictate that every fault-tolerant stream processing engine must provide state management and state access APIs, and it must incorporate the state into its core concepts. The same holds true for Beam, and we'll dive deeper into the state concept in the following chapters.

Our element-count example raises another question: when should we output the resulting count? In the preceding example, we output the current count for each input element. This might not be adequate for every application. Other options would be to output the current value in the following ways:

In fixed periods of processing time (for instance, every 5 seconds)
When the watermark reaches a certain time (for instance, the output count when our watermark signals that we have processed all data up to 5 P.M.)
When a specific condition is met in the data

Such emitting conditions are called triggers. Each of these possibilities represents one option: a processing time trigger, an event time trigger, and a data-driven trigger. Beam provides full support for processing and event time triggers and supports one data-driven trigger, which is a trigger that outputs after a specific number of elements (for example, after every 10 elements).

If you remember, we have already seen a declaration of a trigger:

PCollection<String> windowed =
    words.apply(
        Window.<String>into(new GlobalWindows())
            .discardingFiredPanes()
            .triggering(
                AfterWatermark.pastEndOfWindow()));

This is one of many event time triggers, which specifies that we want to output a result when our watermark reaches a time, and it is defined as an end-of-window. We'll dive deeper into this in later chapters when we discuss the concept of windows.

Timers

Both event time and processing time triggers require an additional stream processing concept. This concept is a timer. A timer is a tool that lets an application specify a moment in either the processing time or event time domain, and when that moment is reached, an application-defined callback hook is called. For the same reason as with states, timers also need to be fault-tolerant (that is, they have to be kept in fault-tolerant storage). Beam is purposely designed so that there is actually no way to access a watermark directly, and the only way of observing a watermark is by using event time timers. We will investigate timers in more detail in Chapter 3, Implementing Pipelines Using Stateful Processing.

We now know that a streaming data processing engine needs to manage the application's state for us, but what is the life cycle of such a state? Let's find out!

Key benefits

Understand how to improve usability and productivity when implementing Beam pipelines

Learn how to use stateful processing to implement complex use cases using Apache Beam

Implement, test, and run Apache Beam pipelines with the help of expert tips and techniques

Description

Apache Beam is an open source unified programming model for implementing and executing data processing pipelines, including Extract, Transform, and Load (ETL), batch, and stream processing. This book will help you to confidently build data processing pipelines with Apache Beam. You’ll start with an overview of Apache Beam and understand how to use it to implement basic pipelines. You’ll also learn how to test and run the pipelines efficiently. As you progress, you’ll explore how to structure your code for reusability and also use various Domain Specific Languages (DSLs). Later chapters will show you how to use schemas and query your data using (streaming) SQL. Finally, you’ll understand advanced Apache Beam concepts, such as implementing your own I/O connectors. By the end of this book, you’ll have gained a deep understanding of the Apache Beam model and be able to apply it to solve problems.

What you will learn

Understand the core concepts and architecture of Apache Beam

Implement stateless and stateful data processing pipelines

Use state and timers for processing real-time event processing

Structure your code for reusability

Use streaming SQL to process real-time data for increasing productivity and data accessibility

Run a pipeline using a portable runner and implement data processing using the Apache Beam Python SDK

Implement Apache Beam I/O connectors using the Splittable DoFn API

What do you get with Print?

Instant access to your digital eBook copy whilst your Print order is Shipped

Paperback book shipped to your preferred address

Download this book in EPUB and PDF formats

Access this title in our online reader with advanced features

DRM FREE - Read whenever, wherever and however you want

AI Assistant (beta) to help accelerate your learning

Frequently bought together

The Machine Learning Solutions Architect Handbook

$89.99

$54.99

Building Big Data Pipelines with Apache Beam

$48.99

Total $ 193.97

Filter reviews by

All

Amazon verified reviews

Paul Henry Sep 08, 2024

As most data engineers, I code on python. So I just assumed the book would have examples in python. I was disappointed to see the book only in java, so it almost completely useless to me. Reading several chapters, I noticed the material was also over technical and not well explained. For example, do I really need to know that a window contains state? It's pretty obvious by examples that would be the case.

Amazon Verified review

Rachel Jun 15, 2022

The book explains Apache Beam distributed data processing system for batch and streaming processing, from introducing learning examples to diving deep into advanced topics.I liked that book provides Github repo with all you need to focus on learning Apache Beam.The first two chapters introduce foundational concepts for pipelines, data transforms, batch, streaming, how to think of event and processing times, and windowed processing. Next, book covers stateful processing that can be used for processing data with external RPC services, using side inputs to enrich data. While the initial introduction is using Java, then the book expands to using Apache Beam with SQL, Python, and a concept of cross-language pipelines.Advanced topics include exploring building your custom I/O using splittable DoFn, understanding how Beam runners execute pipelines at scale, observability etc. - this all helps to deeper understand Apache Beam and apply that to help optimize data pipelines.

David May 22, 2022

In general it's a useful book for learning Apache Beam. It contains a lot of information and some good walkthrough use cases that will show you how to solve some problems.But it definitely has some serious flaws.(1) If you don't already know the concepts of streaming well the book will confuse you because the explanations of these concepts are so short and bad that you will get lost.(2) Also, some of the code that is shown in the book is completely new to you and the way the author explains them don't help you to understand what the theory he was talking about has to do with the code.I had to go a step back and read a book on streaming concepts first to really understand what is going on in this book.The two star ratings comes from the fact that I think a book should be clear to understand and the writing should be so concise that you understand the content if you have the required knowledge that the book states. However, this was not the case here.

Nelson J. Perez Mar 11, 2022

The author it's definitely an expert on Apache Beam and other technologies that he uses in the exercises.But the book it's hard to swallow just like kale salad with no dressing.The problem is that he gives good examples but they are half baked in the book so you need to look at the source code in github. It's definitely a hands on book. But seems like he skim the basics like explaning the actual parameters and objects in Apache Beam.He introduces examples with unexplained code early on but does not explain quite well the code he just says go to the github repository to see the full code.Some of the github code doesn't work out of the box so you'll need to massage it a little to make work I'm still could not make chapter 6 code work so I had to disable it from the build and Docker file.Overall you need to read things twice or go back to understand previous examples. But you'll definitely need to look understand and play with the source code.I think he could have done a much better job explaining the basics which are missing in the book. A big plus is that it has lots of practical examples. Just not detailed enough explanations.Again he is an expert for sure but he skimmed through the basics sort of forgetting that the reader needs those to understand the rest of the book.

Antonio Cachuan Feb 22, 2022

A great opportunity to increase your skill in doing Apache Beam pipelines. I think this book is a good pocket reference if you are consolidating your Beam knowledge. Let me share the pros and cons after reading it:Pros-The book starts by giving you an overview of all the requirements for setting up an environment, and all the examples (mainly using Java) are located in a repository. I was new to using Minikube but I was able to run the examples without major difficulties. -Continuous giving details about Beam concepts like PCollections, PTransform, and others. Here I have to admit that the explanation of the concept of Window helped me to finally clarify some personal doubts in this field.-I agree with how was presented the batch to the streaming world "batch is a special case of streaming" and the examples that followed support the author's idea.-All the chapters include examples and Jan explains line-by-line the main parts of his code.-In chapters 8 and 4, the author made a great work explaining all about Runners and PTransform respectively.Cons-I would like to get more advice for deploying a pipeline in production. For example, including tips and more troubleshooting or including a deployment sample in any cloud provider.-Not exactly a con, but you need some experience in Java, and some background developing pipelines.ConclusionsMy final thought is that if you want to go serious in Apache Beam purchasing this book is a no doubt decision.

Building Big Data Pipelines with Apache Beam: Use a single programming model for both batch and stream data processing

What do you get with Print?

Building Big Data Pipelines with Apache Beam

Chapter 1: Introduction to Data Processing with Apache Beam

Technical requirements

Why Apache Beam?

Writing your first pipeline

Running our pipeline against streaming data

Exploring the key properties of unbounded data

Measuring event time progress inside data streams

States and triggers

Timers

Page 1 of 10

Key benefits

Description

Who is this book for?

What you will learn

Product Details

What do you get with Print?

Product Details

Frequently bought together

Table of Contents

Recommendations for you

Customer reviews

Filter reviews by

People who bought this also bought

About the author

FAQs

Building Big Data Pipelines with Apache Beam: Use a single programming model for both batch and stream data processing

What do you get with Print?

Key benefits

Description

Who is this book for?

What you will learn

Product Details

What do you get with Print?

Product Details

Packt Subscriptions

Frequently bought together

Table of Contents

Recommendations for you

Customer reviews

Filter reviews by

People who bought this also bought

About the author

FAQs