To get the most out of this book
To get the most out of this book, you should be familiar with Core Java and Maven. Basic knowledge of Apache Spark is helpful for Chapter 5, Architecting a Batch Processing Pipeline, and basic knowledge of Kafka is helpful for Chapter 6, Architecting a Real-Time Processing Pipeline. Basic knowledge of MongoDB will also help you follow the implementations in Chapters 6, 9, and 10.
To set up your local environment, install the Java Development Kit (JDK), Maven, and IntelliJ IDEA Community Edition. The following guides cover each installation:
- JDK installation guide: https://docs.oracle.com/en/java/javase/11/install/overview-jdk-installation.html#GUID-8677A77F-231A-40F7-98B9-1FD0B48C346A
- Maven installation guide: https://maven.apache.org/install.html
- IntelliJ IDEA installation guide: https://www.jetbrains.com/help/idea/installation-guide.html
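Once the tools listed above are installed, a quick sanity check can confirm which Java runtime your environment actually picks up. The following is a minimal sketch, not part of the book's code: the class name EnvironmentCheck and its helper methods are hypothetical, and it only reads standard JVM system properties.

```java
// Hypothetical helper class (not from the book's repository) that reports
// which Java runtime is being used to run it.
public class EnvironmentCheck {

    // Version string of the running Java runtime, for example "11.0.2".
    static String javaVersion() {
        return System.getProperty("java.version");
    }

    // Installation directory of the running Java runtime.
    static String javaHome() {
        return System.getProperty("java.home");
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + javaVersion());
        System.out.println("Java home:    " + javaHome());
    }
}
```

You can run this class from IntelliJ IDEA or, on Java 11 and later, directly from a terminal with java EnvironmentCheck.java. To confirm the Maven installation, run mvn -version, which prints the Maven version along with the Java runtime that Maven is using.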
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.