What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!

Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!

50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.

Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.

Thousands of reference materials covering every tech concept you need to stay up to date.


Revisiting Java

This chapter is a Java refresher. It covers the Java concepts that are useful when building applications with Apache Spark.

This book assumes that the reader is comfortable with the basics of Java. We will review useful Java concepts, focusing mainly on what is new in Java 8. In particular, we will touch on topics such as:

Generics
Interfaces
Lambda expressions
Streams
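As a rough preview of how these four topics fit together, here is a minimal sketch in plain Java 8 with no Spark dependency. The class name `Java8Refresher`, the `Shape` interface, and the helper methods are hypothetical names chosen for illustration; they do not come from the book.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class Java8Refresher {

    // Interface with a default method and a static method (both allowed since Java 8).
    // It has exactly one abstract method, so a lambda can implement it.
    interface Shape {
        double area();

        default String describe() {
            return "Shape with area " + area();
        }

        static Shape square(double side) {
            // Lambda expression supplying the body of area()
            return () -> side * side;
        }
    }

    // Generic method: works for any input type T and result type R.
    static <T, R> List<R> mapAll(List<T> input, Function<T, R> fn) {
        return input.stream()
                    .map(fn)
                    .collect(Collectors.toList());
    }

    // Stream pipeline: keep the even numbers, square them, and add them up.
    static int sumOfEvenSquares(List<Integer> numbers) {
        return numbers.stream()
                      .filter(n -> n % 2 == 0)
                      .mapToInt(n -> n * n)
                      .sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("spark", "java", "rdd");
        System.out.println(mapAll(words, String::length));
        System.out.println(sumOfEvenSquares(Arrays.asList(1, 2, 3, 4)));
        System.out.println(Shape.square(3.0).describe());
    }
}
```

The same style of passing small functions into a pipeline is what the Spark Java API relies on, which is why these features are worth revisiting first.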

Key benefits

Perform big data processing with Spark—without having to learn Scala!

Use the Spark Java API to implement efficient enterprise-grade applications for data processing and analytics

Go beyond mainstream data processing by adding querying capability, Machine Learning, and graph processing using Spark

Description

Apache Spark is the buzzword in the big data industry right now, especially with the increasing need for real-time streaming and data processing. While Spark is built on Scala, the Spark Java API exposes all the Spark features available in the Scala version for Java developers. This book will show you how to implement various functionalities of the Apache Spark framework in Java, without stepping out of your comfort zone. The book starts with an introduction to the Apache Spark 2.x ecosystem, then explains how to install and configure Spark, and refreshes the Java concepts that will be useful to you when consuming Apache Spark's APIs. You will explore RDDs and their common Action and Transformation Java APIs, set up a production-like clustered environment, and work with Spark SQL. Moving on, you will perform near-real-time processing with Spark Streaming, Machine Learning analytics with Spark MLlib, and graph processing with GraphX, all using various Java packages. By the end of the book, you will have a solid foundation in implementing components in the Spark framework in Java to build fast, real-time applications.

What you will learn

Process data using different file formats such as XML, JSON, CSV, and plain and delimited text, using the Spark core Library.

Perform analytics on data from various data sources, such as Kafka and Flume, using the Spark Streaming Library

Learn SQL schema creation and the analysis of structured data using various SQL functions including Windowing functions in the Spark SQL Library

Explore Spark MLlib APIs while implementing Machine Learning techniques to solve real-world problems

Get to know Spark GraphX so you understand various graph-based analytics that can be performed with Spark



Ray Brown Apr 07, 2020

The index needs a lot of help. I don't know if this is a packt publisher problem. The book has a few typos, but only annoying. Spark is a huge subject and this text -- used as a notebook so you can add your own material, combined with a course on Spark can get you started in the right direction. I've not seen any great texts that cover Spark thoroughly and do not require some research on your own. Spark is a changing product that can provide significant throughput increases with Machine Learning and Extract Transform and Load (ETL) systems. Regardless of which text you purchase you will be doing research on the web to find all your answers.

Amazon Verified review

Amazon Customer Oct 19, 2019

content not upto the mark

mark berman Dec 21, 2017

Lots of grammatical and spelling mistakes. Detracts from quality of this book. Suggest the authors engage a professional proof reader next time.

phani kumar yadavilli Mar 31, 2018

Some of the chapters are staggered and they are completely unreadable. Please check the screenshots for more details.

Apache Spark 2.x for Java Developers: Explore big data at scale using Apache Spark 2.x Java APIs
