Apache Spark for Data Science Cookbook: Solve real-world analytical problems

Chitturi, Nagamallikarjuna Inelu
Paperback Dec 2016 392 pages 1st Edition
Chapter 2. Tricky Statistics with Spark

In this chapter, you will learn the following recipes:

  • Working with Pandas
  • Variable identification
  • Sampling data
  • Summary and descriptive statistics
  • Generating frequency tables
  • Installing Pandas on Linux
  • Installing Pandas from source
  • Using IPython with PySpark
  • Creating Pandas DataFrames over Spark
  • Splitting, slicing, sorting, filtering, and grouping DataFrames over Spark
  • Implementing covariance and correlation using DataFrames over Spark
  • Concatenating and merging operations over DataFrames
  • Complex operations over DataFrames
  • Sparkling Pandas

Introduction


Statistics refers to the mathematics and techniques with which we understand data. It is a vast field that plays a key role in data mining and artificial intelligence and intersects with engineering and other disciplines. Descriptive statistics summarizes data by revealing the distribution of each variable, and statistics is also widely used for prediction.

In this chapter, we'll see how to apply various statistical measures and functions on large datasets using Spark.

Working with Pandas

Pandas is an open source Python library for specialized data analysis. It is a standard reference for professionals who use Python to study and analyze data sets for statistical analysis and decision-making. Pandas arose from the need for a dedicated library that provides tools for data processing, data extraction, and data manipulation.

It is designed...

Variable identification


In this recipe, we will see how to identify predictor (input) and target (output) variables for data at scale in Spark. Then the next step is to identify the category of the variables.

Getting ready

To step through this recipe, you will need Ubuntu 14.04 (Linux flavor) installed on the machine. Also, you need to have Apache Hadoop 2.6 and Apache Spark 1.6.0 installed.

How to do it…

  1. Let's take an example of student data, from which we want to predict whether a student will play cricket or not. Here is what the sample data looks like:

  2. The preceding data resides in HDFS. Load it into Spark as follows:

          import org.apache.spark._
          import org.apache.spark.sql._

          object tricky_Stats {
            def main(args: Array[String]): Unit = {
              val conf = new SparkConf()
                .setMaster("spark://master:7077")
                .setAppName("Variable_Identification")
              val sc = new SparkContext...
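
The idea behind this recipe can also be sketched outside Spark in a few lines of plain Python. The snippet below is only an illustration and not the book's Scala code: the column names, the sample rows, and the helper `identify_variables` are hypothetical, and a column is assumed to be continuous when every one of its values parses as a number.

```python
# Hypothetical student records; field names and values are illustrative only.
rows = [
    {"gender": "M", "height_cm": "152.5", "marks": "71", "plays_cricket": "Y"},
    {"gender": "F", "height_cm": "148.0", "marks": "84", "plays_cricket": "N"},
    {"gender": "M", "height_cm": "160.2", "marks": "65", "plays_cricket": "Y"},
]


def is_number(text):
    """Return True when the text can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def identify_variables(records, target):
    """Split columns into predictors and target, and label each column's category."""
    columns = list(records[0])
    predictors = [c for c in columns if c != target]
    categories = {}
    for col in columns:
        values = [r[col] for r in records]
        # Assumption: all-numeric columns are continuous, everything else is categorical.
        categories[col] = "continuous" if all(is_number(v) for v in values) else "categorical"
    return {"predictors": predictors, "target": target, "categories": categories}


result = identify_variables(rows, target="plays_cricket")
print(result)
```

In a real data set this numeric test is too coarse on its own (a numeric code such as a postal code is still categorical), so the inferred categories should be reviewed by hand.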

Sampling data


In this recipe, we will see how to generate sample data from the entire population.

Getting ready

To step through this recipe, you need Ubuntu 14.04 (Linux flavor) installed on the machine. Also, have Apache Hadoop 2.6 and Apache Spark 1.6.0 installed. Readers are expected to have knowledge of sampling techniques.

How to do it…

Let's take an example of loan prediction data. Here is what the sample data looks like:

Note

Download the data from the following location: https://github.com/ChitturiPadma/datasets/blob/master/Loan_Prediction_Data.csv.

  1. Here is the code for sampling data from a DataFrame:

          import org.apache.spark._
          import org.apache.spark.sql.SQLContext
          import org.apache.spark.sql.types.{StructType, StringType, DoubleType, StructField}

          object Sampling_Demo {
            def main(args: Array[String]): Unit = {
              val conf = new SparkConf()
                .setMaster("spark://master:7077")
        ...
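
To make the two common sampling modes concrete, here is a minimal plain-Python sketch rather than the recipe's Spark code. The helper names `sample_fraction` and `sample_with_replacement` are hypothetical, and the population is a placeholder list of 614 record IDs. The parameters are chosen to resemble the `withReplacement`, `fraction`, and `seed` arguments of Spark's DataFrame `sample` method, but the behavior shown is only an approximation of the idea.

```python
import random


def sample_fraction(records, fraction, seed):
    """Sampling without replacement: keep each record independently with probability `fraction`."""
    rng = random.Random(seed)
    return [r for r in records if rng.random() < fraction]


def sample_with_replacement(records, size, seed):
    """Sampling with replacement: draw `size` records, so the same record may appear more than once."""
    rng = random.Random(seed)
    return [rng.choice(records) for _ in range(size)]


# Placeholder population: record IDs only, standing in for rows of a data set.
population = list(range(614))

subset = sample_fraction(population, fraction=0.1, seed=42)
bootstrap = sample_with_replacement(population, size=50, seed=42)

print(len(subset), len(bootstrap))
```

Because each record is kept independently, the size of `subset` is only approximately 10% of the population. Fixing the seed makes the sample reproducible across runs.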

Summary and descriptive statistics


In this recipe, we will see how to get the summary statistics for data at scale in Spark. Descriptive summary statistics help in understanding the distribution of the data.

Getting ready

To step through this recipe, you need Ubuntu 14.04 (Linux flavor) installed on the machine. Also, have Apache Hadoop 2.6 and Apache Spark 1.6.0 installed.

How to do it…

Let's take an example of loan prediction data. Here is what the sample data looks like:

Note

Download the data from the following location: https://github.com/ChitturiPadma/datasets/blob/master/Loan_Prediction_Data.csv.

  1. The preceding data contains numerical as well as categorical fields. We can get the summary of numerical fields as follows:

          import org.apache.spark._
          import org.apache.spark.sql._

          object Summary_Statistics {
            def main(args: Array[String]): Unit = {
              val conf = new SparkConf()
                .setMaster("spark://master:7077")
    ...
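
As a rough illustration of what a numerical summary contains, the following plain-Python sketch computes the count, mean, sample standard deviation, minimum, and maximum of one column. It is not the recipe's Spark code: the helper `describe_column` and the loan amounts are hypothetical, and the choice of these five figures is an assumption about what a typical summary reports.

```python
import statistics


def describe_column(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stddev": statistics.stdev(values),  # sample standard deviation (n - 1 denominator)
        "min": min(values),
        "max": max(values),
    }


# Hypothetical values for a numerical field such as a loan amount.
loan_amounts = [128.0, 66.0, 120.0, 141.0, 267.0]

summary = describe_column(loan_amounts)
for name, value in summary.items():
    print(name, value)
```

A large gap between the mean and the maximum, as in this made-up column, is a quick hint that the distribution is skewed or contains outliers.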

Generating frequency tables


In this recipe, we will see how to analyze the distribution of the variables in the data. Typically, we would plot a histogram or boxplot of each variable to understand its distribution and identify outliers. However, Spark currently has no built-in support for plotting, so let's see how to perform the analysis by generating frequency tables.

Getting ready

To step through this recipe, you need Ubuntu 14.04 (Linux flavor) installed on the machine. Also, have Apache Hadoop 2.6 and Apache Spark 1.6.0 installed.

How to do it…

Let's take an example of loan prediction data. Here is what the sample data looks like:

Note

Download the data from the following location: https://github.com/ChitturiPadma/datasets/blob/master/Loan_Prediction_Data.csv.

The total record count is 614.

  1. Let us look at the chances of getting a loan based on Credit_History. Here is the code to generate the frequency distribution of a set of variables such as Loan_Status and Credit_History:

          import org...
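
A frequency table over two variables is simply a count of each combination of values. The plain-Python sketch below shows the idea with `collections.Counter`; it is not the recipe's Spark code, and the (Credit_History, Loan_Status) pairs are made-up values used only for illustration.

```python
from collections import Counter

# Hypothetical (Credit_History, Loan_Status) pairs; not taken from the real data set.
records = [
    ("1", "Y"), ("1", "Y"), ("1", "N"),
    ("0", "N"), ("0", "N"), ("0", "Y"),
    ("1", "Y"),
]

# Two-way frequency table: how often each combination occurs.
freq = Counter(records)

# Row totals per Credit_History value.
totals = Counter(credit_history for credit_history, _ in records)

# Share of approved loans (Loan_Status == "Y") within each Credit_History group.
approval_rate = {ch: freq[(ch, "Y")] / totals[ch] for ch in totals}

for (credit_history, loan_status), count in sorted(freq.items()):
    print(credit_history, loan_status, count)
print(approval_rate)
```

Dividing each cell by its row total turns the raw counts into the chance of approval for each credit history value, which is the comparison this recipe is after.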

Installing Pandas on Linux


In this recipe, we will see how to install Pandas on Linux. Before proceeding with the installation, let's consider which version of Python to use. Python comes in two flavors, Python 2.7.x and Python 3.x. Although Python 3.x is the latest version and appears to be the better choice, Python 2.7 is recommended for scientific, numeric, and data analysis work.

Getting ready

To step through this recipe, you need Ubuntu 14.04 (Linux flavor) installed on the machine. Python comes pre-installed. The python --version command reports the installed version of Python. If the version is 2.6.x, upgrade to Python 2.7 as follows:

    sudo apt-get install python2.7

How to do it…

  1. Once the required Python version is available, make sure that the Python development files (the python-dev package) are installed. If not, install them as follows:

          sudo apt-get install python-dev 

  2. Install through pip:

          sudo apt-get install python-pip 
          sudo pip install numpy 
    ...
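
After the installation steps finish, it can be useful to confirm from Python itself which interpreter is running and whether the packages can be imported. The following is a small sketch with a hypothetical helper, `is_installed`; it assumes nothing beyond the standard library and is written to run on both Python 2.7 and Python 3.x.

```python
import sys


def is_installed(name):
    """Return True when the named package can be imported by this interpreter."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False


# Report the interpreter version, for example "2.7" or "3.10".
python_version = "%d.%d" % (sys.version_info[0], sys.version_info[1])
print("Python " + python_version)

# Check the packages this recipe installs.
for package in ("numpy", "pandas"):
    status = "installed" if is_installed(package) else "missing"
    print(package + ": " + status)
```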

Installing Pandas from source


In this recipe, we will see how to install Pandas from source on Linux. Before proceeding with the installation, let's consider which version of Python to use. Python comes in two flavors, Python 2.7.x and Python 3.x. Although Python 3.x is the latest version and appears to be the better choice, Python 2.7 is recommended for scientific, numeric, and data analysis work.

Getting ready

To step through this recipe, you need Ubuntu 14.04 (Linux flavor) installed on the machine. Python comes pre-installed. The python --version command reports the installed version of Python. If the version is 2.6.x, upgrade to Python 2.7 as follows:

    sudo apt-get install python2.7

How to do it…

  1. Install the easy_install program:

           wget http://python-distribute.org/distribute_setup.py
           sudo python distribute_setup.py
    
    
  2. Install Cython:

           sudo easy_install -U Cython
    
    
  3. Install from the source code as follows:

       ...

Using IPython with PySpark


Python is a preferred choice for data scientists thanks to its high-level syntax and extensive library of packages, so Spark's developers made it a supported language for data analysis. The PySpark API was developed for working with RDDs in Python. IPython Notebook is an essential tool for data scientists to present scientific and theoretical work interactively, integrating both text and Python code.

This recipe shows how to configure IPython with PySpark and also focuses on connecting the IPython shell to PySpark.

Getting ready

To step through this recipe, you need Ubuntu 14.04 (Linux flavor) installed on the machine. Python comes pre-installed. The python --version command reports the installed version of Python. If the version is 2.6.x, upgrade to Python 2.7 as follows:

    sudo apt-get install python2.7

How to do it…

  1. Install IPython as follows:

           sudo pip install ipython 
    
    
  2. Create an IPython profile for use with PySpark...
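
Connecting IPython to PySpark ultimately comes down to making the `pyspark` package importable from the interpreter. As a minimal sketch of that idea, and not the profile configuration this recipe goes on to describe, the snippet below builds the search-path entries from a Spark installation directory. The helper `pyspark_sys_paths` is hypothetical, and the layout it expects (a `python` directory containing `lib/py4j-*-src.zip`) is an assumption based on how Spark 1.x distributions are commonly laid out.

```python
import glob
import os
import sys


def pyspark_sys_paths(spark_home):
    """Return the directories and archives to add to sys.path for PySpark.

    Assumes <spark_home>/python holds the pyspark package and
    <spark_home>/python/lib holds a py4j-*-src.zip archive.
    """
    python_dir = os.path.join(spark_home, "python")
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


# Only touch sys.path when SPARK_HOME is set in the environment.
spark_home = os.environ.get("SPARK_HOME")
if spark_home:
    for path in reversed(pyspark_sys_paths(spark_home)):
        if path not in sys.path:
            sys.path.insert(0, path)
```

A script like this could be placed among a profile's startup files so that it runs whenever the shell starts, but the exact location depends on the IPython version in use.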

Creating Pandas DataFrames over Spark


A DataFrame is a distributed collection of data organized into named columns. It is conceptually equivalent to a table in a relational database or a data frame in R or Python, but with richer optimizations. DataFrames can be constructed from a wide variety of sources, such as structured data files (JSON and Parquet files), Hive tables, external databases, or existing RDDs.

PySpark is the Python API for Apache Spark, which is designed to scale to huge amounts of data. This recipe shows how to make use of Pandas over Spark.

Getting ready

To step through this recipe, you will need a running Spark cluster, either in pseudo-distributed mode or in one of the distributed modes, that is, standalone, YARN, or Mesos. Also, have Python and IPython installed on the Linux machine, that is, Ubuntu 14.04.

How to do it…

  1. Invoke ipython console --profile=pyspark as follows:

          In [4]: from pyspark import SparkConf, SparkContext, SQLContext
          In [5]: import pandas as pd
    
  2. Creating...


Key benefits

  • Use Apache Spark for data processing with these hands-on recipes
  • Implement end-to-end, large-scale data analysis better than ever before
  • Work with powerful libraries such as MLlib, SciPy, NumPy, and Pandas to gain insights from your data

Description

Spark has emerged as the most promising big data analytics engine for data science professionals. The true power and value of Apache Spark lies in its ability to execute data science tasks with speed and accuracy. Spark’s selling point is that it combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualizations. It lets you tackle the complexities that come with raw, unstructured data sets with ease. This guide will get you comfortable and confident performing data science tasks with Spark. You will learn about implementations including distributed deep learning, numerical computing, and scalable machine learning. You will be shown effective solutions to problematic concepts in data science using Spark’s data science libraries such as MLlib, Pandas, NumPy, SciPy, and more. These simple and efficient recipes will show you how to implement algorithms and optimize your work.

Who is this book for?

This book is for novice and intermediate level data science professionals and data analysts who want to solve data science problems with a distributed computing framework. Basic experience with data science implementation tasks is expected. Data science professionals looking to skill up and gain an edge in the field will find this book helpful.

What you will learn

  • Explore the topics of data mining, text mining, Natural Language Processing, information retrieval, and machine learning.
  • Solve real-world analytical problems with large data sets.
  • Address data science challenges with analytical tools on a distributed system like Spark (apt for iterative algorithms), which offers in-memory processing and more flexibility for data analysis at scale.
  • Get hands-on experience with algorithms like classification, regression, and recommendation on real datasets using the Spark MLlib package.
  • Learn about numerical and scientific computing using NumPy and SciPy on Spark.
  • Use Predictive Model Markup Language (PMML) in Spark for statistical data mining models.

Product Details

Publication date: Dec 22, 2016
Length: 392 pages
Edition: 1st
Language: English
ISBN-13: 9781785880100
Vendor: Apache

Table of Contents

10 Chapters
1. Big Data Analytics with Spark
2. Tricky Statistics with Spark
3. Data Analysis with Spark
4. Clustering, Classification, and Regression
5. Working with Spark MLlib
6. NLP with Spark
7. Working with Sparkling Water - H2O
8. Data Visualization with Spark
9. Deep Learning on Spark
10. Working with SparkR

Customer reviews

Rating distribution
3.5 out of 5 (4 Ratings)
5 star 50%
4 star 0%
3 star 25%
2 star 0%
1 star 25%
pavan kumar jalla Sep 10, 2019
5 out of 5 stars
As a big data engineer for 3 years in the industry, I was looking around for a solid hands on book for data science, this book has great content and well structred right from the beginning till the end, which takes you a deep dive into data science concepts, appreciate the author for sharing her knowledge.would recommend to anyone who is looking for practical data science approach.
Amazon Verified review
Brandon Jan 23, 2017
5 out of 5 stars
This book represents a useful resource to learn Spark programming model and how to employ it in several tasks. The approach followed is very practical, with code provided in every chapter, which guarantees a fast learning process. As technical reviewer of this book I feel to suggest it to people who want to understand how to perform data exploration, analysis and visualization tasks in Spark. With the many use cases covered in the book, it will represent a resource to inspire solutions for daily working tasks.
Amazon Verified review
Dimitri Shvorob Jun 01, 2017
3 out of 5 stars
I would dismiss a five-star review by the book's technical reviewer - conflict of interest, anyone? - and "Apache Spark for Data Science Cookbook" is not a five-star book. It is, however, a decent book which compensates for the Packt-standard weakness of explanations with a thoughtful collection of (Scala) code, paying attention to the less glamorous but essential job of data manipulation. And yet, I hesitate to recommend it, and feel that a combo of "Machine Learning with Spark" by Pentreath and "Spark for Data Science" by Duvvuri and Singhal would be a better choice. I would suggest getting all three and deciding which one(s) to leave.
Amazon Verified review
Santanu Feb 25, 2017
1 out of 5 stars
This book does not improve you spark knowledge. Only bunch of code with input and output. No proper comments on code.
Amazon Verified review