In this chapter, we looked at sparklyr, a free and open source package that provides an R interface to Spark and a backend for the dplyr package. We used dplyr and SQL to manipulate data in Spark, and we used R to analyze Spark datasets and manipulate tables. We then saw how to do machine learning with Spark from R, using Spark's machine learning library and H2O Sparkling Water. Finally, we created an extension application using the Spark API and various Spark packages to browse Spark DataFrames within the RStudio IDE.
In the next chapter, we will see how we can use R on Azure Machine Learning Studio.