Programming with SparkR
So far, we have covered the runtime model of SparkR and the basic data abstractions that provide fault tolerance and scalability. We have also seen how to access the Spark API from the R shell or RStudio. It's time to try out some basic and familiar operations:
> # Open the shell
> # Try help(package=SparkR) if you want more information
> df <- createDataFrame(iris) # Create a Spark DataFrame
> df # Check the type. Notice the column renaming using underscore
SparkDataFrame[Sepal_Length:double, Sepal_Width:double, Petal_Length:double, Petal_Width:double, Species:string]
> showDF(df,4) # Print the contents of the Spark DataFrame
+------------+-----------+------------+-----------+-------+
|Sepal_Length|Sepal_Width|Petal_Length|Petal_Width|Species|
+------------+-----------+------------+-----------+-------+
|         5.1|        3.5|         1.4|        0.2| setosa|
|         4.9|        3.0|         1.4| ...