This section introduces the basics of statistics using applied examples.
Basics of statistics
Summary level statistics
Summary level statistics describe a dataset through values such as its minimum, maximum, and mean.
The following Spark example summarizes the numbers from 1 to 100:
- Start a Spark shell in your Terminal:
$ spark-shell
- Import Random from Scala's util package:
scala> import scala.util.Random
import scala.util.Random
- Generate the integers from 1 to 100 (inclusive) and use the shuffle method of Scala's Random utility class to randomize their order:
scala> val nums = Random.shuffle(1 to 100) // 100 numbers randomized
nums: scala.collection.immutable.IndexedSeq[Int] = Vector(70, 63...
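As a point of reference, the summary values named earlier (minimum, maximum, and mean) can be sketched in plain Scala on the same shuffled sequence, without any Spark calls. This is an illustrative aside rather than a step of the session; the names minimum, maximum, and mean are chosen here for the sketch and do not come from the original.

```scala
import scala.util.Random

// Same input as the session above: the integers 1 to 100 in random order.
val nums = Random.shuffle(1 to 100)

// Shuffling changes only the order, so these statistics do not depend on it.
val minimum = nums.min                       // smallest value
val maximum = nums.max                       // largest value
val mean    = nums.sum.toDouble / nums.size  // arithmetic mean, computed as a Double

println(s"min=$minimum max=$maximum mean=$mean")
// prints "min=1 max=100 mean=50.5"
```

Because the values are known in advance, these results serve as a simple sanity check on whatever summary Spark later reports for the same data.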