Working with streaming sources
Apache Spark uses the Structured Streaming API to read from a wide variety of streaming sources. The API is easy for developers to pick up because it closely mirrors Spark's batch API, so knowledge carries over between the two use cases. The following examples use Spark to read from Kinesis and Kafka and print the results to the console. The writeStream portion will be covered in more detail later in this section:
import org.apache.spark.sql.SparkSession

// KINESIS
val spark = SparkSession
  .builder()
  .appName("Spark SQL basic example")
  .master("local[*]")
  .getOrCreate()

val df = spark.readStream.format("kinesis")
  .option("streamName", "scala-book")
  .option("region", "us-east-1")
  .option("initialPosition", "TRIM_HORIZON")
  .option("awsAccessKeyId", sys.env.getOrElse("AWS_ACCESS_KEY_ID", ""))
  // the env var name below is assumed; the original text was truncated here
  .option("awsSecretKey", sys.env.getOrElse("AWS_SECRET_ACCESS_KEY", ""))
  .load()
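Reading from Kafka follows the same readStream pattern. The sketch below assumes a broker running at localhost:9092 and a topic named scala-book, both placeholder values, and uses the console sink via writeStream, which is explained in depth later in this section:

// KAFKA: a minimal sketch; the broker address and topic name
// are placeholder values, not part of the original example
val kafkaDf = spark.readStream.format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "scala-book")
  .load()

// Print each micro-batch to the console as it arrives
val query = kafkaDf.writeStream
  .format("console")
  .outputMode("append")
  .start()

// Block until the streaming query is stopped
query.awaitTermination()

Both sources return an unbounded DataFrame, so the same transformations that work on batch DataFrames can be applied to the stream before it is started.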