Implementing the transaction batch producer
In this section, we will first discuss how to call a REST API to fetch BTC/USD transactions. Then we will see how to use Spark to deserialize the JSON payload into a well-typed distributed Dataset.
After that, we will introduce the Parquet format and see how Spark makes it easy to save our transactions in this format.
With all of these building blocks, we will then implement our program in a purely functional way using the test-driven development (TDD) technique.
Calling the Bitstamp REST API
Bitstamp is a cryptocurrency exchange that people use to trade a cryptocurrency, such as bitcoin, for a conventional currency, such as the US dollar or the euro. Conveniently, Bitstamp provides a REST API that can be used to get information about the latest trades and, if you have an account, to send orders.
You can find out more here: https://www.bitstamp.net/api/.
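Before looking at the specific endpoint, it may help to see what calling a REST API amounts to in Scala. The following is only a minimal sketch using the standard library: the object name HttpGetSketch and the function fetchBody are hypothetical names chosen for illustration, and the URL is deliberately left as a parameter rather than tied to any particular Bitstamp endpoint.

```scala
import java.net.URL
import scala.io.Source

object HttpGetSketch {
  // Hypothetical helper: performs a plain GET on the given URL and
  // returns the whole response body as a String (for example, a JSON payload).
  // The caller supplies the URL; no specific endpoint is assumed here.
  def fetchBody(url: URL): String = {
    val source = Source.fromURL(url, "UTF-8")
    // Always release the underlying stream, even if reading fails.
    try source.mkString finally source.close()
  }
}
```

Keeping the URL as an argument makes the function easy to exercise in a test, for instance with a file: URL pointing to a saved JSON sample, without touching the network.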
For this project, the only endpoint we are interested in is the...