In this recipe, we use Spark's RDD-based regression API to demonstrate how an iterative optimization technique minimizes the cost function and arrives at a solution for a linear regression.
We examine how Spark converges on a solution to the regression problem using a well-known iterative method called Gradient Descent. Spark provides a more practical variant, Stochastic Gradient Descent (SGD), which is used to compute the intercept (in this case set to 0) and the weights for the features.
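To make the idea concrete, the following is a minimal sketch of mini-batch gradient descent for a linear regression with the intercept fixed at 0. It is written in plain Python and is not Spark's implementation; the function name `sgd_linear_regression`, its parameters (`step_size`, `num_iterations`, `batch_fraction`, `seed`), the constant step size, and the toy data set are all assumptions made for illustration.

```python
import random


def sgd_linear_regression(points, step_size=0.1, num_iterations=500,
                          batch_fraction=1.0, seed=42):
    """Fit weights for y ~ w . x (no intercept) by mini-batch gradient descent.

    points is a list of (label, features) pairs. All names and defaults
    here are hypothetical and chosen only for this sketch.
    """
    rng = random.Random(seed)
    n_features = len(points[0][1])
    weights = [0.0] * n_features  # start from all-zero weights

    for _ in range(num_iterations):
        # Sample a mini-batch; batch_fraction=1.0 uses every point each time.
        batch = [p for p in points if rng.random() < batch_fraction]
        if not batch:
            continue

        # Accumulate the gradient of the squared-error cost over the batch.
        gradient = [0.0] * n_features
        for label, features in batch:
            prediction = sum(w * x for w, x in zip(weights, features))
            error = prediction - label
            for j, x in enumerate(features):
                gradient[j] += error * x

        # Step against the averaged gradient to reduce the cost.
        for j in range(n_features):
            weights[j] -= step_size * gradient[j] / len(batch)

    return weights


# Toy, noise-free data generated from y = 2*x1 + 3*x2 (intercept 0).
points = [(2.0 * a + 3.0 * b, [float(a), float(b)])
          for a in range(3) for b in range(3)]
weights = sgd_linear_regression(points)
print(weights)
```

On this toy data the printed weights should come out close to 2 and 3, the coefficients used to generate the labels. Setting `batch_fraction` below 1.0 makes each iteration use a random subset of the points, which is the "stochastic" part of SGD.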