In the previous chapters, we established the fact that the machine learning algorithms generalize the input data into a hypothesis that fits the data so that the output, based on the new values, can be predicted accurately by the model. The accuracy of the model is a function of the amount of the input data along with variation in the values of the independent variables. The more data and variety, the more computation power we require to generate and execute the models. The distributed computing frameworks (Hadoop, Spark, and so on) work very well with the large volumes of data with variety. The same principles apply to ANNs.
The more input data we have along with variations, the more accurate the models can be generated, which requires more storage and computation power. Since the computation power and storage is available with...