Machine learning solution architecture for big data (employing Hadoop)
In this section, let us look at the essential architecture components for implementing a Machine learning solution that meets big data requirements.
The proposed solution architecture should support the consumption of a variety of data sources in an efficient and cost-effective way. The following figure summarizes the core architecture components that can be part of the Machine learning solution technology stack. The frameworks chosen can be either open source or commercially licensed packages. In the context of this book, we consider the latest version of the open source (Apache) distribution of Hadoop and its ecosystem components.
Note
Vendor-specific frameworks and extensions are out of scope for this chapter.
In the following sections, we'll discuss each of these Reference Architecture layers in detail, along with the frameworks required in each layer.
The Data Source layer
The Data Source layer forms a critical part...