Chapter 7. Programming with RDDs
Analyzing historical data and uncovering the hidden pattern is one of the key objectives of modern enterprises. Data scientists/architects/developers are striving hard to implement various data analysis strategies that can help them in analyzing the data and uncovering the value in the shortest possible time.
Data analysis itself is a complex and multistep process. It is a process of examining each component of the provided data using analytical and logical reasoning, and deriving value out of it. Data from various sources are collected, reviewed, and then analyzed by leveraging variety of data analysis methods like data mining, text analysis, and many more with an objective of discovering useful information, suggesting conclusions, and supporting decision making.
Transformation is a most critical and important step in the data analysis process. It requires deep understanding of various techniques and functions available in a language that can help...