Part 2: Data Acquisition and Management
Machine learning software depends on data far more than other types of software do. To make use of statistical learning, we need to collect, process, and prepare data for developing machine learning models. The data must be representative of the problems that the software solves and the services it provides, not only during development but also during operations. In this part of the book, we focus on the data: how we can acquire it and how we can make it useful for the training, testing, and deployment of machine learning models.
This part has the following chapters:
- Chapter 6, Processing Data in Machine Learning Systems
- Chapter 7, Feature Engineering for Numerical and Image Data
- Chapter 8, Feature Engineering for Natural Language Data