Chapter 2. Preprocessing Data
Building real-world data analytics solutions requires accurate data. In this chapter, we discuss how to collect, clean, normalize, and transform raw data into a standard format such as Comma-Separated Values (CSV) or JavaScript Object Notation (JSON), using OpenRefine, a tool for processing messy data.
We will cover the following topics:
- Data sources
- Data scrubbing
- Data reduction methods
- Data formats
- Getting started with OpenRefine