Chapter 4. Classifying UFO Sightings
In this chapter, we're going to look at a dataset of UFO sightings. Sometimes data analysis begins with a specific question or problem; at other times, the starting point is more nebulous. We'll engage with this UFO sighting dataset, and along the way, we'll learn more about data exploration, data visualization, and topic modeling before we dive into Naïve Bayesian classification.
This dataset was collected by the National UFO Reporting Center (NUFORC) and is available at http://www.nuforc.org/. It includes dates, rough locations, shapes, and descriptions of the sightings. We'll download and load this dataset, and we'll see how to extract more structured data from messy, free-form text. From there, we'll see how to visualize, analyze, and gain insights into our data.
In the process, we'll discover the best time to look for UFOs. We'll also learn what their important characteristics are. And we'll learn how to tell a description of a possible...