Chapter 5. Retrieving, Processing, and Storing Data
Data can be found everywhere, in all shapes and forms. We can get it from the web, by e-mail and FTP, or we can create it ourselves in a lab experiment or marketing poll. An exhaustive overview of how to acquire data in various formats would require many more pages than we have available. Sometimes, we need to store data before we can analyze it or after we are done with our analysis, so this chapter also discusses storing data. Chapter 8, Working with Databases, gives information about various databases (relational and NoSQL) and related APIs. The following is a list of the topics that we are going to cover in this chapter:
- Writing CSV files with NumPy and Pandas
- The binary .npy and pickle formats
- Storing data with PyTables
- Reading and writing Pandas DataFrames to HDF5 stores
- Reading and writing to Excel with Pandas
- Using REST web services and JSON
- Reading and writing JSON with Pandas
- Parsing RSS and Atom feeds
- Parsing HTML with Beautiful...