Reading data from CSVs and other delimited files
In this recipe, you will use the pandas.read_csv() function, which offers a large set of parameters that you will explore to ensure the data is properly read into a time series DataFrame. In addition, you will learn how to specify an index column, parse the index to be of type DatetimeIndex, and parse string columns that contain dates into datetime objects.
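As a rough preview of these ideas, the following minimal sketch reads a small in-memory CSV and turns its date column into a DatetimeIndex. The column names and values are made up for illustration and are not the recipe's dataset; io.StringIO simply stands in for a file path.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk
csv_text = """Date,Daily,Forecast
2021-04-26,125789.89,235036.46
2021-04-27,99374.01,197622.55
2021-04-28,82203.16,116991.26
"""

ts = pd.read_csv(
    io.StringIO(csv_text),
    index_col="Date",      # use the Date column as the index
    parse_dates=["Date"],  # parse the Date strings into datetime values
)

# The index is now a DatetimeIndex rather than a plain index of strings
print(type(ts.index))
print(ts.head())
```

With a real file, you would pass its path in place of the io.StringIO object; the index_col and parse_dates arguments stay the same.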
Generally, when you read a CSV file with Python, the values arrive as strings (text). The pandas read_csv function tries to infer the appropriate data type (dtype) for each column and, in most cases, does a great job. However, there are situations where you will need to explicitly indicate which columns to cast to a specific data type. For example, in this recipe, you will specify which column(s) to parse as dates using the parse_dates parameter.
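To make the difference between inferred and explicit types concrete, here is a small sketch that reads the same text twice. The columns release_date, title_id, and gross are hypothetical and chosen only to show one case where inference falls short: date strings are not parsed by default, and an identifier with leading zeros is read as an integer.

```python
import io
import pandas as pd

# Hypothetical data: a date column, an ID with leading zeros, and a numeric column
csv_text = """release_date,title_id,gross
2021-04-26,007,125789.89
2021-04-27,008,99374.01
"""

# Let pandas infer everything: dates stay as text, title_id becomes an integer
inferred = pd.read_csv(io.StringIO(csv_text))
print(inferred.dtypes)

# State the types explicitly: parse the dates and keep title_id as text
explicit = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["release_date"],
    dtype={"title_id": "string"},
)
print(explicit.dtypes)
```

Comparing the two dtypes printouts shows which columns pandas handled well on its own (gross) and which needed an explicit hint.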
Getting ready
You will read a CSV file containing hypothetical box office numbers for a movie. The file is provided in the...