Working with big data
Now that you have been introduced to NumPy and pandas, you will use them to analyze real data of a much larger size. The phrase big data has no precise definition. Generally speaking, you can think of big data as data that is far too large to analyze by sight. It could contain tens of thousands, millions, billions, or even trillions of rows.
Data scientists often analyze data that is stored in the cloud or elsewhere online. One strategy for analyzing real data is to download it directly to your computer.
Note
It is recommended to create a new folder called Data to store all of the data that you will download for analysis. You can open your Jupyter Notebook in this same folder.
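If you prefer to create the folder from Python rather than through your file browser, the following is a minimal sketch using the standard library. It assumes that you want the Data folder inside your current working directory; the variable name data_dir is chosen only for illustration.

```python
from pathlib import Path

# Create a folder named "Data" in the current working directory.
# exist_ok=True makes the call safe to repeat if the folder already exists.
data_dir = Path("Data")
data_dir.mkdir(exist_ok=True)

# Print the absolute path to confirm where downloaded files should be saved.
print(data_dir.resolve())
```

Running this cell from a Jupyter Notebook creates the folder next to the notebook, provided the working directory has not been changed.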
Downloading data
Data comes in many formats, and pandas is equipped to handle most of them. In general, when looking for data to analyze, it’s worth searching for the keyword “dataset.” A dataset is a collection of raw data that has been stored for...