Pandas is the go-to tool for data manipulation in Python: it combines speed and convenience, allowing data to be processed and reshaped quickly. Let's first go over a number of basic operations. pandas is simple and intuitive to use, but there is still a learning curve.
pandas has two main data structures:
- Series is a one-dimensional array of a single data type that also has an index. The index labels can be numeric, categorical, string, or datetime values.
- DataFrame is a two-dimensional table consisting of a set of columns, each holding a single data type. A DataFrame has two indexes: index (the row labels) and columns (the column labels). Each column of a DataFrame can be thought of as a Series. Rows can be retrieved as Series too but, in this case, the data in the cells will likely be converted to one shared data type, typically object (more on that later).
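To make the two structures concrete, here is a minimal sketch. The column names and values are made up purely for illustration; the point is only to show that a column comes back as a Series with its own data type, while a row that spans mixed-type columns falls back to a shared type.

```python
import pandas as pd

# A Series: one-dimensional, one data type, plus an index of labels.
s = pd.Series([1.5, 2.5, 3.5], index=["a", "b", "c"])

# A DataFrame: a table of columns, each with its own data type.
# The names and values here are hypothetical sample data.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob"],
        "age": [31, 45],
        "score": [88.5, 92.0],
    }
)

# The two indexes of a DataFrame: row labels and column labels.
row_labels = df.index
column_labels = df.columns

# Selecting a column returns a Series that keeps the column's data type.
age_column = df["age"]

# Selecting a row also returns a Series, but its values come from
# columns of different types, so they share one common data type.
first_row = df.iloc[0]

print(s.dtype)           # float64
print(df.dtypes)         # one data type per column
print(age_column.dtype)  # an integer type
print(first_row.dtype)   # object
```

Note that the row Series is indexed by the column labels of the DataFrame, so `first_row["age"]` retrieves the same cell as `df["age"].iloc[0]`.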
Most of the time, we get our data from external...