pandas DataFrames
A pandas DataFrame is a labeled two-dimensional data structure, similar in spirit to an Excel worksheet or a relational database table. A similar concept was originally invented in the R programming language. (For more information, refer to http://www.r-tutor.com/r-introduction/data-frame.) A DataFrame can be created in the following ways:
From another DataFrame.
From a NumPy array or a composite of arrays that has a two-dimensional shape.
From another pandas data structure called Series. We will learn about Series in the following section.
From a file, such as a CSV file.
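The four routes listed above can be sketched in a few lines. This is only a minimal illustration, not the chapter's own example: the column names, the sample values, and the in-memory CSV text are all made up for demonstration, and the in-memory buffer merely stands in for a real file on disk.

```python
import io

import numpy as np
import pandas as pd

# 1. From a two-dimensional NumPy array (column names are illustrative).
arr = np.arange(6).reshape(3, 2)
df_from_array = pd.DataFrame(arr, columns=["a", "b"])

# 2. From another DataFrame; copy=True requests an independent copy.
df_from_df = pd.DataFrame(df_from_array, copy=True)

# 3. From a Series; the Series name becomes the column label.
series = pd.Series([1.5, 2.5, 3.5], name="value")
df_from_series = pd.DataFrame(series)

# 4. From CSV data; a hypothetical in-memory text buffer replaces a file path.
csv_text = "country,population\nA,10\nB,20\n"
df_from_csv = pd.read_csv(io.StringIO(csv_text))

# Inspect the shape (rows, columns) of each result.
for name, frame in [
    ("array", df_from_array),
    ("DataFrame", df_from_df),
    ("Series", df_from_series),
    ("CSV", df_from_csv),
]:
    print(name, frame.shape)
```

In practice, read_csv is usually given a file path rather than a buffer; the buffer is used here only so that the sketch runs without any external file.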
As an example, we will use data that can be retrieved from http://www.exploredata.net/Downloads/WHO-Data-Set. The original data file is quite large and has many columns, so we will use an edited file instead, which contains only the first nine columns and is called WHO_first9cols.csv; the file is in the...