Data loading and preprocessing with pandas
In the previous chapter, we discussed where to find useful datasets and examined basic import commands of Python packages. In this section, having kept your toolbox ready, you are about to learn how to structurally load, manipulate, preprocess, and polish data with pandas and NumPy.
Fast and easy data loading
Let's start with a CSV file and pandas. The pandas library offers the most accessible and complete function to load tabular data from a file (or a URL). By default, it will store the data into a specialized pandas data structure, index each row, separate variables by custom delimiters, infer the right data type for each column, convert data (if necessary), as well as parse dates, missing values, and erroneous values.
In: import pandas as pd iris_filename = 'datasets-uci-iris.csv' iris = pd.read_csv(iris_filename, sep=',', decimal='.', header=None, names= ['sepal_length', 'sepal_width', &apos...