NumPy is a fairly low-level array-manipulation library, and many other scientific Python libraries are built on top of it.
One of these libraries is pandas, a high-level data-manipulation library. When you explore a dataset, you usually compute descriptive statistics, group rows by a certain characteristic, and merge tables. The pandas library provides convenient functions for each of these operations.
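As a minimal sketch of the three operations just mentioned, the snippet below uses two tiny tables. The data, the column names (patient_id, group, bmi, visits), and the variable names are all hypothetical, invented only for illustration.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
patients = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "group": ["a", "a", "b", "b"],
    "bmi": [22.0, 26.0, 30.0, 34.0],
})
visits = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "visits": [3, 1, 4, 2],
})

# Descriptive statistics (count, mean, std, quartiles) for one numeric column.
summary = patients["bmi"].describe()

# Group rows by a characteristic and aggregate within each group.
mean_bmi_by_group = patients.groupby("group")["bmi"].mean()

# Merge the two tables on their shared key column.
merged = patients.merge(visits, on="patient_id")

print(summary)
print(mean_bmi_by_group)
print(merged)
```

Each call returns a new pandas object (a Series or a DataFrame), so the results can be chained or inspected further.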
Let's use a diabetes dataset in this example. The diabetes dataset in sklearn.datasets is already standardized: each feature column is mean-centered and scaled to unit L2 norm.
The dataset contains 442 records with 10 features: age, sex, body mass index, average blood pressure, and six blood serum measurements.
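The description above can be checked directly. The following is a sketch, assuming scikit-learn and pandas are installed; the variable names (diabetes, df, col_means, col_norms) are illustrative choices, not part of the original text.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_diabetes

# Load the dataset and wrap the feature matrix in a DataFrame.
diabetes = load_diabetes()
df = pd.DataFrame(diabetes.data, columns=diabetes.feature_names)

# Number of records and number of features.
print(df.shape)

# Per-column mean and L2 norm (square root of the sum of squares).
col_means = df.mean()
col_norms = np.sqrt((df ** 2).sum())

# Both checks should hold up to floating-point tolerance.
print(np.allclose(col_means, 0.0))
print(np.allclose(col_norms, 1.0))
```

Because the features are already scaled this way, their values are small dimensionless numbers rather than raw ages or blood pressures.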
The target is a quantitative measure of disease progression one year after the baseline measurements were taken. You can look at the data...