Datasource
Datasource is a term used for all the technology related to the extraction and storage of data. A datasource can be anything from a simple text file to a large database. The raw data can come from observation logs, sensors, transactions, or user behavior.
In this section, we will look at the most common forms of datasources and datasets.
A dataset is a collection of data, usually presented in tabular form. Each column represents a particular variable, and each row corresponds to a given member of the dataset, as shown in the following figure:
A dataset represents a physical implementation of a datasource; the common features of a dataset are as follows:
Dataset characteristics (such as multivariate or univariate)
Number of instances
Area (for example life, business, and so on)
Attribute characteristics (such as real, categorical, and nominal)
Number of attributes
Associated tasks (such as classification or clustering)
Missing values
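Several of these features can be read directly off a tabular dataset. The following is a minimal sketch in Python, using only the standard library, that counts instances and attributes, counts missing values per attribute, and makes a rough guess at each attribute's type. The sample data, the `summarize` and `is_number` helpers, and the rule that an empty field counts as missing are all assumptions made for illustration; they do not come from any particular dataset or library.

```python
import csv
import io

# Hypothetical sample data: one header row, then one row per instance.
RAW = """name,age,city
Ana,34,Lima
Ben,,Oslo
Cy,27,
"""


def is_number(value):
    """Return True if the string can be parsed as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def summarize(text):
    """Return basic features of a dataset given as CSV text with a header."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    attributes = reader.fieldnames or []

    # Assumption of this sketch: an empty field is a missing value.
    missing = {a: sum(1 for r in rows if not r[a]) for a in attributes}

    # Rough type guess: "real" if every non-missing value parses as a
    # number, otherwise "categorical".
    kinds = {}
    for a in attributes:
        values = [r[a] for r in rows if r[a]]
        numeric = bool(values) and all(is_number(v) for v in values)
        kinds[a] = "real" if numeric else "categorical"

    return {
        "instances": len(rows),
        "attributes": len(attributes),
        "missing": missing,
        "kinds": kinds,
    }


summary = summarize(RAW)
```

Here each row of `RAW` is one instance and each column is one attribute, matching the tabular layout described earlier. A real dataset would normally need a more careful definition of what counts as missing (for example, placeholder strings such as "NA") and a finer distinction between attribute types.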
Open data
Open data is data that can be used, re...