Data modeling is the process of finding a function (the so-called model) that relates a set of independent variables, or input data, to an outcome. In data warehousing, modeling refers to establishing a conceptual framework on top of the physical data structure, and one explores the structures in the data with the help of ORM or UML (even CRC) diagrams. The same exploration of structure takes place in predictive analysis, where data modeling means exploring the structures (or relations) between two or more variables. These relations can be expressed as a function, and that function is essentially what is stored as a model.
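To make the idea of a relation being "stored as a model" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions and not the method used later in the text: the data values, the variable names `hours` and `score`, and the helper functions `fit_line` and `predict` are all hypothetical, and ordinary least squares with a single independent variable is just one simple choice of function.

```python
import json


def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares (one input variable)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel in the ratio).
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The "model" is nothing more than the parameters of the fitted function.
    return {"intercept": intercept, "slope": slope}


def predict(model, x_new):
    """Apply the stored function to a new input value."""
    return model["intercept"] + model["slope"] * x_new


# Made-up illustrative data: score depends exactly on hours (score = 1 + 2 * hours).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [3.0, 5.0, 7.0, 9.0, 11.0]

model = fit_line(hours, score)   # explore the relation between the two variables
stored = json.dumps(model)       # store the model (here simply as a JSON string)
restored = json.loads(stored)    # load it back later
prediction = predict(restored, 6.0)

print(model)
print(prediction)
```

The point of the sketch is that once the relation is found, only the function's parameters need to be kept. The original data is not required to make a prediction for a new input.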
To start modeling, we will use some of the Microsoft data available at the following GitHub repository: