Using Regression and Forecasting
One of the most important tasks that a statistician or data scientist has is to generate a systematic understanding of the relationship between two sets of data. This can mean a continuous relationship between two sets of data, where one value depends directly on the value of another variable. Alternatively, it can mean a categorical relationship, where one value is categorized according to another. The tool for working with these kinds of problems is regression. In its most basic form, regression involves fitting a straight line through a scatter plot of the two sets of data and performing some analysis to see how well this line fits the data. Of course, we often need something more sophisticated to model more complex relationships that exist in the real world.
Forecasting typically refers to learning trends in time series data with the aim of predicting values in the future. Time series data is data that evolves over a period of time, and usually...