Handling other issues in linear regression
So far in this chapter, we have learnt:
- How to implement a linear regression model using two methods
- How to evaluate the performance of the model using model parameters
However, there are other issues that need to be taken care of while dealing with data sources of different types. Let's go through them one by one. We will be using a different (simulated) dataset to illustrate these issues. Let's import it and have a look at it:
import pandas as pd
df = pd.read_csv('E:/Personal/Learning/Predictive Modeling Book/Book Datasets/Linear Regression/Ecom Expense.csv')
df.head()
We should get the following output:
The preceding screenshot shows a simulated dataset from an e-commerce website. It captures information about several transactions made on the website. A brief description of the columns in the dataset is as follows:
- Transaction ID: Identifier of the transaction
- Age: Age of the customer
- Items: Number...