Performing linear regression in Python using Excel data
Linear regression in Python can be carried out with the help of libraries such as pandas, scikit-learn, statsmodels, and matplotlib. The following is a step-by-step code example:
- First, import the necessary libraries:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import statsmodels.api as sm
from statsmodels.graphics.regressionplots import plot_regress_exog
from statsmodels.graphics.gofplots import qqplot
- Then, create an Excel file with test data. In a real-life scenario you would not need the mock data: you would skip this step and, after importing the necessary libraries, load your own data from Excel (see the next step):
# Step 0: Generate sample data and save as Excel file
np.random.seed(0)
n_samples = 100
X = np.random.rand(n_samples, 2)  # Two features
y = 2 * X[:, 0] + 3 * X[:, 1] ...