Splitting the Features
In the previous section, we saw how missing values can be filled in using different types of imputation.
In this section, we will split each DataFrame into the dependent variable, y, and the independent variables, X. A dependent variable is the outcome of a process; in our case, the outcome is whether a company is bankrupt or not. Independent variables (also called features) are the inputs to that process, which in this case are the remaining variables in the DataFrame.
Splitting the features acts as a precursor to our next step, where we select the most important X variables that determine the dependent variable.
We will need to split the features for each of the mean-imputed DataFrames, as shown in the following code:
#First DataFrame
X0=mean_imputed_df1.drop('Y',axis=1)
y0=mean_imputed_df1.Y
#Second DataFrame
X1=mean_imputed_df2.drop('Y',axis=1)
y1=mean_imputed_df2.Y
#Third DataFrame
X2=mean_imputed_df3.drop('Y',axis=1)
y2=mean_imputed_df3...
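To see what this split produces, the following is a minimal, self-contained sketch on a small made-up DataFrame. The names toy_df, Attr1, Attr2, and split_features are hypothetical and used only for illustration; the only thing taken from the code above is that the target column is called Y. The helper is one possible way to avoid repeating the same two lines for every DataFrame, not necessarily how the rest of the chapter proceeds.

```python
import pandas as pd

# Hypothetical toy data standing in for one mean-imputed DataFrame.
# Column names Attr1/Attr2 are made up; only the target name 'Y' follows the text.
toy_df = pd.DataFrame({
    "Attr1": [0.20, -0.05, 0.13, 0.07],
    "Attr2": [0.41, 0.88, 0.35, 0.52],
    "Y": [0, 1, 0, 0],
})


def split_features(df, target="Y"):
    """Return (X, y): all columns except `target`, and the `target` column."""
    X = df.drop(target, axis=1)  # returns a new DataFrame; df itself is unchanged
    y = df[target]               # same column as df.Y, but works for any column name
    return X, y


X_toy, y_toy = split_features(toy_df)

print(X_toy.columns.tolist())  # feature columns only
print(y_toy.tolist())          # target values
```

Because drop returns a new object rather than modifying the DataFrame in place, the original DataFrame still contains the Y column after the split, so it can be reused later if needed.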