Summary
In this chapter, we learned how to import an ARFF file into a pandas DataFrame. We ran pandas profiling on the DataFrame to identify correlated features, detected missing values using the missingno package, and imputed them using the mean and iterative imputation methods.
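As a compact reminder of that workflow, here is a minimal sketch. It assumes SciPy's loadarff for reading ARFF and scikit-learn's SimpleImputer and IterativeImputer for the two imputation methods. The tiny in-memory ARFF text and the column names Attr1 and Attr2 are made up for illustration and are not the chapter's dataset. A plain isna() count stands in for the visual check that missingno provides.

```python
import io

import pandas as pd
from scipy.io import arff
# IterativeImputer is still experimental, so this import must come first.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, SimpleImputer

# Hypothetical toy ARFF content; '?' marks a missing value.
ARFF_TEXT = """@relation toy
@attribute Attr1 numeric
@attribute Attr2 numeric
@attribute class {0,1}
@data
0.2,0.5,0
?,0.7,0
0.4,?,1
0.9,0.1,1
"""

# loadarff accepts a file path or a file-like object.
data, meta = arff.loadarff(io.StringIO(ARFF_TEXT))
df = pd.DataFrame(data)

# Nominal attributes are returned as bytes, so decode the label column.
df["class"] = df["class"].str.decode("utf-8").astype(int)

features = df.drop(columns="class")

# Count missing values per column (a numeric stand-in for a missingno plot).
missing_counts = features.isna().sum()
print(missing_counts)

# Mean imputation: each gap is filled with its column mean.
mean_imputed = pd.DataFrame(
    SimpleImputer(strategy="mean").fit_transform(features),
    columns=features.columns,
)

# Iterative imputation: each feature is modeled from the other features.
iter_imputed = pd.DataFrame(
    IterativeImputer(random_state=0).fit_transform(features),
    columns=features.columns,
)
```

With real data, the StringIO object would be replaced by the path to the ARFF file, and the same steps would be repeated for each DataFrame.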
To find the features that contribute most to bankruptcy, we applied lasso regularization. Although the set of important features differs across the five DataFrames, one feature appears in all five: the ratio of total liabilities to total assets. This ratio is therefore a highly significant factor in bankruptcy.
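The idea behind lasso-based feature selection can be sketched as follows. This is an illustration on synthetic data, not the chapter's exact code or results: the column names, the alpha value, and the way the label is generated are all assumptions chosen so that one feature is clearly informative. Lasso shrinks the coefficients of uninformative features to exactly zero, so the features with nonzero coefficients are the ones treated as important.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data: five hypothetical ratios, only one of which drives the label.
rng = np.random.default_rng(0)
columns = ["ratio_a", "ratio_b", "liabilities_to_assets", "ratio_c", "ratio_d"]
X = pd.DataFrame(rng.normal(size=(500, 5)), columns=columns)
y = (X["liabilities_to_assets"] + 0.1 * rng.normal(size=500) > 0).astype(int)

# Standardize so the L1 penalty treats every feature on the same scale.
X_scaled = StandardScaler().fit_transform(X)

# alpha controls the penalty strength; 0.1 is an arbitrary illustrative value.
lasso = Lasso(alpha=0.1).fit(X_scaled, y)

# Features whose coefficients survive the penalty are the selected ones.
coefs = pd.Series(lasso.coef_, index=columns)
selected = coefs[coefs != 0].index.tolist()
print(selected)
```

Running the same selection on each DataFrame and intersecting the selected lists is one way to find a feature common to all of them.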
However, our analysis is not yet complete: we found which factors affect bankruptcy, but not the direction of their effect (that is, whether bankruptcy becomes more likely when a particular ratio increases or decreases).
To get a complete...