Using charts and graphs
Visualization is normally the end goal for most of my work, so for me, this is a natural next step. My plan is to create a bar graph that shows the counts of unique values within the data. I think this might give us some insight into which factors affect the dependent variable in this study: whether a subject has early-onset PD. However, there's a problem to deal with first. As shown in Figure 14.21, there are still some holes in the data that I will need to account for before I begin analysis in earnest.
First, I'll create a bar chart to visualize our missing data. The following code cell handles this:
#%%
missing_data = combined_user_df.isnull().sum()
g = sns.barplot(x=missing_data.index, y=missing_data)
g.set_xticklabels(labels=missing_data.index, rotation=90)
plt.show()
Running this code produces the visualization shown in Figure 14.22:
Figure 14.22...
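To make it clearer what the cell above is counting, here is a minimal, self-contained sketch of the same idea on a tiny made-up DataFrame. The toy data, its column names, the missing_pct variable, and the plot_missing helper are all my own assumptions for illustration and are not part of the study's data. I have also swapped seaborn for plain matplotlib here so the sketch needs one less dependency.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical toy data standing in for combined_user_df; the column names are invented.
toy_df = pd.DataFrame({
    "age": [54, 61, np.nan, 47],
    "gender": ["F", None, "M", None],
    "tremors": [True, False, True, True],
})

# isnull() marks each missing cell as True; sum() then counts the Trues per column.
missing_data = toy_df.isnull().sum()

# Share of rows that are missing in each column, as a percentage.
missing_pct = missing_data / len(toy_df) * 100


def plot_missing(counts):
    """Draw one bar per column; bar height is that column's missing-value count."""
    fig, ax = plt.subplots()
    ax.bar(list(counts.index), counts.values)
    ax.tick_params(axis="x", labelrotation=90)  # rotate labels so long names stay readable
    ax.set_ylabel("Missing values")
    fig.tight_layout()
    return ax
```

Calling plot_missing(missing_data) followed by plt.show() should draw a chart in the same spirit as the one above. Looking at missing_pct alongside the raw counts can help when deciding whether a column is worth keeping at all.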