Since we have spent a significant amount of time discussing death rates, let us conclude this chapter with one final analysis of two cancer datasets. We have obtained de-identified clinical datasets for breast cancer and brain tumors from http://www.cbioportal.org/; our goal is to see what the overall survival outcomes look like and whether the two cancers have statistically different survival outcomes. The datasets are explored here for research purposes only:
# The clinical datasets are in TSV format
# We can use the .read_csv() method and add an argument sep='\t'
# to construct the dataframe
gbm_df = pd.read_csv('https://github.com/PacktPublishing/Matplotlib-2.x-By-Example/blob/master/gbm_tcga_clinical_data.tsv',sep='\t')
gbm_primary_df = gbm_df[gbm_df['Sample Type']=='Primary Tumor'...
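The loading and filtering pattern used above can be tried without downloading anything. The following is a minimal sketch, assuming a tiny inline table in place of the real clinical file: the rows, the `Patient ID` and `Overall Survival (Months)` columns, and the `demo_` names are all made up for illustration, and only the `Sample Type` column and the `'Primary Tumor'` value come from the code above.

```python
import io

import pandas as pd

# Hypothetical stand-in for a clinical TSV file; every row here is invented
tsv_text = (
    "Patient ID\tSample Type\tOverall Survival (Months)\n"
    "P1\tPrimary Tumor\t12.5\n"
    "P2\tRecurrent Tumor\t8.0\n"
    "P3\tPrimary Tumor\t30.1\n"
)

# sep='\t' tells read_csv to split fields on tabs rather than commas
demo_df = pd.read_csv(io.StringIO(tsv_text), sep='\t')

# The boolean mask keeps only rows whose 'Sample Type' is 'Primary Tumor'
demo_primary_df = demo_df[demo_df['Sample Type'] == 'Primary Tumor']

print(demo_primary_df)
```

When the source is a URL rather than an in-memory string, `read_csv` expects the address to return the tab-separated text itself, not a web page that displays it.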