Handling never-before-seen metrics
Before proceeding, note that the top 20 table doesn’t cover every metric used in competitions: some metrics have been used only once in recent years.
Let’s continue with the results from the previous code to find out how many competitions per year relied on such metrics:
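# Group the filtered competitions by evaluation metric
counts = (df[time_select&competition_type_select]
          .groupby('EvaluationAlgorithmAbbreviation'))
# Total number of competitions held each year
total_comps_per_year = (df[time_select&competition_type_select]
                        .groupby('year').sum())
# Keep metrics used exactly once; such a metric has a single row, so its
# summed 'year' is still the actual year and can be grouped on
single_metrics_per_year = (counts.sum()[counts.sum().comps==1]
                           .groupby('year').sum())
single_metrics_per_year
# Share of each year's competitions that used a single-use metric
table = (total_comps_per_year.rename(columns={'comps': 'n_comps'})
         .join(single_metrics_per_year / total_comps_per_year)
         .rename(columns={'comps': 'pct_comps'}))
print(table)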
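To see the logic in isolation, here is a minimal sketch on a small, made-up DataFrame. The data, the metric names `MetricA`, `MetricB`, and `MetricC`, and the variable names are hypothetical; only the column names `year`, `EvaluationAlgorithmAbbreviation`, and `comps` follow the code above, and we assume one row per competition with `comps` equal to 1. Instead of relying on the summed `year` column, this variant selects the single-use metrics explicitly with `isin`:

```python
import pandas as pd

# Hypothetical toy data: one row per competition, comps == 1 acts as a counter
toy = pd.DataFrame({
    'year': [2019, 2019, 2019, 2020, 2020, 2020],
    'EvaluationAlgorithmAbbreviation': ['AUC', 'AUC', 'MetricA',
                                        'AUC', 'MetricB', 'MetricC'],
    'comps': [1, 1, 1, 1, 1, 1],
})

# Count competitions per metric and keep the metrics that appear exactly once
per_metric = toy.groupby('EvaluationAlgorithmAbbreviation')['comps'].sum()
single_use = per_metric[per_metric == 1].index

# Competitions per year: all of them, and those using a single-use metric
n_comps = toy.groupby('year')['comps'].sum()
n_single = (toy[toy['EvaluationAlgorithmAbbreviation'].isin(single_use)]
            .groupby('year')['comps'].sum())

# Years without any single-use metric would yield NaN, so fill them with 0
toy_table = pd.DataFrame({
    'n_comps': n_comps,
    'pct_comps': (n_single / n_comps).fillna(0),
})
print(toy_table)
```

In this toy example, one of the three 2019 competitions and two of the three 2020 competitions use a metric that never appears again, so `pct_comps` is one third and two thirds respectively.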
As a result, we get the...