Interpretable text classification from electronic health records
We can apply many NLP techniques to EHRs to uncover semantic relationships in the text, as we saw in a previous use case. However, how do we know that the resulting topics make sense? In this use case, we look at a proposal [8] for objective evaluation metrics for topic results.
Background
As we said when discussing previous use cases, the clinical notes in EHRs hold great potential for predictive tasks. Various topic modeling techniques can be applied to these texts, and the resulting topics can then serve as features for a classifier. Data science researchers have come to realize that the interpretability of these classification models is a key aspect.
Questions
It is one thing to build many topic models, but selecting the most appropriate model for production use is not trivial. Is there an objective and systematic way to compare models? What are good evaluation metrics?
NLP solution
The authors [8] believe that interpretability...