An ML use case for genomics – Disease prediction
Let’s illustrate the power of ML for genomic applications, starting with classification models. These are a subset of supervised ML methods whose goal is to assign each sample to one of two (binary classification) or more (multiclass classification) classes based on the independent variables.
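To make the distinction between binary and multiclass classification concrete, here is a minimal sketch in plain Python. It uses a nearest-centroid classifier, chosen only because it is short and easy to follow; it is not the model built in this chapter. The data values, the class labels, and the helper names `fit_centroids` and `predict` are all made up for illustration.

```python
from statistics import mean


def fit_centroids(X, y):
    """Compute the mean feature vector (centroid) of each class."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        # zip(*rows) iterates over columns, i.e. one independent variable at a time
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], x))


# Hypothetical training data: rows are samples, columns are two independent variables.
X_train = [[1.0, 1.2], [0.8, 1.0], [1.2, 0.9],
           [3.0, 3.1], [2.9, 3.3], [3.2, 2.8]]

# Binary classification: exactly two possible classes.
y_binary = ["negative"] * 3 + ["positive"] * 3
binary_model = fit_centroids(X_train, y_binary)
binary_prediction = predict(binary_model, [2.8, 3.0])
print("binary:", binary_prediction)

# Multiclass classification: the same procedure with a third class added.
X_multi = X_train + [[5.0, 0.5], [5.2, 0.7], [4.9, 0.4]]
y_multi = y_binary + ["other"] * 3
multi_model = fit_centroids(X_multi, y_multi)
multi_prediction = predict(multi_model, [5.1, 0.6])
print("multiclass:", multi_prediction)
```

The only difference between the two cases is the number of distinct labels in the target; the features and the fitting procedure stay the same. In practice, a library implementation with proper train/test splitting and evaluation would replace this toy classifier.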
One popular use case in genomics is outcome prediction. In this use case, we will try to predict whether a patient has lung cancer based on gene expression. Before we build the model and use it to make predictions, let’s understand how a typical ML disease prediction model works here. It maps the relationships between the gene expression values of individual patient samples (features) and the target variable (Normal versus Tumor); in other words, it maps the pattern of the features within the expression data to the target variable. In this example,...