Applying ML to genomic workflows
ML has become an important technology that is applied throughout the genomic sequencing workflow and in the interpretation of genomic data in general. It plays a role in processing data, deriving insights, and running searches, all of which are key applications in genomics.
One of the primary drivers of ML in genomic workflows is the sheer volume of genomic data to analyze. As you may recall, ML works by recognizing patterns in unseen data, using what a model has learned from a previous subset of data. This process is much more computationally efficient than applying complex rules to large genomic datasets. Another driver is that the field of ML has evolved, and computational resources such as GPUs have become more accessible to researchers performing genomic research. Proven techniques can now produce accurate models on different modalities of genomic information. Unsupervised algorithms such as principal component analysis (PCA) and clustering help pre-process...