Applying Machine Learning to Genomics
The genome defines a living organism. It resides in the organism's cells and dictates how they perform essential functions. It also differentiates one organism from another by giving each a unique set of characteristics. The genome is the complete deoxyribonucleic acid (DNA) of an organism. This DNA resides mostly in the nucleus of the cell (and in small quantities within the mitochondria) and encodes information about the organism as a sequence of four chemical bases: thymine (T), cytosine (C), guanine (G), and adenine (A). These bases are arranged in a double helix structure, which is now synonymous with how DNA is represented graphically. The double helix is composed of two strands of nucleotides that bond with each other to form pairs of chemical bases (also known as base pairs), with A pairing with T and G pairing with C. Genomes may also contain sequences of ribonucleic acid (RNA), which is quite...
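The base-pairing rule described above (A with T, G with C) can be expressed as a simple lookup table. The following is a minimal Python sketch, not a reference implementation: the names `COMPLEMENT`, `complement`, and `base_pairs` are hypothetical and chosen only for illustration, and the sketch assumes a strand is given as a plain string containing only the four bases.

```python
# Pairing rule from the text: A pairs with T, and G pairs with C.
# COMPLEMENT is an illustrative name, not part of any library.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner base for each base in the strand, position by position.

    Assumes the strand contains only A, T, G, and C (case-insensitive);
    any other character raises a KeyError.
    """
    return "".join(COMPLEMENT[base] for base in strand.upper())


def base_pairs(strand: str) -> list[tuple[str, str]]:
    """Return the (base, partner) pairs formed along the strand."""
    upper = strand.upper()
    return list(zip(upper, complement(upper)))


if __name__ == "__main__":
    print(complement("ATGC"))   # TACG
    print(base_pairs("AG"))     # [('A', 'T'), ('G', 'C')]
```

Because each base has exactly one partner, knowing one strand is enough to recover the bases on the other, which is why sequence data is usually stored as a single string of letters.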