Next-Generation Sequencing
Next-generation sequencing (NGS) is one of the fundamental technological developments of the century in the life sciences. Whole-genome sequencing (WGS), restriction site-associated DNA sequencing (RAD-Seq), ribonucleic acid sequencing (RNA-Seq), chromatin immunoprecipitation sequencing (ChIP-Seq), and several other technologies are routinely used to investigate important biological problems. These are also called high-throughput sequencing technologies, and with good reason: they generate vast amounts of data that need to be processed. NGS is the main reason that computational biology has become a big-data discipline. Above all, this is a field that requires strong bioinformatics techniques.
Here, we will not discuss each individual NGS technique per se (this would require a whole book of its own). Instead, we will use an existing WGS dataset, the 1000 Genomes Project, to illustrate the most common steps necessary to analyze genomic data...