Rare is the new common. The final month of the year is always a good time to review progress and think about what's next. In genetics, massively parallel next-generation sequencing (NGS) technologies have been a dominating theme, and for good reason.
Unlike earlier high-throughput genetic analysis technologies (Sanger sequencing and microarrays), NGS lets us explore genomes in far greater depth and measure functional elements and gene expression genome-wide.
What have we learned?
Figure. Distribution of rare and common variants.
A consequence is that the exponentially growing human population has accumulated a very large number of rare genetic variants. Many of these variants can be predicted to affect phenotype, and many more may modify phenotypes in as yet unknown ways [4,5]. We are also learning that variants generally fall into two categories: they are either common to all populations or confined to specific populations (figure). More importantly, for a given gene the rare variants can vastly outnumber the previously known common variants.
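The two categories described above can be made concrete with a small sketch. It is illustrative only: the 0.5% frequency cutoff, the population labels, the variant names, and the frequencies are all assumptions made for the example, not values taken from the studies cited here.

```python
from collections import Counter

# Assumed cutoff for calling a variant "rare"; studies use different thresholds.
RARE_THRESHOLD = 0.005


def classify(freqs, rare_threshold=RARE_THRESHOLD):
    """Classify one variant from its allele frequency in each population.

    freqs maps a population label to the variant's allele frequency there.
    Returns a (frequency class, sharing class) pair.
    """
    observed = [pop for pop, freq in freqs.items() if freq > 0]
    frequency_class = "common" if max(freqs.values()) >= rare_threshold else "rare"
    if not observed:
        sharing = "not observed"
    elif len(observed) == len(freqs):
        sharing = "all populations"
    elif len(observed) == 1:
        sharing = "single population"
    else:
        sharing = "some populations"
    return frequency_class, sharing


# Hypothetical variants in one gene, with made-up frequencies.
variants = {
    "var_a": {"AFR": 0.21, "EUR": 0.18, "ASN": 0.25},
    "var_b": {"AFR": 0.002, "EUR": 0.0, "ASN": 0.0},
    "var_c": {"AFR": 0.0, "EUR": 0.001, "ASN": 0.0},
    "var_d": {"AFR": 0.0, "EUR": 0.0, "ASN": 0.003},
}

summary = Counter(classify(freqs) for freqs in variants.values())
for (frequency_class, sharing), count in summary.items():
    print(f"{frequency_class}, {sharing}: {count}")
```

In this toy gene, one common variant is shared by every population while three rare variants are each confined to a single population, which is the pattern the paragraph describes.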
Another consequence of the high abundance of rare variation is its impact on the resources used to measure variation and map disease to genotypes. For example, microarrays, which have been the primary tool of genome-wide association studies, use probes developed from a human reference genome sequence. When rare variants are factored in, many probes have issues ranging from "hidden" variation within a probe to a probe simply not being able to measure a variant that is present. Linkage block size is also affected. What this means is that the best arrays going forward will be tuned to specific populations. It also means we need to devote more energy to developing refined reference resources, because the current tools do not adequately account for human diversity [6,7].
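As a rough illustration of "hidden" variation, the sketch below flags array probes whose target sequence overlaps known variant positions. The probe names, coordinates, and variant positions are hypothetical, and real array annotation involves far more than a coordinate overlap check.

```python
from bisect import bisect_left, bisect_right
from typing import NamedTuple


class Probe(NamedTuple):
    name: str
    chrom: str
    start: int  # first base covered by the probe (1-based, inclusive)
    end: int    # last base covered by the probe (inclusive)


def hidden_variants(probe, variant_positions):
    """Return known variant positions that fall inside the probe.

    variant_positions maps a chromosome name to a sorted list of positions.
    """
    positions = variant_positions.get(probe.chrom, [])
    low = bisect_left(positions, probe.start)
    high = bisect_right(positions, probe.end)
    return positions[low:high]


# Hypothetical probes and variant positions, for illustration only.
probes = [
    Probe("probe_1", "chr1", 100, 124),
    Probe("probe_2", "chr1", 200, 224),
]
known_variants = {"chr1": [90, 110, 118, 300]}

for probe in probes:
    hits = hidden_variants(probe, known_variants)
    status = f"overlaps variants at {hits}" if hits else "no known variants"
    print(f"{probe.name}: {status}")
```

Keeping the positions sorted lets each probe be checked with two binary searches, which matters when an array carries hundreds of thousands of probes.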
Rare genetic variation has been understood for some time. What's new is understanding just how extensive these variants are in the human population, a result of the population's recent rapid expansion under very little selective pressure. Hence, linking variation to health and disease is the next big challenge and the cornerstone of personalized medicine or, as some prefer, precision medicine. Conquering this challenge will require detailed descriptions of phenotypes, in many cases at the molecular level. As the vast majority of variants, benign or pathogenic, lie outside of coding regions, we will need to deeply understand how those functional elements, as initially defined by ENCODE, are affected by rare variation. We will also need to layer in epigenetic modifications.
For the next several years the picture will be complex.
1. 1000 Genomes Project Consortium (2012). An integrated map of genetic variation from 1,092 human genomes. Nature, 491 (7422), 56-65. PMID: 23128226
2. Kong, A., et al. (2012). Rate of de novo mutations and the importance of father’s age to disease risk. Nature, 488 (7412), 471-475. DOI: 10.1038/nature11396
3. Keinan, A., and Clark, A. (2012). Recent Explosive Human Population Growth Has Resulted in an Excess of Rare Genetic Variants. Science, 336 (6082), 740-743. DOI: 10.1126/science.1217283
4. Tennessen, J., et al. (2012). Evolution and Functional Impact of Rare Coding Variation from Deep Sequencing of Human Exomes. Science, 337 (6090), 64-69. DOI: 10.1126/science.1219240
5. Nelson, M., et al. (2012). An Abundance of Rare Functional Variants in 202 Drug Target Genes Sequenced in 14,002 People. Science, 337 (6090), 100-104. DOI: 10.1126/science.1217876
6. Rosenfeld, J.A., Mason, C.E., and Smith, T.M. (2012). Limitations of the human reference genome for personalized genomics. PLoS ONE, 7 (7). PMID: 22811759
7. Smith, T.M., and Porter, S.G. (2012). Genomic Inequality. The Scientist.