FinchTalk: From Reads to Data Sets, Why Next Gen is not like Sanger Sequencing

Tuesday, January 6, 2009

From Reads to Data Sets, Why Next Gen is not like Sanger Sequencing

Time is running out! Be sure to register for the ABRF 2009 Next Generation DNA Sequencing Workshop, to be held in Memphis, TN, on Saturday, February 7, 2009. The day will include discussions from core lab directors and others about how to implement new sequencing technologies in a core lab environment. We'll consider how next generation sequencing differs from Sanger sequencing and devote much of the day to learning about the practical impact of those differences.

An introduction:

Initially, DNA sequencing was performed to learn about the sequences and structure of cloned genes. The first widely used sequencing systems were based on the “Sanger” method. Radioactively labeled DNA was synthesized in the presence of chain-terminating dideoxynucleotides (1). Mixtures of DNA fragments were separated by size using gel electrophoresis, and the bases were identified and entered into a computer manually. Automated DNA sequencing instruments arrived later. These instruments made DNA sequencing thousands of times more efficient by detecting fluorescently labeled fragments and sending the information directly to a computer (2). For the first time, it became possible to sequence entire human genomes (3,4).

While highly successful, Sanger sequencing is cost-prohibitive for deeper investigations of biological systems. Just as the questions investigated by Sanger sequencing shifted from single genes to entire genomes, the questions being asked with Next Generation techniques are changing as well. Questions about transcription and promoter occupancy, for example, can now be answered by using a massively parallel format to sample large collections of individual molecules. In effect, every RNA or DNA molecule might be sampled and counted. We are looking not only at a “Next Generation” of DNA sequencing, but also at a next generation of experimental techniques that answer different kinds of questions than those we asked before. These new technologies require fundamental changes in experiment design and data analysis systems.

1) Sanger F., Nicklen S., Coulson A.R., 1977. “DNA sequencing with chain-terminating inhibitors.” Proc Natl Acad Sci U S A 74, 5463-5467.

2) Smith L.M., Sanders J.Z., Kaiser R.J., Hughes P., Dodd C., Connell C.R., Heiner C., Kent S.B., Hood L.E., 1986. “Fluorescence detection in automated DNA sequence analysis.” Nature 321, 674-679.

3) International Human Genome Sequencing Consortium, 2001. “Initial sequencing and analysis of the human genome.” Nature 409, 860-921.

4) Venter J.C., Adams M.D., Myers E.W., et al., 2001. “The sequence of the human genome.” Science 291, 1304-1351.
