Wednesday, August 10, 2011

Stitching Protein-Protein Interactions via DNA Sequencing

Stitch-seq, one of the newest additions to Next Generation DNA Sequencing (NGS), was presented in June's Nature Methods.

Back in 2008, when groups were realizing the power of NGS technologies, I titled a post "Next Gen Sequencing is not Sequencing DNA" to make the point that massively parallel, ultra-high-throughput DNA sequencing could be used for quantitative assays that measure transcriptome expression, protein-DNA interactions, methylation patterns, and more. Stitch-seq can now be added to a growing list of assays that includes RNA-Seq, DNase-Seq, ChIP-Seq, HITS-CLIP, and others.

Stitch-seq explores the interactome, a term used to describe how the molecules of a cell interact in networks to carry out life's biochemical activities. Understanding how these networks are controlled by genetics and environmental stimuli is critical to discovering biomarkers that can be used to stratify disease and target highly specific therapies. However, the interactome is complex; studying it requires that interactions be identified at large scale.

Many interactome studies focus on proteins. Traditional approaches involve specially constructed gene reporter systems. For example, in the two-hybrid approach, a portion of a protein-encoding gene is combined with a gene fragment containing the DNA-binding domain of a transcription factor (bait). In another construct, a different protein-encoding region is combined with the fragment of the same transcription factor that recruits RNA polymerase, the activation domain (prey).

When the DNA constructs are expressed, interactions can be measured by gene expression. If the protein attached to the bait interacts with the protein attached to the prey, the two halves of the transcription factor are brought together and transcription is initiated at the reporter gene. When reporter genes confer growth on selective media, interacting protein-encoding segments can be identified by isolating the DNA from growing cells and sequencing the DNA constructs.

Therein lies the rub

Until now, interactome studies combined high-throughput assay systems with low-throughput characterization systems that PCR-amplified the individual constructs and characterized the DNA by Sanger sequencing. Yu and colleagues overcame this problem by devising a new strategy, "stitch-PCR," that puts the potentially interacting domains on a common DNA fragment, so that libraries can be prepared and easily sequenced by NGS methods. Using this method, the team was able to increase overall assay throughput by 42% and measure thousands of interactions.
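
To make the idea concrete, here is a minimal sketch of why stitching matters for analysis: once both partners sit on one fragment, a single read identifies the pair. This is not the authors' pipeline. The linker sequence, the six-base tags, the ORF names, and the function names are all hypothetical, made up for illustration; a real analysis would align each half of the read to a reference ORF collection.

```python
from collections import Counter

# Hypothetical linker joining the two halves of a stitched amplicon
# (an assumption for this sketch, not a sequence from the paper).
LINKER = "GGATCCGAATTC"

# Hypothetical lookup tables mapping a short sequence tag to an ORF name.
BAIT_TAGS = {"ACGTAC": "ORF_A", "TTGACC": "ORF_B"}
PREY_TAGS = {"CCATGG": "ORF_X", "GATTAC": "ORF_Y"}


def pair_from_read(read):
    """Return (bait, prey) for a stitched read, or None if it cannot be parsed."""
    left, sep, right = read.partition(LINKER)
    if not sep:
        return None  # linker missing: not a stitched product
    bait = BAIT_TAGS.get(left[-6:])  # tag assumed to sit just before the linker
    prey = PREY_TAGS.get(right[:6])  # tag assumed to sit just after the linker
    if bait is None or prey is None:
        return None  # one of the halves is not in our toy lookup tables
    return bait, prey


def count_pairs(reads):
    """Count how many reads support each candidate bait-prey pair."""
    pairs = (pair_from_read(read) for read in reads)
    return Counter(pair for pair in pairs if pair is not None)


reads = [
    "NNACGTAC" + LINKER + "CCATGGNN",
    "NNACGTAC" + LINKER + "CCATGGNN",
    "NNTTGACC" + LINKER + "GATTACNN",
    "NNACGTACCCATGGNN",  # no linker, so this read is ignored
]
pair_counts = count_pairs(reads)
```

Without the stitching step, bait and prey sequences from a pooled library would land in separate reads, and there would be no way to tell which bait went with which prey.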

While still low-throughput relative to the kinds of numbers we're used to with NGS, increasing the throughput of protein interaction assays is an important step toward making systems biology experiments more scalable. It also adds another Seq to our growing collection of Assay-Seq methods.

Yu, H., Tardivo, L., Tam, S., Weiner, E., Gebreab, F., Fan, C., Svrzikapa, N., Hirozane-Kishikawa, T., Rietman, E., Yang, X., Sahalie, J., Salehi-Ashtiani, K., Hao, T., Cusick, M., Hill, D., Roth, F., Braun, P., & Vidal, M. (2011). Next-generation sequencing to generate interactome datasets. Nature Methods, 8(6), 478-480. DOI: 10.1038/nmeth.1597