Friday, March 19, 2010

RNA Deep Sequencing - Beyond Proof of Concept

ABRF 2010 begins this weekend. In addition to my LIMS presentation on Sunday, I will present our poster featuring data analysis of sequences from "Sex-specific and lineage-specific alternative splicing in primates" (Blekhman et al.) in GeneSifter Analysis Edition.

The poster number is RP-3. Stop by and see how we learned that not all samples are what they seem to be ...

Abstract 

Next Generation DNA Sequencing (NGS) technologies are powerful tools for rapidly sequencing genomes and studying functional genomics. To date, the value of NGS technology has largely been demonstrated through analyses of individual samples (1-3). The full potential of NGS will be realized when it can be used in multisample experiments that involve different measurements and include replicates and controls, so that valid statistical comparisons can be made. Arguably, improvements in current technology, and soon-to-be-available “third” generation systems, will make it possible to simultaneously measure hundreds to thousands of individual samples in single experiments to study transcription, alternative splicing, and how sequences vary between individuals and within expressed genes. However, several bioinformatics systems challenges must be overcome to effectively manage both the volumes of data being produced and the complexity of processing the numerous datasets that will be generated.

In this poster we present a system and use it to verify and further characterize previously published data from a gene expression study that includes both replicates and experimental values comparing sex- and lineage-specific alternative splicing in primates (4). This system, developed on a high performance computing architecture, stores and organizes the data, aligns millions of reads to different reference sequences, identifies and removes artifacts, executes comparative and statistical analyses, and finally links results to pathway and ontological information for making discoveries and confirming hypotheses. The supporting infrastructure includes intuitive user interfaces for organizing data, executing analytical operations, viewing summarized reports, and navigating through details in the results, and it can be easily operated by biologists.

1. Marioni JC, et al. (2008) Genome Res.

2. Ramsköld D, et al. (2009) PLoS Comput Biol.

3. Pleasance ED, et al. (2010) Nature.

4. Blekhman R, et al. (2009) Genome Res.
