Thursday, March 10, 2011

Sneak Peek: The Next Generation Challenge: Developing Clinical Insights Through Data Integration

Next week (March 14-18, 2011) is CHI's X-Gen Congress & Expo. I'll be there presenting a poster on the next challenge in bioinformatics, also known as the information bottleneck.

You can follow the tweet-by-tweet action via @finchtalk or #XGenCongress.

In the meantime, enjoy the poster abstract.

The next generation challenge: developing clinical insights through data integration

Next generation DNA sequencing (NGS) technologies hold great promise as tools for building a new understanding of health and disease. In the case of understanding cancer, deep sequencing provides more sensitive ways to detect the germline and somatic mutations that cause different types of cancer, as well as to identify new mutations within small subpopulations of tumor cells that can be prognostic indicators of tumor growth or drug resistance. Intense competition among NGS platform vendors and service providers is commoditizing data collection, making data more accessible. However, the single greatest impediment to developing relevant clinical information from these data is the lack of systems that give researchers easy access to the immense bioinformatics and IT infrastructure needed to work with the data.

In the case of variant analysis, such systems will need to process very large datasets and accurately predict common, rare, and de novo levels of variation. Genetic variation must be presented in an annotation-rich, biological context to determine its clinical utility, frequency, and putative biological impact. Software systems used for this work must integrate data from many samples together with resources ranging from core analysis algorithms to application-specific datasets to annotations, all woven into computational systems with interactive user interfaces (UIs). Such end-to-end systems do not yet exist, but the parts are emerging.

Geospiza is improving how researchers understand their data in terms of its biological context, function, and potential clinical utility by developing methods that combine assay results from many samples with existing data and information resources from dbSNP, 1000 Genomes, cancer genome databases, GEO, SRA, and others. Through this work and follow-on product development, we will produce integrated, sensitive assay systems that harness NGS to identify very low (1:1000) levels of change between DNA sequences in order to detect cancerous mutations, emerging drug resistance, and early-stage signaling cascades.

Authors: Todd M. Smith(1), Christopher Mason(2)
(1). Geospiza Inc., Seattle, WA 98119, USA.
(2). Weill Cornell Medical College, New York, NY 10021, USA.
