FinchTalk: Translational Bioinformatics

Wednesday, April 6, 2011

Sneak Peek: RNA-Sequencing Applications in Cancer Research: From fastq to differential gene expression, splicing and mutational analysis

Join us next Tuesday, April 12 at 10:00 am PST for a webinar focused on RNA-Seq applications in breast cancer research.

The field of cancer genomics is advancing quickly. News reports from the annual American Association for Cancer Research meeting indicate that whole genome sequencing studies, such as the 50 breast cancer genomes (WashU), are providing more clues about the genes that may be affected in cancer. Meanwhile, the ACLU/Myriad Genetics legal action over genetic testing for breast cancer mutations and disease predisposition continues to move toward the Supreme Court.

Breast cancer, like many other cancers, is complex. Sequencing genomes is one way to interrogate cancer biology. However, the genome sequence data in isolation does not tell the complete story. The RNA, representing expressed genes, their isoforms, and non-coding RNA molecules, needs to be measured too. In this webinar, Eric Olson, Geospiza's VP of product development and principal designer of GeneSifter Analysis Edition, will explore the RNA world of breast cancer and present how you can explore existing data to develop new insights.

Abstract
Next Generation Sequencing applications allow biomedical researchers to examine the expression of tens of thousands of genes at once, giving researchers the opportunity to examine expression across entire genomes. RNA Sequencing applications such as Tag Profiling, Small RNA and Whole Transcriptome Analysis can identify and characterize both known and novel transcripts, splice junctions and non-coding RNAs. These sequencing-based applications also allow for the examination of nucleotide variants. Next Generation Sequencing and these RNA applications allow researchers to examine the cancer transcriptome at an unprecedented level. This presentation will provide an overview of the gene expression data analysis process for these applications with an emphasis on identification of differentially expressed genes, identification of novel transcripts and characterization of alternative splicing as well as variant analysis and small RNA expression. Using data drawn from the GEO data repository and the Short Read Archive, NGS Tag Profiling, Small RNA and NGS Whole Transcriptome Analysis data will be examined in Breast Cancer.
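
To make the idea of "identification of differentially expressed genes" concrete, here is a minimal sketch in Python. It is not the method used in GeneSifter or in the webinar; it only illustrates one common first step, assuming two samples with raw read counts per gene: scale each sample to reads per million, then compute a log2 fold change with a small pseudocount. The gene names, counts, and function names are all hypothetical, and a real analysis would use replicates and a statistical test rather than fold change alone.

```python
import math


def reads_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_fold_changes(tumor, normal, pseudocount=1.0):
    """Return log2(tumor / normal) per gene, computed on normalized values.

    The pseudocount avoids division by zero for genes with no reads.
    Only genes present in both samples are compared.
    """
    t = reads_per_million(tumor)
    n = reads_per_million(normal)
    return {
        gene: math.log2((t[gene] + pseudocount) / (n[gene] + pseudocount))
        for gene in t
        if gene in n
    }


# Hypothetical counts, invented purely for illustration.
tumor_counts = {"GENE_A": 400, "GENE_B": 100, "GENE_C": 500}
normal_counts = {"GENE_A": 100, "GENE_B": 100, "GENE_C": 800}

changes = log2_fold_changes(tumor_counts, normal_counts)

# List genes from the largest to the smallest absolute change.
for gene, lfc in sorted(changes.items(), key=lambda kv: -abs(kv[1])):
    print(f"{gene}\t{lfc:+.2f}")
```

A positive value means the gene has relatively more reads in the tumor sample, a negative value means fewer, and a value near zero means no apparent change.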

You can register at the webex site, or view the slides after the presentation.

Wednesday, March 23, 2011

Translational Bioinformatics

During the week of March 7, I had the pleasure of attending the American Medical Informatics Association’s (AMIA) Summit on Translational Bioinformatics (TBI) at the Parc 55 Hotel in San Francisco.

What is Translational Bioinformatics?
Translational Bioinformatics can be simply defined as computer-related activities designed to extract clinically actionable information from very large datasets. The field has grown from a need to develop computational methods to work with continually increasing amounts of data and an ever-expanding universe of databases.

As we celebrate the 10th anniversary of completing the draft sequence of the human genome [1,2], we are often reminded of the promise that this achievement would transform medicine. The genome would be used to develop a comprehensive catalog of all genes and, through this catalog, we would be able to identify all disease genes and develop new therapies to combat disease. However, since the initial sequence, we have also witnessed an annual decrease in new drugs entering the marketplace. While progress is being made, it's just not moving at speeds consistent with the excitement produced by the first and “final” drafts [3,4] of the human genome.

What happened?

Biology is complex. Through follow-on endeavors, especially with the advent of massively parallel low-cost sequencing, we’ve begun to expose the enormous complexity and diversity of the nearly seven billion genomes that comprise the human species. Additionally, we’ve begun to examine the hundred trillion, or so, microbial genomes that make a human “ecosystem.” A theme that has become starkly evident is that our health and wellness are controlled by our genes and by how they are modified and expressed in response to environmental factors. Or, as described in one slide,

phenotype = f(genes, environment).

Conferences, like the Joint Summits on Translational Bioinformatics and Clinical Research Informatics, create a forum for individuals working on clinically related computation, bioinformatics and medical informatics problems to come together and share ideas. This year’s meeting was the fourth annual. The TBI meeting had four tracks: 1) Concepts, Tools and Techniques to Enable Integrative Translational Bioinformatics, 2) Integrative Analysis of Molecular and Clinical Measurements, 3) Representing and Relating Phenotypes and Disease for Translational Bioinformatics, and 4) Bridging Basic Science Discoveries and Clinical Practice. To simplify these descriptions, I’d characterize the attendees as participating in five kinds of activities:

Creating new statistical models to analyze data
Integrating diverse data to create new knowledge
Promoting ontologies and standards
Developing social media infrastructures and applications
Using data to perform clinical research and practice

As translational bioinformatics is about analyzing and mining data to develop clinical knowledge and applications, conference attendees represent a broad collection of informatics domains. Statisticians reduce large amounts of raw data into results that need to be compared and integrated with diverse information to functionally characterize biology. This activity benefits from standardized data descriptions and clinical phenotypes that are organized into ontologies. Because this endeavor requires large teams of clinical researchers, statisticians, and bioinformaticians, social media plays a significant role in finding collaborators and facilitating communication and data sharing between members of large decentralized projects. Finally, it’s not translational unless there is clinical utility.

What did we learn?
The importance of translational bioinformatics is growing. This year’s summit had about 470 attendees with nearly 300 attending TBI, a 34% growth over 2010 attendance. In addition to many talks and posters on databases, ontologies, statistical methods, and clinical associations of genes and genotypes, we were entertained by Carl Zimmer’s keynote on the human microbiome.

In his presentation, Zimmer made the case that we need to study human beings as ecosystems. After all, our bodies contain between 10 and 100 times more microbial cells than human cells. Zimmer showed how our microbiome becomes populated, from an initially sterile state, through the same niche and succession paradigms that have been demonstrated in other ecosystems. While microbial associations with disease are clear, it is important to note that our microbiome protects us from disease and performs a large number of essential biochemical reactions. In reality, our microbiome serves as an additional organ - or two. Thus, to really understand human health, we need to understand the microbiome. In terms of completing the human ecosystem, we have only peeked at the tip of a very large iceberg, which only gets bigger when we consider bacteriophage and viruses.

Russ Altman closed the TBI portion of the joint summit with his now-annual year in review. The goals of this presentation were to highlight major trends and advances of the past year, focus on what seems to be important now, and predict what might be accomplished in the coming year. From ~63 papers, 25 were selected for presentation. These were organized into Personal Genomics, Drugs and Genes, Infrastructure for TB, Sequencing and Science, and Warnings and Hope. You can check out the slides to read the full story.

My takeaway is that we’ve clearly initiated personal genomics with both clinical and do-it-yourself perspectives. Semantic standards will improve computational capabilities, but we should not hesitate to mine and use data from medical records, participant-driven studies, and previously collected datasets in our association studies. Pharmacogenetics/genomics will change drug treatments from one-size-fits-all-benefits-few approaches to specific drugs for stratified populations, and multi-drug therapies will become the norm. Deep sequencing continues to reveal deep complexity in our genome, cancer genomes, and the microbiome.

Altman closed with his 2011 predictions. He predicted that consumer sequencing (vs. genotyping) will emerge, cloud computing will contribute to a major biomedical discovery, informatics application to stem cell science will emerge, important discoveries will be made from text mining along with population-based data mining, systems modeling will suggest useful polypharmacy, and immune genomics will emerge as powerful data.

I expect many of these will come true, as our [Geospiza's] research and development and customer engagements are focused on many of the above targets.

References
1. Nature, Vol. 409, no. 6822 pp. 745-964
2. Science, Vol. 291, no. 5507 pp. 1145-1434
3. Nature, Vol. 422, no. 6934 pp. 835-847
4. Science, Vol. 300, no. 5617 pp. 286-290

AMIA - https://www.amia.org