Developing Bioinformatics Applications with BioHDF

HDF5 is an open-source technology suite for managing diverse, complex, high-volume data in heterogeneous computing and storage environments. The BioHDF project is investigating the use of HDF5 for working with very large scientific datasets. HDF5 provides a hierarchical data model, binary file format, and collection of APIs supporting data access. BioHDF will extend HDF5 to support DNA sequencing requirements.
Initial prototyping of BioHDF has demonstrated clear benefits. Data can be compressed and indexed in BioHDF to reduce storage needs and enable very rapid (typically a few milliseconds) random access into sequence and alignment datasets, essentially independent of the overall HDF5 file size. Through additional prototyping activities, we have identified key architectural elements and tools that will form BioHDF.
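To illustrate why compressed, indexed storage can give access times that do not grow with file size, here is a minimal sketch in plain Python. It is not the BioHDF or HDF5 API: the names `build_store`, `read_record`, and `CHUNK_SIZE` are hypothetical, and the standard-library `zlib` module stands in for the compression filters a real HDF5 file would apply to chunked datasets. The idea is only that records are compressed in fixed-size chunks and a small index records where each chunk lives, so a lookup decompresses a single chunk.

```python
import zlib

# Hypothetical chunk size (records per chunk); real chunks would be far larger.
CHUNK_SIZE = 4


def build_store(records):
    """Compress records chunk by chunk and return (blob, index).

    index[i] is the (offset, length) of compressed chunk i inside blob.
    Records are assumed to be ASCII strings without newlines.
    """
    blob = bytearray()
    index = []
    for start in range(0, len(records), CHUNK_SIZE):
        chunk = "\n".join(records[start:start + CHUNK_SIZE]).encode("ascii")
        packed = zlib.compress(chunk)
        index.append((len(blob), len(packed)))
        blob += packed
    return bytes(blob), index


def read_record(blob, index, n):
    """Fetch record n by decompressing only the chunk that contains it."""
    offset, length = index[n // CHUNK_SIZE]
    chunk = zlib.decompress(blob[offset:offset + length])
    return chunk.decode("ascii").split("\n")[n % CHUNK_SIZE]


# Ten made-up sequence reads, used purely as sample data.
reads = ["ACGT" * 5 + "ACGT"[i % 4] for i in range(10)]
blob, index = build_store(reads)
print(read_record(blob, index, 7) == reads[7])
```

The work done by `read_record` depends on the chunk size rather than on the total number of records, which is the property the prototype results above describe.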
The BoF session will include a presentation of the current state of BioHDF and proposed implementations to encourage discussion of future directions.