HTSJDK: Enhancing the Java Toolkit for Emerging Sequencing Technologies
Proposal Summary
To enhance the HTSJDK Java toolkit for genomics with an extensible plugin framework that will enable support for emerging technologies required by contemporary analysis methods, such as long reads, graph/circular references, and epigenetic modifications.
Project
HTSJDK is one of the two reference implementations that the GA4GH organization (ga4gh.org) requires for the core genomics file formats it maintains. These formats, such as BAM and VCF, are ubiquitous within the genomics community, and HTSJDK serves as the foundation for numerous gold-standard genomics software tools, including GATK (gatk.broadinstitute.org), Picard, and IGV. These tools have been used by some of the most influential and widely cited life sciences projects to date, including The 1000 Genomes Project, The Cancer Genome Atlas, and The Framingham Heart Study.