Maintenance and Improvement of Validated, Community Developed NGS Analyses
Proposal Summary
To improve the bcbio-nextgen toolkit, focusing on maintaining existing variant calling functionality and extending support for structural and RNA-seq variant analyses.
Project
bcbio-nextgen is a Python toolkit providing best-practice pipelines for fully automated high-throughput sequencing analysis. The user writes a high-level configuration file specifying inputs and analysis parameters, which drives a parallel pipeline that handles distributed execution, idempotent processing restarts, and safe transactional steps. The goal is to provide a shared community resource that automates the data processing component of sequencing analysis using validated, current, best-practice methods, giving researchers more time to focus on the downstream biology. bcbio-nextgen can be installed on any UNIX-based environment without administrative privileges, and it installs all of the tools and data needed to run a wide variety of NGS analyses. bcbio-nextgen is scalable and agnostic to the underlying compute infrastructure: it can run on a local machine, in HPC environments with support for all cluster schedulers, and on cloud providers such as Amazon Web Services (AWS), Google Cloud (GCP), and Microsoft Azure.
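To make "idempotent processing restarts" and "safe transactional steps" concrete, here is a minimal Python sketch of the general pattern. It is not bcbio-nextgen's own code: the names `transactional_output` and `run_step` are hypothetical and chosen only for illustration. The idea is that a step writes to a temporary file and moves it into place only on success, and that a step whose output already exists is skipped when the pipeline is rerun.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def transactional_output(final_path):
    """Yield a temporary path; move it to final_path only if the block succeeds.

    Hypothetical helper for illustration, not part of any library.
    """
    out_dir = os.path.dirname(os.path.abspath(final_path))
    # Create the temporary file next to the final output so the rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    os.close(fd)
    try:
        yield tmp_path
        # Reached only when the block did not raise.
        os.replace(tmp_path, final_path)
    finally:
        # After a failure the partial temporary file is still present; remove it.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)


def run_step(final_path, produce):
    """Run produce(tmp_path) unless final_path already exists.

    Returns True if the step ran, False if it was skipped.
    """
    if os.path.exists(final_path):
        return False  # finished in an earlier run, so a restart skips it
    with transactional_output(final_path) as tmp_path:
        produce(tmp_path)
    return True
```

Under these assumptions, a step that crashes partway leaves no file at the final path, so rerunning the pipeline repeats only the unfinished steps and never mistakes a partial file for a finished result.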