Essential Open Source Software for Science

CZI’s Essential Open Source Software for Science program supports software maintenance, growth, development, and community engagement for open source tools critical to science.

A Modular Suite of Advanced Bioimaging Tools with scikit-image and Dash

To bring the combined power of scikit-image and Dash to a larger number of scientists thanks to increased execution speed, interactive image annotation and processing, and outstanding documentation targeting life sciences practitioners.

Projects scikit-image, Dash

Emmanuelle Gouillart (Plotly Technologies, Inc.)

A Solid Foundation for Statistics in Python with SciPy

The project will improve the SciPy library's statistics functionality to better serve biomedical research and downstream projects. In addition, an outreach component will engage female students, inspiring them to participate in open source code development.

Project SciPy

Warren Weckesser (University of California, Berkeley; NumFOCUS)

Matt Haberland (California Polytechnic State University, NumFOCUS)

Advancing Microbiome Research Through QIIME 2 Community Development

To support the QIIME 2 user and developer communities by enabling sharing of automatically tested third-party content on the QIIME 2 Library, and hosting our first-ever co-convened user and developer workshop and networking event.

Project QIIME 2

Greg Caporaso (Northern Arizona University)

Bioconductor Build System: Continuous Integration and Developer Feedback

To reengineer the Bioconductor build system for nightly continuous integration, production, and distribution of tarballs and binaries for over 1,700 user-contributed software packages.

Project Bioconductor Build System

Vincent Carey (Brigham and Women's Hospital)

Comprehensive, Scalable, and Collaborative Single-Cell Analysis with Seurat

To develop extensive functionality, expand user support, and initiate new modes of community outreach for Seurat, an open-source R toolkit for integrative single-cell analysis.

Project Seurat

Rahul Satija (New York Genome Center)

Continuous Improvement to Essential High-Throughput Bio-Sequence Aligners

To maintain BWA and improve the performance and robustness of BWA and its next major version BWA-MEM2.

Project BWA

Heng Li (Dana-Farber Cancer Institute)

Cytoscape Explore for Biological Networks Brings Cytoscape to the Cloud

To build Cytoscape Explore, a web-based biological network viewer and editor that will make key aspects of the widely used Cytoscape application accessible to new audiences as part of its evolution from a desktop application to a cloud ecosystem.

Project Cytoscape

Dexter Pratt (University of California, San Diego)

DeepLabCut: An Open Source Toolbox for Robust Animal Pose Estimation

To support maintenance, new extensions, and user education for the DeepLabCut software and its community.

Project DeepLabCut

Mackenzie Mathis (Harvard University & Swiss Federal Institute of Technology Lausanne (2020))

Enabling Differential Analyses of Genomic Data with limma, edgeR and Glimma

To improve ease of use and interoperability of these packages, make methodological responses to new data challenges, refresh the documentation and structure of these packages, and prepare training materials.

Projects limma, edgeR, Glimma

Gordon Smyth (Walter and Eliza Hall Institute of Medical Research)

Enhancing the Performance, Documentation, and Data Ecosystem for bedtools

To enhance bedtools’ functionality, documentation, and access to data, which will empower and expand the user community.

Projects bedtools, Go Get Data (GGD)

Aaron Quinlan (University of Utah)

Ensuring the Continued Growth of pandas

To support continued maintenance and development of pandas, an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

Project pandas

Tom Augspurger (NumFOCUS)

Improving User Experience and Debuggability of pip For All Python Users

To complete the design, implementation, and rollout of pip's next-generation dependency resolver, and permanently improve pip's maintainer capacity and user experience.

Project pip

Ernest W. Durbin III (Python Software Foundation)

IQ-TREE for Ultra-Large Genomic Data

To develop an open standard and API for phylogenetic models and improve the speed and scalability of the IQ-TREE software for phylogenetic inference from ultra-large genomic data.

Project IQ-TREE

Minh Bui (Australian National University)

JupyterHub Contributor in Residence Program

To improve community support and technical maintenance across the JupyterHub repositories.

Project Project Jupyter (JupyterHub, The Binder Project)

Chris Holdgraf (University of California, Berkeley; NumFOCUS)

Maintaining Rocker: Sustainability for Containerized Reproducible Analyses

To put Rocker, the de facto standard for reproducible, containerized R analyses, on a path to sustainable maintenance through refactoring, improving the quality of documentation, expanding the community, and targeting new hardware platforms.

Project The Rocker Project

Carl Boettiger (University of California, Berkeley)

Maintenance and Improvement of Validated, Community Developed NGS Analyses

To improve the bcbio-nextgen toolkit, focusing on maintaining existing variant calling functionality and extending support for structural and RNA-seq variant analyses.

Project bcbio-nextgen

Rory Kirchner (Harvard Chan School of Public Health)

Matplotlib: Foundation of Scientific Visualization in Python

To enable Matplotlib to continue as the core plotting library of the scientific Python ecosystem by addressing the maintenance backlog and planning Matplotlib's evolution to meet the community’s visualization challenges for the next decade.

Project Matplotlib

Thomas A. Caswell (Brookhaven National Laboratory, NumFOCUS)

MicroManager 2.0: An Open Platform for Microscopy Image Acquisition

To support the open source µManager optical microscopy acquisition platform by improving its architecture, infrastructure, and user support, ensuring many years of growth in both user base and capabilities.

Projects MicroManager, ImageJ

Kevin Eliceiri (University of Wisconsin, Madison)

Migrating Protege to a Modern Web Stack

To construct a solid foundation for the next generation of Protege using a modern web stack that will make Protege easier to maintain and extend and, crucially, easier for third parties to contribute to.

Project Protege

Mark Musen (Stanford University)

MSstats and Cardinal: Next Generation Statistical Mass Spectrometry in R

To provide open-source, interoperable, and extensible statistical software for quantitative mass spectrometry, which enables experimentalists and developers of statistical methods to rapidly respond to changes in the evolving biotechnological landscape.

Projects MSstats, Cardinal

Olga Vitek (Northeastern University)

Next Generation File Formats for BioImaging

To support the Bio-Formats user community and develop new formats to make proprietary file formats obsolete.

Project Open Microscopy Environment

Jason Swedlow (University of Dundee)

Next Generation Mass Spectrometry with OpenMS

To enable the analysis of thousands of next generation data-independent acquisition (DIA) mass spectrometry measurements by implementing algorithms, visualization tools, and cloud containers based on OpenMS and the OpenSWATH algorithm.

Project OpenMS

Hannes Rost (University of Toronto)

Outreach and Software Development for the Apollo Genome Annotation Editor

To hire an outreach coordinator and fund part of a software developer position for the Apollo genome annotation editor, provide developer- and user-oriented workshops and training, develop a plugin framework, and integrate protein visualizations.

Projects Apollo, JBrowse

Ian Holmes (University of California, Berkeley)

PsychoPy3: Essential Open Source Software for Neuroscience and Psychology

To create a smoother experience for PsychoPy users by detecting potential problems in user-created experiments before they are launched, and by increasing the test coverage within the application code itself.

Projects PsychoPy, PsychoJS

Jonathan Peirce (University of Nottingham)

QuPath: Open Source Bioimage Analysis and Quantitative Pathology

To accelerate biomedical research, biomarker discovery, and the translation of artificial intelligence into clinical practice by enhancing the QuPath open source platform and by integrating it with other CZI-funded software.

Project QuPath

Peter Bankhead (University of Edinburgh)

Reproducibility in Bioinformatics by Sustaining Bioconda Development

To establish teaching material, improve documentation, and minimize maintenance effort of the Bioconda project by extending automation of code review, testing, and building.

Project Bioconda

Johannes Köster (University of Duisburg-Essen, Bioconda Core Team)

Scalable Storage of Tensor Data for Scientific Computing

To establish Zarr as a foundation for scientific data storage, with clear data format and protocol specifications, implementations in multiple programming languages, and a community process for evolving to support new scientific applications.

Project Zarr

Ryan Williams (Mount Sinai School of Medicine)

Scaling OpenRefine

To attract new contributors by improving OpenRefine's documentation, and implement a new data model to improve the scalability, transparency, and reproducibility of OpenRefine workflows.

Project OpenRefine

Antonin Delpeuch (Code for Science and Society)

Scanpy 2.0

To expand Scanpy’s core infrastructure and community platforms for stability, versatility, and sharing knowledge.

Project Scanpy

Fabian Theis (Helmholtz Zentrum Munich)

Scikit-learn Maintenance and Enhancement for Gradient Boosting

To improve the open-source machine learning library scikit-learn and aid in maintaining the project, while considering the new implementation of gradient boosting.

Project scikit-learn

Andreas Mueller (Columbia University)

Strengthening NumPy’s Foundations: Growing Beyond Code

To grow the maturity of the NumPy project through governance, documentation, and website work, by improving the robustness of its links with OpenBLAS, and by diversifying the core team beyond the developer role.

Projects NumPy, OpenBLAS

Ralf Gommers (Quansight, NumFOCUS)

The GATK Methods for Bacterial Variant Discovery and Evaluation

To develop GATK methods for variant discovery and evaluation in bacteria, resolve inconsistent and diverse results from different research groups, and allow for the sharing of data and analysis globally to control bacterial transmission and antibiotic resistance.

Project GATK

Bhanu Gandham (Broad Institute of MIT and Harvard)
