Essential Open Source Software for Science

CZI’s Essential Open Source Software for Science program supports software maintenance, growth, development, and community engagement for open source tools critical to science.

Showing 72 results

A Modular Suite of Advanced Bioimaging Tools with scikit-image and Dash

To bring the combined power of scikit-image and Dash to a larger number of scientists thanks to increased execution speed, interactive image annotation and processing, and outstanding documentation targeting life sciences practitioners.

Projects scikit-image, Dash
Lead

Emmanuelle Gouillart (Plotly Technologies, Inc.)

Funding Cycle 1
A Solid Foundation for Statistics in Python with SciPy

The project will improve the SciPy library's statistics functionality to better serve biomedical research and downstream projects. In addition, an outreach component will engage female students, inspiring them to participate in open source code development.

Project SciPy
Leads

Warren Weckesser (University of California, Berkeley; NumFOCUS)

Matt Haberland (California Polytechnic State University, NumFOCUS)

Funding Cycle 1
Advancing Microbiome Research Through QIIME 2 Community Development

To support the QIIME 2 user and developer communities by enabling sharing of automatically tested third-party content on the QIIME 2 Library, and hosting our first-ever co-convened user and developer workshop and networking event.

Project QIIME 2
Lead

Greg Caporaso (Northern Arizona University)

Funding Cycle 1
Apache Arrow Apprenticeship Program for OSS Maintenance and Community

To support the growth, sustainability, and diversity of the Apache Arrow project by expanding an apprenticeship program, which recruits developers from underrepresented groups and trains them to be open source software maintainers.

Project Apache Arrow
Lead

Wes McKinney (Ursa Labs)

Funding Cycle 3
Automated Optimal Model Calibration for the OpenSim Biomechanics Simulator

To develop an automated tool that uses optimization to calibrate models of the musculoskeletal system and improve simulation results, and disseminate the tool to the OpenSim community with documentation, examples, case studies, and outreach events.

Project OpenSim
Lead

Thomas Uchida (University of Ottawa)

Funding Cycle 3
Bioconductor Build System: Continuous Integration and Developer Feedback

To reengineer the Bioconductor build system for nightly continuous integration, production, and distribution of tarballs and binaries for over 1,700 user-contributed software packages.

Project Bioconductor Build System
Lead

Vincent Carey (Brigham and Women's Hospital)

Funding Cycle 1
Bridging the Gap In Medical Image Analysis and Biomechanics with ITK-SNAP

This grant supports implementation of biomechanical analysis features in ITK-SNAP, an open source application for medical image segmentation, with the goal of streamlining image processing, anatomical modeling, and tissue mechanics analysis from clinical image data.

Project ITK-SNAP
Lead

Alison Pouch (University of Pennsylvania)

Funding Cycle 2
Comprehensive, Scalable, and Collaborative Single-Cell Analysis with Seurat

To develop extensive functionality, expand user support, and initiate new modes of community outreach for Seurat, an open-source R toolkit for integrative single-cell analysis.

Project Seurat
Lead

Rahul Satija (New York Genome Center)

Funding Cycle 1
Computational Biology Software Maintenance Framework

To reorganize the libSBML and Deviser code bases for better community involvement, spin out part of libSBML as a reusable component for Deviser and other projects, and establish protocols for long-term sustainability of these important resources.

Projects libSBML, Deviser
Lead

Sarah Keating (University College London)

Funding Cycle 2
Continuous Improvement to Essential High-Throughput Bio-Sequence Aligners

To maintain BWA and improve the performance and robustness of BWA and its next major version BWA-MEM2.

Project BWA
Lead

Heng Li (Dana-Farber Cancer Institute)

Funding Cycle 1
Cytoscape Explore for Biological Networks Brings Cytoscape to the Cloud

To build Cytoscape Explore, a web-based biological network viewer and editor that will make key aspects of the widely used Cytoscape application accessible to new audiences as part of its evolution from a desktop application to a cloud ecosystem.

Project Cytoscape
Lead

Dexter Pratt (University of California, San Diego)

Funding Cycle 1
DeepLabCut: An Open Source Toolbox for Robust Animal Pose Estimation

To support code maintenance, a new code cookbook, and user education for the DeepLabCut software community and set the foundation towards becoming a sustainable software package for years to come.

Project DeepLabCut
Lead

Mackenzie Mathis (Harvard University & Swiss Federal Institute of Technology Lausanne)

Funding Cycle 3
DeepLabCut: An Open Source Toolbox for Robust Animal Pose Estimation

To support maintenance, new extensions, and user education for the DeepLabCut software community.

Project DeepLabCut
Lead

Mackenzie Mathis (Harvard University & Swiss Federal Institute of Technology Lausanne (2020))

Funding Cycle 1
dynverse: A Toolkit for Studying Cell Development with Single-Cell Omics

Single-cell biology is the application of technologies that enable multi-omics investigation at the level of a single cell. This project will streamline trajectory inference from single-cell omics data by improving integration with upstream and downstream analysis pipelines.

Project dynverse
Lead

Yvan Saeys (Vlaams Instituut voor Biotechnologie)

Funding Cycle 2
Enabling Biomedical Science with Common Workflow Language

To enable portability of complex biomedical workflows across different clouds and on-premise environments via better documentation, community support, and tooling for Common Workflow Language (CWL) with examples using Arvados and from the Personal Genome Project.

Project Common Workflow Language (CWL)
Lead

Sarah Wait Zaranek (Curii Corporation)

Funding Cycle 2
Enabling Differential Analyses of Genomic Data with limma, edgeR and Glimma

To improve ease of use and interoperability of these packages, make methodological responses to new data challenges, refresh the documentation and structure of these packages, and prepare training materials.

Projects limma, edgeR, Glimma
Lead

Gordon Smyth (Walter and Eliza Hall Institute of Medical Research)

Funding Cycle 1
Enhancing the Open Health Imaging Foundation Web Medical Imaging Framework

To develop training materials, perform software maintenance, expand outreach, and provide community support for the Open Health Imaging Foundation (OHIF) web-based medical imaging framework including its underlying libraries (e.g., Cornerstone).

Projects Open Health Imaging Foundation (OHIF) Viewer, Cornerstone
Lead

Gordon Harris (Massachusetts General Hospital)

Funding Cycle 3
Enhancing the Performance, Documentation, and Data Ecosystem for bedtools

To enhance bedtools’ functionality, documentation, and access to data, which will empower and expand the user community.

Projects bedtools, Go Get Data (GGD)
Lead

Aaron Quinlan (University of Utah)

Funding Cycle 1
Enhancing Usability of mixtools and tolerance for the Biomedical Community

To provide significant modernization and enhanced usability of the R packages mixtools and tolerance for improved utilization and accessibility within the biomedical and health research communities.

Projects mixtools, tolerance
Lead

Derek Young (University of Kentucky Research Foundation)

Funding Cycle 3
Ensuring Reproducible Transcriptomic Analysis with DESeq2 and tximeta

To extend DESeq2 functions to develop interfaces with Bioconductor’s rich experiment and annotation data, including single-cell datasets and genomic annotations, all leveraging tximeta’s metadata functionality for computational reproducibility.

Project DESeq2
Lead

Michael Love (The University of North Carolina at Chapel Hill)

Funding Cycle 3
Ensuring the Continued Growth of pandas

To support continued maintenance and development of pandas, an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

Project Pandas
Lead

Tom Augspurger (NumFOCUS)

Funding Cycle 1
ETE Toolkit: Phylogenomic Data Analysis and Visualization

To support the release and maintenance of a new version of the ETE toolkit including updated documentation and new features such as tree diff, tree-like regular expression searches, and large tree visualization.

Project ETE Toolkit
Lead

Jaime Huerta Cepas (Centro de Biotecnología y Genómica de Plantas)

Funding Cycle 2
Expand Interoperability of Hosted Scientific Documentation

To make scientific Python documentation more valuable by improving the user experience of linking between projects, and promote this ability within the scientific Python community.

Project Read the Docs
Lead

Eric Holscher (Read the Docs, Inc)

Funding Cycle 3
Expanding the Open mHealth Platform to Support Digital Biomarker Discovery

Open mHealth created an open data standard and community for patient-generated data, and the Digital Biomarker Discovery Pipeline will enable transformation of that data into indicators of health outcomes and evaluation of novel digital biomarkers.

Project Open DBDP
Lead

Jessilyn Dunn (Duke University)

Funding Cycle 2
Extending Galaxy for Large-Scale and Integrative Biomedical Analyses

To extend Galaxy, a web-based computational workbench used by thousands of scientists across the world, so that it can analyze large datasets and connect with other analysis tools.

Project Galaxy
Lead

Jeremy Goecks (Oregon Health & Science University)

Funding Cycle 3
GPU Acceleration, Rapid Releases, and Biomedical Examples for scikit-image

To maintain the popular scikit-image Python library for microscopy and medical imaging data and bring significant improvements via development of a backend system enabling multi-threading and GPU acceleration, an improved release process for more rapid cycles, and expanded biomedical examples.

Project scikit-image
Lead

Gregory Lee (Quansight)

Funding Cycle 3
HTSJDK: Enhancing the Java Toolkit for Emerging Sequencing Technologies

To enhance the HTSJDK Java toolkit for genomics with an extensible plugin framework that will enable support for emerging technologies required by contemporary analysis methods, such as long reads, graph/circular references, and epigenetic modifications.

Project HTSJDK
Lead

Eric Banks (Broad Institute of MIT and Harvard)

Funding Cycle 2
ilastik and Scientific Python Ecosystem: Deep Integration with Other Tools

To integrate ilastik with napari and Dask, replacing the outdated internal viewer and task scheduler by modern, community-supported alternatives with the aim to reduce technical debt, engage with the community, and deliver a superior user experience for the bioimage analysis community.

Project ilastik
Lead

Anna Kreshuk (European Molecular Biology Laboratory)

Funding Cycle 3
Improving Bokeh Figure Publication: SVG, LaTeX, and Maintenance

To improve Bokeh in key areas that are relevant to bioscience research and to secure a solid foundation for long-term project health and sustainability by engaging in important maintenance and fostering new contributors.

Project Bokeh
Lead

Bryan Van de Ven (Nvidia)

Funding Cycle 3
Improving Usability and Sustainability for NumPy and OpenBLAS

To improve the robustness and usability of NumPy by continuing to work on documentation and community building, modernizing its integration with Fortran tools via numpy.f2py, and ensuring the sustainability of both NumPy and OpenBLAS.

Projects NumPy, OpenBLAS
Lead

Melissa Mendonça (Quansight)

Funding Cycle 3
Improving Usability of Core Neuroscience Analysis Tools with MNE-Python

To enhance usability of MNE-Python through improvements to its computational efficiency, API, interactive visualization capabilities, and the clarity and consistency of documentation.

Project MNE-Python
Lead

Daniel McCloy (University of Washington)

Funding Cycle 2
Improving User Experience and Debuggability of pip For All Python Users

To complete the design, implementation, and rollout of pip's next-generation dependency resolver, and permanently improve pip's maintainer capacity and user experience.

Project pip
Lead

Ernest W. Durbin III (Python Software Foundation)

Funding Cycle 1
Improving User Experience and Engagement for UCSC Xena

To improve the user experience of UCSC Xena and better engage users by implementing the redesign of two core features using UX principles, standardizing training materials, and publishing a blog highlighting research use cases.

Project UCSC Xena
Lead

Jingchun Zhu (University of California, Santa Cruz)

Funding Cycle 2
IQ-TREE for Ultra-Large Genomic Data

To develop an open standard and API for phylogenetic models and improve the speed and scalability of the IQ-TREE software for phylogenetic inference from ultra-large genomic data.

Project IQ-TREE
Lead

Minh Bui (Australian National University)

Funding Cycle 1
JupyterHub Contributor in Residence Program

To improve community support and technical maintenance across the JupyterHub repositories.

Project Project Jupyter (JupyterHub, The Binder Project)
Lead

Chris Holdgraf (University of California, Berkeley; NumFOCUS)

Funding Cycle 1
MACS3: A Versatile Peak Caller for Gene Regulation Studies

To enhance the infrastructure to support the continuous development and growing community of the popular algorithm MACS for gene regulation studies, in order to expand its features and adapt to new technologies such as single-cell ATAC-seq.

Project MACS
Lead

Tao Liu (Roswell Park Alliance Foundation)

Funding Cycle 2
Maintaining Rocker: Sustainability for Containerized Reproducible Analyses

To put Rocker, the de facto standard for reproducible, containerized R analyses, on a path to sustainable maintenance through refactoring, improving the quality of documentation, expanding the community, and targeting new hardware platforms.

Project The Rocker Project
Lead

Carl Boettiger (University of California, Berkeley)

Funding Cycle 1
Maintenance and Improvement of Validated, Community Developed NGS Analyses

To improve the bcbio-nextgen toolkit, focusing on maintaining existing variant calling functionality and extending support for structural and RNA-seq variant analyses.

Project bcbio-nextgen
Lead

Rory Kirchner (Harvard Chan School of Public Health)

Funding Cycle 1
Matplotlib: Foundation of Scientific Visualization in Python

To enable Matplotlib to continue as the core plotting library of the scientific Python ecosystem for researchers in biomedical imaging, microscopy, and genomics by addressing the maintenance backlog and beginning Matplotlib's evolution to meet the community’s visualization challenges for the next decade.

Project Matplotlib
Lead

Thomas A. Caswell (Brookhaven National Laboratory, NumFOCUS)

Funding Cycle 3
Matplotlib: Foundation of Scientific Visualization in Python

To enable Matplotlib to continue as the core plotting library of the scientific Python ecosystem by addressing the maintenance backlog and planning Matplotlib's evolution to meet the community’s visualization challenges for the next decade.

Project Matplotlib
Lead

Thomas A. Caswell (Brookhaven National Laboratory, NumFOCUS)

Funding Cycle 1
MicroManager 2.0: An Open Platform for Microscopy Image Acquisition

To support the open source µManager optical microscopy acquisition platform to improve its architecture, infrastructure, and support to ensure many years of growth, both in user base and capabilities.

Projects MicroManager, ImageJ
Lead

Kevin Eliceiri (University of Wisconsin, Madison)

Funding Cycle 1
Migrating Protege to a Modern Web Stack

To construct a solid foundation for the next generation of Protege using a modern web stack that will make Protege easier to maintain, extend, and — crucially — make it easier for third parties to contribute to the code base.

Project Protege
Lead

Mark Musen (Stanford University)

Funding Cycle 1
MSstats and Cardinal: Next Generation Statistical Mass Spectrometry in R

To provide open-source, interoperable, and extensible statistical software for quantitative mass spectrometry, which enables experimentalists and developers of statistical methods to rapidly respond to changes in the evolving biotechnological landscape.

Projects MSstats, Cardinal
Lead

Olga Vitek (Northeastern University)

Funding Cycle 1
Next Generation File Formats for BioImaging

To support the Bio-Formats user community and develop new formats to make proprietary file formats obsolete.

Project Open Microscopy Environment
Lead

Jason Swedlow (University of Dundee)

Funding Cycle 1
Next Generation Mass Spectrometry with OpenMS

To enable the analysis of thousands of next generation data-independent acquisition (DIA) mass spectrometry measurements by implementing algorithms, visualization tools, and cloud containers based on OpenMS and the OpenSWATH algorithm.

Project OpenMS
Lead

Hannes Rost (University of Toronto)

Funding Cycle 1
Nextflow and nf-core: Reproducible Workflows for the Scientific Community

To support fast-growing, community-building software for infrastructure-agnostic, open source biomedical pipelines.

Projects Nextflow, nf-core
Lead

Ellen Sherwood (KTH Royal Institute of Technology)

Funding Cycle 2
Open Source Image Registration: The Elastix Toolbox

This project will improve accessibility, interoperability, efficiency, and sustainability of the biomedical image registration software Elastix by providing complete support of Python, better integration with other software, code improvements, and a focus on community.

Project Elastix
Lead

Marius Staring (Leiden University Medical Center)

Funding Cycle 2
Open Source Software for Bulk and Single-Cell RNA-seq

To support development, outreach, and user support for the kallisto RNA-seq and single-cell RNA-seq software project.

Project kallisto
Lead

Lior Pachter (California Institute of Technology)

Funding Cycle 2
OpenCRAVAT Community Building for Integrated Variant Annotation Framework

To create a sustainable open-source community resource for deep annotation of genetic variants.

Project OpenCRAVAT
Lead

Rachel Karchin (Johns Hopkins University)

Funding Cycle 2
OpenMM: Key Infrastructure for Biomolecular Modeling and Simulation

This team will support the continued development of OpenMM to better serve its broad biomolecular modeling community, as well as support its extension to integrate machine learning that will enable genomic-scale biomolecular modeling, simulation, and prediction.

Project OpenMM: A high performance toolkit for molecular simulation
Lead

Thomas Markland (Stanford University)

Funding Cycle 2
OpenSim: An Open Source Biomechanics Simulator to Study Movement

To improve the usability, computational performance, maintenance and outreach of the open source software OpenSim and to support the education and training of its users around the world.

Project OpenSim
Lead

Ajay Seth (Delft University of Technology)

Funding Cycle 2
Outreach and Software Development for the Apollo Genome Annotation Editor

To hire an outreach coordinator and (part of) a software developer for the Apollo genome annotation editor, provide developer- and user-oriented workshops and training, develop a plugin framework, and integrate protein visualizations.

Projects Apollo, JBrowse
Lead

Ian Holmes (University of California, Berkeley)

Funding Cycle 1
Providing a Solid Foundation for Network Analysis

This project will solidify the use and future development of the cross-language igraph package to support network analysis in all scientific domains, including biomedical and life sciences.

Project igraph
Lead

Vincent Traag (Leiden University)

Funding Cycle 2
PsychoPy3: Essential Open Source Software for Neuroscience and Psychology

To create a smoother experience for PsychoPy users by detecting potential problems in user-created experiments before they are launched, and by increasing the test coverage within the application code itself.

Projects PsychoPy, PsychoJS
Lead

Jonathan Peirce (University of Nottingham)

Funding Cycle 1
QuPath: Open Source Bioimage Analysis and Quantitative Pathology

To accelerate biomedical research, biomarker discovery, and the translation of artificial intelligence into clinical practice by enhancing the QuPath open source platform and by integrating it with other CZI-funded software.

Project QuPath
Lead

Peter Bankhead (University of Edinburgh)

Funding Cycle 1
Real-Time Collaboration in Jupyter

This team will build a real-time data model for Jupyter to lay the foundation for real-time collaboration on notebooks.

Project JupyterLab
Lead

Saul Shanabrook (Quansight)

Funding Cycle 2
Rebuilding the Community Behind VisPy's Fast, Interactive Visualizations

To attract new users and contributors to the VisPy project through software improvements, community outreach, and instructional materials.

Project VisPy
Lead

David Hoese (University of Wisconsin - Madison)

Funding Cycle 2
Reproducibility in Bioinformatics by Sustaining Bioconda Development

To establish teaching material, improve documentation, and minimize maintenance effort of the Bioconda project by extending automation of code review, testing, and building.

Project Bioconda
Lead

Johannes Köster (University of Duisburg-Essen, Bioconda Core Team)

Funding Cycle 1
Revitalizing NetworkX for Complex Network Analysis

To meet the needs of the scientific community over the next decade, this team will revitalize NetworkX — the fundamental network analysis tool in Python — by growing its developer community, refactoring code, improving performance, and making a major release.

Project NetworkX
Lead

Stefan van der Walt (University of California, Berkeley)

Funding Cycle 2
Salmon: Improving RNA-seq Quantification & Building an Inclusive Community

To advance support and development of the open source Salmon and Alevin software for gene expression quantification of single-cell and bulk RNA-seq.

Project Salmon
Lead

Carl Kingsford (Ocean Genomics)

Funding Cycle 3
Scalable Storage of Tensor Data for Scientific Computing

To establish Zarr as a foundation for scientific data storage, with clear data format and protocol specifications, implementations in multiple programming languages, and a community process for evolving to support new scientific applications.

Project Zarr
Lead

Ryan Williams (Mount Sinai School of Medicine)

Funding Cycle 1
Scaling OpenRefine

To attract new contributors by improving OpenRefine's documentation, and implement a new data model to improve the scalability, transparency, and reproducibility of OpenRefine workflows.

Project OpenRefine
Lead

Antonin Delpeuch (Code for Science and Society)

Funding Cycle 1
Scaling Python with Dask

To provide dedicated community support and maintenance for the Dask project and support growth in the biological sciences field.

Project Dask
Lead

Matthew Rocklin (NumFOCUS)

Funding Cycle 2
Scanpy 2.0

To expand Scanpy’s core infrastructure and community platforms for stability, versatility, and sharing knowledge.

Project Scanpy
Lead

Fabian Theis (Helmholtz Zentrum Munich)

Funding Cycle 1
Scikit-learn Maintenance and Enhancement for Gradient Boosting

To improve the open-source machine learning library scikit-learn and aid in maintaining the project, while considering the new implementation of gradient boosting.

Project scikit-learn
Lead

Andreas Mueller (Columbia University)

Funding Cycle 1
Strengthening Community and Code Foundations for Brain Imaging

To strengthen the social and code foundations of the Nibabel library by extending the API and input/output to better support metadata, supporting outputs from image registration, and through educational outreach.

Project Nibabel
Lead

Matthew Brett (NumFOCUS)

Funding Cycle 3
Strengthening NumPy’s Foundations: Growing Beyond Code

To grow the maturity of the NumPy project through governance, documentation, and website work, by improving the robustness of its links with OpenBLAS, and by diversifying the core team beyond the developer role.

Projects NumPy, OpenBLAS
Lead

Ralf Gommers (Quansight, NumFOCUS)

Funding Cycle 1
Supporting Next Generation Single-Cell Genomics Experiments with Monocle

To improve Monocle with algorithms, statistical methods, and web-based visualization tools that will enable biologists using single-cell genomics to extract and disseminate new insights from their experiments.

Project Monocle
Lead

Cole Trapnell (University of Washington)

Funding Cycle 3
The GATK Methods for Bacterial Variant Discovery and Evaluation

To develop GATK methods for variant discovery and evaluation in bacteria, resolve inconsistent and diverse results from different research groups, and allow for the sharing of data and analysis globally to control bacterial transmission and antibiotic resistance.

Project GATK
Lead

Bhanu Gandham (Broad Institute of MIT and Harvard)

Funding Cycle 1
The Health of Pandas

To ensure the continued health of the pandas library by dedicating resources specifically to maintenance and implementing consistent missing data handling for all data types.

Project Pandas
Lead

Tom Augspurger (NumFOCUS)

Funding Cycle 3
The Percolator Analysis Engine for Tandem Mass Spectrometry Data

To improve Percolator, the dominant software for analyzing spectrum identifications produced by protein tandem mass spectrometry, by making the software faster, more robust, and applicable to more types of mass spectrometry data.

Project Percolator
Lead

William Noble (University of Washington)

Funding Cycle 2
Xarray: Multidimensional Labeled Arrays and Datasets in Python

To grow the use of Xarray in the biosciences as a foundational data model and computational toolkit for multidimensional labeled arrays.

Project Xarray
Lead

Joseph Hamman (NumFOCUS)

Funding Cycle 2
