Open Science Grants
CZI’s Open Science program aims to advance the universal and immediate open sharing of all scientific knowledge, processes, and outputs. To this end, we invest in tools, platforms, and organizations that help expand participation in and access to the scientific process by making it open and reproducible, and by helping scientists build on each other’s work. Browse all our grantees below.
2i2c makes interactive computing more accessible and useful for research and education by managing interactive computing infrastructure.
To empower the biomedical research community in Latin America by localizing 3D Slicer to Spanish and Portuguese, improving tutorial localization infrastructure, and holding outreach events.
To increase the accessibility of the 3D Slicer open source platform for biomedical research to clinicians and scientists in non-English speaking countries.
To bring the combined power of scikit-image and Dash to a larger number of scientists thanks to increased execution speed, interactive image annotation and processing, and outstanding documentation targeting life sciences practitioners.
The project will improve the SciPy library's statistics functionality to better serve biomedical research and downstream projects. In addition, an outreach component will engage female students, inspiring them to participate in open source code development.
To make UpSet plots accessible to low-vision and blind users, and to simplify authoring UpSet plots.
To address new challenges posed by replicated single-cell RNA-seq data and by mass spectrometry proteomics.
To support the onboarding, inclusion, and retention of people from historically marginalized groups on scientific Python projects and structurally improve the community dynamics of NumPy, SciPy, Matplotlib, and pandas.
To support the QIIME 2 user and developer communities by enabling sharing of automatically tested third-party content on the QIIME 2 Library, and hosting our first-ever co-convened user and developer workshop and networking event.
To support the growth, sustainability, and diversity of the Apache Arrow project by expanding an apprenticeship program, which recruits developers from underrepresented groups and trains them to be open source software maintainers.
Promoting preprints and open, rapid dissemination in the life sciences
To develop an automated tool that uses optimization to calibrate models of the musculoskeletal system and improve simulation results, and disseminate the tool to the OpenSim community with documentation, examples, case studies, and outreach events.
To develop key infrastructure updates and collaboration resources for state-of-the-art Bayesian modeling software libraries.
To provide ongoing maintenance and community support for the bcbio-nextgen toolkit, focusing on existing variant calling functionality and improving the epigenomic pipelines.
To reengineer the Bioconductor build system for nightly continuous integration, production, and distribution of tarballs and binaries for over 1,700 user-contributed software packages.
To provide Bioconductor training globally by redeveloping the website and developing infrastructure to deliver high quality community-led training in local languages.
To increase participation of underrepresented groups in genome data science research through alliances with organizations advancing diversity in science, increased mentoring activities for developers, and enhanced governance of Bioconductor.
Preprint servers allow scientists to rapidly share results and manuscripts before they are peer reviewed.
The Black in AI Academic Program is a resource supporting Black junior researchers as they apply to graduate programs, navigate graduate school, and enter the postgraduate job market.
To extend the open source Bokeh library to cover streaming gridded visualizations for bioscience applications that currently require expensive proprietary tools.
This grant supports implementation of biomechanical analysis features in ITK-SNAP, an open source application for medical image segmentation, with the goal of streamlining image processing, anatomical modeling, and tissue mechanics analysis from clinical image data.
To enhance MNE-Python for clinical neuroscience uses by improving spectral and spectro-temporal data handling, and by providing standardized preprocessing pipelines for data.
To leverage regional research interests and expertise to build computational capacity across Latin America, supporting a variety of activities from workshops to knowledge exchange meetings.
To develop tools, training materials, and mentorship opportunities to help women and nonbinary people in network science to use and contribute code to the igraph open source network analysis library.
To enhance metagenomic analysis by integrating cogent3, GraphBin and IQ-TREE to support innovative genomic technologies for monitoring the impact of viral and bacterial diversity on human health.
To build an open source solution to power commenting, review, assessment, and discussion initiatives and provide a platform for the evolution of peer review.
To develop extensive functionality, expand user support, and facilitate interoperability for Seurat, an open source R toolkit for integrative single-cell analysis.
To develop extensive functionality, expand user support, and initiate new modes of community outreach for Seurat, an open-source R toolkit for integrative single-cell analysis.
To reorganize the libSBML and Deviser code bases for better community involvement, spin out part of libSBML as a reusable component for Deviser and other projects, and establish protocols for long-term sustainability of these important resources.
To develop statistically robust, computationally efficient, and maximally compatible open source software for the design and analyses of multiplexed single-cell sequencing experiments.
To effectively support community building efforts across open source ecosystems in molecular sciences, improve contributor pipelines, and seek synergies and collaboration opportunities in this space.
To maintain BWA and improve the performance and robustness of BWA and its next major version BWA-MEM2.
To build Cytoscape Explore, a web-based biological network viewer and editor that will make key aspects of the widely used Cytoscape application accessible to new audiences as part of its evolution from a desktop application to a cloud ecosystem.
A global community for people underrepresented in data science
To create value for DataCite members through community-driven, innovative, open, integrated, usable, and sustainable services for research. This grant will enable DataCite to make great strides in their systems, processes, and advocacy work supporting researchers’ ability to get credit and recognition for their open research output, as well as further their efforts to build out their international capacity and engagement program that provides regional support and engagement in less-represented regions such as Latin America, Africa, the Middle East, and Asia.
To accelerate single-cell biology methods research and empower their developers with foundational probabilistic AI software.
To interface recently developed computer vision tools with an open-source biomechanical modeling software, which should facilitate the uptake of markerless motion tracking in biomedicine.
To develop a DeepLabCut AI Residency Program for underrepresented groups in machine learning and computer science in order to recruit, fund, and nurture the next generation of open source leaders.
To provide maintenance, user-focused extensions, education, and support of the growing DeepLabCut software community.
To support code maintenance, a new code cookbook, and user education for the DeepLabCut software community and set the foundation towards becoming a sustainable software package for years to come.
To support the maintenance, new extensions, and education of users of the DeepLabCut software community.
To make preprint review and other editorial processes machine-readable and discoverable.
Single-cell biology is the application of technologies that enable multi-omics investigation at the level of a single cell. This project will streamline trajectory inference from single-cell omics data by improving integration with upstream and downstream analysis pipelines.
To enable portability of complex biomedical workflows across different clouds and on-premise environments via better documentation, community support, and tooling for Common Workflow Language (CWL) with examples using Arvados and from the Personal Genome Project.
To improve ease of use and interoperability of these packages, make methodological responses to new data challenges, refresh the documentation and structure of these packages, and prepare training materials.
To use QIIME 2 as an on-ramp to scientific computing for Native American students by engaging locally with schools primarily serving Native Americans, while expanding the global QIIME 2 user, developer, and educator communities.
To increase diversity in computational mass spectrometry through teaching and mentoring with the open-source framework OpenMS.
To enhance Giotto by implementing a novel data structure and framework for the abstract representation and analysis of emerging datasets from multi-modal and multi-resolution spatial technologies.
To provide a series of GPU accelerated routines for signal processing and interpolation in CuPy to be a foundation for the research community.
To improve Spyder support for connecting to a remote machine to develop, execute, and debug Python code, as well as installing packages, managing environments and interacting with the remote filesystem.
To develop training materials, perform software maintenance, expand outreach, and provide community support for the Open Health Imaging Foundation (OHIF) web-based medical imaging framework including its underlying libraries (e.g., Cornerstone).
To make significant improvements to the SciML project, which is leveraged by pharmacologists in academia and industry for simulation of virtual clinical trials, drug design, and systems biology modeling.
To enhance bedtools’ functionality, documentation, and access to data, which will empower and expand the user community.
To provide significant modernization and enhanced usability of the R packages mixtools and tolerance for improved utilization and accessibility within the biomedical and health research communities.
To extend DESeq2 functions to develop interfaces with Bioconductor’s rich experiment and annotation data, including single-cell datasets and genomic annotations, all leveraging tximeta’s metadata functionality for computational reproducibility.
To support continued maintenance and development of pandas, an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
To support the growth and health of pandas, the foundational library for tabular data structures in the Scientific Python Ecosystem, by funding continued maintenance and community building efforts.
To build a repository of infectious disease models in Stan that will provide researchers with in-depth examples, helping them to draw sound conclusions and make better predictions from their data.
To enable interactive analysis and exploration of phylogenetic data at the genomics and metagenomics scale.
To support the release and maintenance of a new version of the ETE toolkit including updated documentation and new features such as tree diff, tree-like regular expression searches, and large tree visualization.
To make scientific Python documentation more valuable by improving the user experience of linking between projects, and promote this ability within the scientific Python community.
To expand on existing web infrastructure by building highly intuitive and responsive data visualization tools for mining ‘shotgun’ metagenomic data.
To expand the capabilities and improve the robustness and maintainability of the salmon software ecosystem, further widening its scope of applicability and improving the user and developer experience.
Open mHealth created an open data standard and community for patient-generated data, and the Digital Biomarker Discovery Pipeline will enable transformation of that data into indicators of health outcomes and evaluation of novel digital biomarkers.
To extend Galaxy, a web-based computational workbench used by thousands of scientists across the world, so that it can analyze large datasets and connect with other analysis tools.
To improve the tooling around the conda ecosystem to better serve the millions of users in biological sciences, data sciences, physics, robotics and other scientific disciplines.
To provide an efficient neuroimage analysis pipeline for the medical imaging communities by consolidating advanced DL-methods into a single user-friendly, maintainable, open source software framework.
To develop an easy-to-use and validated neuroanatomical framework for the multidimensional image viewer napari, bringing state-of-the-art image analysis plugins to neuroscience.
To create a consistent, type-annotated, discoverable, and extensible API for scikit-image and facilitate interoperability in the broader image analysis ecosystem.
To sustain maintenance, improve community support, enhance usability and robustness, and add improvements for the future framework.
To inspire a diverse global community to collaborate in genome medicine, research, and education by packaging CGAP’s portal and cloud infrastructure as self-serve, orchestrated, open source software.
To maintain the popular scikit-image Python library for microscopy and medical imaging data and bring significant improvements via development of a backend system enabling multi-threading and GPU acceleration, an improved release process for more rapid cycles, and expanded biomedical examples.
To enable end-users of ImageJ/Fiji, Icy and napari to process biological imaging time-lapses or large-scale image data tile-by-tile on multiple graphics processing units (GPUs) using CLIJ.
To expand our community so new individuals can meaningfully contribute code, documentation, workflows and other software artifacts by hiring a dedicated software engineer.
To optimize GSVA functionality to analyze single-cell and spatial transcriptomic data sets, increasing its robustness and scalability and improving user interface and documentation.
To build a sustainable computational biology training infrastructure with accredited trainers, access to high quality training materials, and support for regional communities of trainees in low and middle income countries (LMICs) in Africa.
To enhance the HTSJDK Java toolkit for genomics with an extensible plugin framework that will enable support for emerging technologies required by contemporary analysis methods, such as long reads, graph/circular references, and epigenetic modifications.
To integrate ilastik with napari and Dask, replacing the outdated internal viewer and task scheduler with modern, community-supported alternatives, with the aim of reducing technical debt, engaging with the community, and delivering a superior user experience for the bioimage analysis community.
To enable multi-scale interactive machine learning on large datasets in ilastik through full exploitation of state-of-the-art pyramidal file formats and viewers, and extend functionality to other bioimage analysis tools.
To improve Bokeh in key areas that are relevant to bioscience research and to secure a solid foundation for long-term project health and sustainability by engaging in important maintenance and fostering new contributors.
To maintain and improve the three proposed software projects: minimap2, BWA and hifiasm, and extend them to new architectures and new data types.
To improve OpenRefine to empower users without programming experience to publish research datasets along with verifiable and reproducible workflows, and to automate such workflows.
To facilitate the detection and characterization of pathogens in microbiome data, while supporting community development and dissemination of accessible and reproducible bioinformatics applications.
To scale technical and social support for new analyses in Nilearn including the general linear model, giving access to a broad statistical framework for neuroimagers within the open source Python ecosystem.
To improve the flexibility and utility of bedtools for large-scale genomic analyses.
To improve the robustness and usability of NumPy by continuing to work on documentation and community building, modernizing its integration with Fortran tools via numpy.f2py, and ensuring the sustainability of both NumPy and OpenBLAS.
To enhance usability of MNE-Python through improvements to its computational efficiency, API, interactive visualization capabilities, and the clarity and consistency of documentation.
To complete the design, implementation, and rollout of pip's next-generation dependency resolver, and permanently improve pip's maintainer capacity and user experience.
To improve the user experience of UCSC Xena and better engage users by implementing the redesign of two core features using UX principles, standardizing training materials, and publishing a blog highlighting research use cases.
To bring systematically marginalized voices of disabled scientists into scientific computing communities via building and applying accessibility tools, standards, and community contribution practices in the Jupyter ecosystem.
To improve diversity of computational biology and open source development, providing professional industry opportunities for talent underrepresented within the field of genomics.
To build a new class of macromolecular modeling methods to study the interplay of structure, dynamics, cellular assemblies, and disease from the subatomic to nanometer scale.
To integrate the WebProtégé ontology editor with other open source tools that together constitute an ecosystem that is used widely to develop and manage biomedical ontologies.
To support a project that will enable collaboration, scalable analysis, and open workflows for communities in the Global South via community-focused interactive computing hubs in the cloud, paired with training and capacity building.
A non-profit initiative dedicated to helping focus investments in the open technology on which research and scholarship rely.
To develop an open standard and API for phylogenetic models and improve the speed and scalability of the IQ-TREE software for phylogenetic inference from ultra-large genomic data.
To broaden participation in the JupyterHub community by establishing a role dedicated to strategy and stewardship for pathways into and throughout the community, as well as programs that provide onboarding and mentorship for historically underrepresented groups.
To improve community support and technical maintenance across the JupyterHub repositories.
To maintain the established infrastructure and optimize the current features of the popular peak caller MACS for gene regulation studies, while focusing on building the data structure and features for single-cell data analysis.
To enhance the infrastructure to support the continuous development and growing community of the popular algorithm MACS for gene regulation studies, in order to expand its features and adapt to new technologies such as single-cell ATAC-seq.
To put Rocker, the de facto standard for reproducible, containerized R analyses, on a path to sustainable maintenance through refactoring, improving the quality of documentation, expanding the community, and targeting new hardware platforms.
To further the sustainability and usability of scikit-learn by reducing the maintenance backlog and extending its machine learning models and pipelines to support more complex datasets.
To improve the bcbio-nextgen toolkit, focusing on maintaining existing variant calling functionality and extending support for structural and RNA-seq variant analyses.
To enable Matplotlib to continue as the core plotting library of the scientific Python ecosystem for researchers in biomedical imaging, microscopy, and genomics by addressing the maintenance backlog and beginning Matplotlib's evolution to meet the community’s visualization challenges for the next decade.
To support the continued maintenance, growth, development, and community engagement of Matplotlib, the foundational plotting library of the Scientific Python Ecosystem.
To enable Matplotlib to continue as the core plotting library of the scientific Python ecosystem by addressing the maintenance backlog and planning Matplotlib's evolution to meet the community’s visualization challenges for the next decade.
To improve and maintain high-performance tools for analysis of biological systems at the molecular scale and incentivize scientists to drive transparent and reproducible research.
To grow the MDAnalysis community sustainably.
MetaDocencia provides online computational and scientific training to Spanish-speaking researchers, teachers, and professionals throughout Latin America.
To support the open source µManager optical microscopy acquisition platform to improve its architecture, infrastructure, and support to ensure many years of growth, both in user base and capabilities.
To construct a solid foundation for the next generation of Protege using a modern web stack that will make Protege easier to maintain, extend, and — crucially — make it easier for third parties to contribute to the code base.
To modernize the igraph interfaces to make network analysis easier.
To keep ChimeraX molecular and microscopy analysis software current with the latest technology and facilitate the migration of tens of thousands of Chimera users to ChimeraX.
To provide open-source, interoperable, and extensible statistical software for quantitative mass spectrometry, which enables experimentalists and developers of statistical methods to rapidly respond to changes in the evolving biotechnological landscape.
To support the Bio-Formats user community and develop new formats to make proprietary file formats obsolete.
To enable the analysis of thousands of next generation data-independent acquisition (DIA) mass spectrometry measurements by implementing algorithms, visualization tools, and cloud containers based on OpenMS and the OpenSWATH algorithm.
To keep the current momentum of initiatives and push forward with new actions for accessibility, internationalization, mentorships and ambassadors.
To support a fast-growing community building software for infrastructure-agnostic, open source biomedical pipelines.
To continue support for a fast-growing community building open source software for infrastructure-agnostic biomedical analysis workflows.
To coordinate and foster next-generation file formats while increasing community access to public imaging data.
To solidify NiPreps by boosting community growth, securing maintenance, and developing new components to expand the diversity of supported data such as imaging parameters, modalities, populations, and species.
To partner with hack.diversity to serve as curriculum designers and mentors, equipping their Fellows to contribute to open source web-based medical imaging, as well as mentor existing global OHIF contributors.
To develop, upgrade, and migrate documentation; perform software maintenance; and provide community support for the Open Health Imaging Foundation web-based medical imaging framework.
Open Science Training and Mentoring
This project will improve accessibility, interoperability, efficiency, and sustainability of the biomedical image registration software elastix by providing complete support of Python, better integration with other software, code improvements, and a focus on community.
To improve accessibility, interoperability, efficiency, and sustainability of the biomedical image registration software elastix, by making it a library-first package, allowing integration with other software and improving its performance.
To develop open medical ontologies and analytics that enable large-scale generation of real-world evidence on disease and the effects of medical interventions across the world’s electronic health data.
To support development, outreach, and user support for the kallisto RNA-seq and single-cell RNA-seq software project.
To create a sustainable open-source community resource for deep annotation of genetic variants.
This team will support the continued development of OpenMM to better serve its broad biomolecular modeling community, as well as support its extension to integrate machine learning that will enable genomic-scale biomolecular modeling, simulation, and prediction.
To continue to diversify contributors by building capacity in project management, as well as offering internships and eliminating cultural or linguistic biases in the OpenRefine tool.
To develop accurate, fast, and researcher-friendly open tools for creating and simulating neuromuscular and musculoskeletal models to address biomedical questions in human and animal mobility.
To improve the usability, computational performance, maintenance and outreach of the open source software OpenSim and to support the education and training of its users around the world.
oSTEM empowers LGBTQ+ individuals in STEM to succeed personally, academically, and professionally
To hire an outreach coordinator and (part of) a software developer for the Apollo genome annotation editor, provide developer- and user-oriented workshops and training, develop a plugin framework, and integrate protein visualizations.
Outreachy provides internships in open source to people subject to systemic bias and underrepresented in tech.
To upgrade the interactive documentation experience of IPython and Jupyter to allow inline graphs, navigation, and indexing, and to support features currently only available on hosted websites.
To develop an open Python library and community between spectroscopy and fluorescence microscopy users that is both accessible and self-sustainable in the long term.
To develop an end-to-end, predictive computational ecosystem for quantitative spatiotemporal modeling of spatial and single-cell multiomics.
To improve technology and research culture to support a more open and participatory ecosystem for the public review of preprints.
To support the maintenance, development and dissemination of ImgLib2 and BigDataViewer, key infrastructural software components for visualization and analysis of large image data on the Java-based platforms Fiji, KNIME, and Icy.
Sharing Methods to Accelerate Reproducible Science
This project will solidify the use and future development of the cross-language igraph package to support network analysis in all scientific domains, including biomedical and life sciences.
To create a smoother experience for PsychoPy users by detecting potential problems in user-created experiments before they are launched, and by increasing the test coverage within the application code itself.
To enable researchers to more deeply interrogate complex biomedical images by improving the extensibility, robustness, and interoperability of QuPath.
To accelerate biomedical research, biomarker discovery, and the translation of artificial intelligence into clinical practice by enhancing the QuPath open source platform and by integrating it with other CZI-funded software.
To hire a dedicated Communities Champion, run code sprints and contributor workshops, and translate our desktop app into additional languages.
This team will build a real-time data model for Jupyter to lay the foundation for real-time collaboration on notebooks.
To significantly speed up IQ-TREE to enable real-time genomic epidemiology during ongoing outbreaks such as COVID-19, and to introduce continuous integration and a testing framework to ease software maintenance for all developers.
To attract new users and contributors to the VisPy project through software improvements, community outreach, and instructional materials.
Reproducibility for Everyone runs workshops to train life sciences researchers in reproducibility tools and best practices.
To establish teaching material, improve documentation, and minimize maintenance effort of the Bioconda project by extending automation of code review, testing, and building.
To support the Research Software Alliance (ReSA) in bringing research software communities together to collaborate on the advancement of the research software ecosystem.
To meet the needs of the scientific community over the next decade, this team will revitalize NetworkX — the fundamental network analysis tool in Python — by growing its developer community, refactoring code, improving performance, and making a major release.
rOpenSci helps develop R packages for the sciences via community-driven learning, review and maintenance of R software
To advance support and development of the open source Salmon and Alevin software for gene expression quantification of single-cell and bulk RNA-seq.
To establish Zarr as a foundation for scientific data storage, with clear data format and protocol specifications, implementations in multiple programming languages, and a community process for evolving to support new scientific applications.
To refactor Orange Data Mining toolbox to include the latest Python libraries for parallel, server-based data analysis, allowing it to scale to large biomedical datasets.
To attract new contributors by improving OpenRefine's documentation, and implement a new data model to improve the scalability, transparency, and reproducibility of OpenRefine workflows.
To provide dedicated community support and maintenance for the Dask project and support growth in the biological sciences field.
To expand Scanpy’s core infrastructure and community platforms for stability, versatility, and sharing knowledge.
To support the creation of resource tables for preprints on bioRxiv.
To support the interoperability, standardization, and accessibility of core libraries and thus expand the global participation of scientific communities in using and contributing to Python tools.
To improve the open-source machine learning library scikit-learn and aid in maintaining the project, while considering the new implementation of gradient boosting.
To better serve biomedical applications, SciPy will add important new features, perform essential maintenance, and disseminate the work to biomedical researchers and software developers.
To maintain and further develop a community resource for probabilistic analysis of single-cell omics data, including an application interface for rapid development of new probabilistic models.
To turn SPAdes and QUAST codebases into scalable, modular, extensible and user-friendly frameworks that will streamline future research and development in genome assembly, analysis and quality assessment.
To improve sparse structures in SciPy so they support array semantics, to deprecate SciPy’s sparse metrics, and to assist with sparse array adoption in downstream ecosystem packages.
To strengthen the social and code foundations of the Nibabel library by extending the API and input/output to better support metadata, supporting outputs from image registration, and through educational outreach.
To extend DIPY’s registration framework for generic use in biomedical research, add parallel computing, strengthen maintenance, expand documentation, and improve educational capabilities.
To grow the maturity of the NumPy project through governance, documentation, and website work by improving the robustness of its links with OpenBLAS, and through diversifying the core team beyond the developer role.
To improve Monocle with algorithms, statistical methods, and web-based visualization tools that will enable biologists using single-cell genomics to extract and disseminate new insights from their experiments.
To expand and strengthen the NetworkX developer community, reinforce connections with the scientific Python ecosystem, improve documentation and training materials for users, and refine development infrastructure and process.
To enable the biomedical community to more easily use large-scale computing to efficiently run complex workflows via Parsl.
To improve the SymPy Python symbolic mathematics library in the key areas of performance, code generation, and documentation.
The Carpentries teaches researchers computational skills through a scalable and community-centered model.
To develop GATK methods for variant discovery and evaluation in bacteria, resolve inconsistent and diverse results from different research groups, and allow for the sharing of data and analysis globally to control bacterial transmission and antibiotic resistance.
To ensure the continued health of the pandas library by dedicating resources specifically to maintenance and implementing consistent missing data handling for all data types.
To improve Percolator, the dominant software for analyzing spectrum identifications produced by protein tandem mass spectrometry, by making the software faster, more robust, and applicable to more types of mass spectrometry data.
To improve conda-forge and bioconda’s sustainability and transparency by adopting vendor-agnostic and secure infrastructure practices and develop comprehensive maintenance metrics and dashboards.
To cultivate the next generation of biomedical open source software data scientists through recruitment, mentorship, and training of underrepresented students.
To support and further develop a library for high-performance scientific visualization in Python by maintaining the VisPy package and improving documentation within the community.
To maintain and enhance seqr, a high quality rare disease genomic analysis platform, for usage across the global scientific community, enabling collaboration for rare disease diagnosis and gene discovery.
To grow the use of Xarray in the biosciences as a foundational data model and computational toolkit for multidimensional labeled arrays.
To establish Zarr as a common, cross-community mechanism for storing collections of annotated tensors with consistent access for both local and large-scale cloud data.