Many computational efforts in support of the Human Cell Atlas are underway in the research community, and new methods are needed. This RFA aims to support further development and systematic comparison of methods across existing and new benchmark datasets derived from single-cell RNA sequencing, bulk RNA sequencing, proteomics, image-based transcriptomics, and other tissue imaging approaches. The RFA will also support new analysis and visualization methods, and new approaches to integrating data across modalities. The goal is to support a diverse set of well-validated tools to analyze, consume, integrate, and explore Human Cell Atlas data.
To help the resulting tools reach the widest possible audience, scientists and engineers from the Chan Zuckerberg Initiative will collaborate with researchers on projects funded by this RFA to help bring tools to the scientific community, for example by helping to enhance or package software with an emphasis on scale, robustness, speed, interoperability, web-based dissemination, and user experience. There will also be opportunities for new tools to connect to and leverage the Human Cell Atlas Data Coordination Platform (https://www.humancellatlas.org/data-sharing), which provides infrastructure for data sharing and cloud computing.
This effort is also a pilot project for new models of collaborative computational research. With the assistance of the Chan Zuckerberg Initiative, project participants will be expected to share their proposals within the collaborative framework, attend regular meetings, workshops, and hackathons, and communicate their ongoing progress through GitHub and Slack. We welcome submissions that represent pre-existing collaborative efforts, but as part of the broader collaborative goals of this RFA, we require each principal investigator to submit a separate application, rather than serving as a co-principal investigator on a shared application.
The goals of this RFA include, but are not limited to:
- Developing standard formats and analysis pipelines for genomic, proteomic, and imaging data, in forms that enable consistent use of these pipelines by numerous experimental labs
- Identifying and solving common challenges for web-based interactive visualization of cellular and imaging data
- Developing user tools that allow scientists and physicians to extract and analyze data organized by genes, cells, or tissues of interest
- Supporting analytical methods and machine learning approaches to solving problems such as multimodal integration, inference of state transitions and developmental trajectories, and representation of spatial relationships at the cellular or molecular level
- Generating curated benchmark datasets from new or existing data for evaluating computational methods and designing future analysis competitions
- Developing new computational approaches to comparing and normalizing genomic and imaging data across assays, subjects, and species
- Generating experimental datasets that directly address computationally guided questions in quality control, reproducibility, or multimodal integration
Although the focus of the project is analysis of human data, we are interested in new ideas and will consider proposals that focus on data from human tissues, non-human animals, organoids, and cell lines. We encourage proposals from areas of machine learning entirely outside of computational biology, e.g., deep learning. Proposals will be evaluated based on the computational novelty and viability of the method, a commitment to collaboration, the intention to interoperate with existing efforts such as the Human Cell Atlas Data Coordination Platform, and a plan to ensure that software is sharable, portable, and reproducible.