Virtual Cells

We’re building AI-powered virtual cells to help scientists explore the molecular underpinnings of human health and disease.

We’re using AI to build virtual cells that can predict the behavior of healthy and diseased cells, with broad applications for biomedical research, disease diagnosis and therapeutic development. We believe this effort will deepen our understanding of human biology at the molecular level, bringing scientists closer to curing, preventing or managing all diseases by the end of this century.

Our Approach

Biological Data

Building virtual cells will require vast amounts of diverse, multimodal biological data. CZI supports these datasets in three ways: building open source software tools like Chan Zuckerberg CELLxGENE (CZ CELLxGENE), which make machine-learning-ready, high-quality data more accessible for scientific research; funding single-cell data generation; and creating scientific institutes that also generate data to advance cell biology.

Models & Applications

Our science technology team will partner with leading AI experts and academic researchers to build models that will help unlock the mysteries of cells and how cells interact within systems. These models and their outputs will be openly accessible to the scientific community via applications.

Compute Infrastructure

Training AI on enormous amounts of biological data will require a high-powered computing system. We’re building and funding one of the largest computing systems for nonprofit life science research, which will power the next generation of AI modeling for cell biology.


Enabling AI at scale for research will require close collaboration with the scientific community. With our network of grantees and collaborative research institutes, we have a history of bringing together experts across disciplines to take on some of the toughest, riskiest scientific challenges that can’t be addressed elsewhere. We’re also committed to making data, models and applications open source for research.

Initial Focus Areas

Visual representation of universal cell embeddings. | Photograph courtesy of Leskovec Lab

Universal Sequence-Based Embeddings

A foundational model representing DNA, RNA and proteins will enable the universal representation of cells across tissues, types of datasets, cell types, species, and more.

Visual representation of protein expressions over pre-existing reference markers. | Photograph courtesy of Emma Lundberg, Human Protein Atlas

Universal Embeddings of Cells and Organelles From Microscope Images

A foundational model representing protein localization, cellular structures and tissues, drawing on the Human Protein Atlas, will enable the universal representation of dynamic biological systems.

AI Residents at CZI | Photograph courtesy of Donghui Li

Building Virtual Cells With the Scientific Community

We’re introducing a new AI residency program to bring together AI and machine-learning leaders from the academic research community. Working with CZI scientists and engineers, these residents will lead projects that lay the foundation for virtual cell models, which will allow researchers to explore the molecular underpinnings of human health and disease.

We’re launching the first phase of the AI residency program with current collaborators and partners. We anticipate issuing a formal request for applications (RFA) for this program in the future. Join our mailing list to stay updated on the latest virtual cell news, collaboration opportunities and more.