Today, the Chan Zuckerberg Initiative announced the funding and building of one of the largest computing systems dedicated to nonprofit life science research in the world. This new effort will provide the scientific community with access to predictive models of healthy and diseased cells, which will lead to groundbreaking new discoveries that could help cure, prevent, or manage all diseases by the end of this century. The high-performance computing cluster, which is planned to comprise 1,000+ GPUs, will enable AI and large language models for biomedicine at scale.
“AI is creating new opportunities in biomedicine, and building a high-performance computing cluster dedicated to life science research will accelerate progress on important scientific questions about how our cells work,” said CZI Co-founder and Co-CEO Mark Zuckerberg. “Developing digital models capable of predicting all cell types and cell states from the genome will help researchers better understand our cells and how they behave in health and disease.”
The increasing complexity, size and accessibility of scientific datasets, as well as the rapid rise of scalable AI and machine learning methods, create a unique opportunity to apply advances in large language models (LLMs) to biomedicine. AI systems such as AlphaFold and ESM have already made significant contributions to studying human biology. High-performance computing (HPC), built on significant investments in GPUs either on premises or in the cloud, will provide the necessary support for the ever-increasing size of LLMs and spur their adoption across the life sciences. Currently, scaled and robust infrastructure is cost-prohibitive for many organizations, especially academic research institutions. The CZI-funded GPU cluster will be one of the first to power openly available models of human cells, allowing researchers to collaboratively accelerate their work.
“Bringing the power of generative AI to biology at scale will allow researchers to incorporate these technological advances into their work, which will accelerate efforts to cure, prevent, or manage all disease,” said CZI Co-founder and Co-CEO Priscilla Chan. “AI models could predict how an immune cell responds to an infection, what happens at the cellular level when a child is born with a rare disease, or even how a patient’s body will respond to a new medication. We hope that this collaborative effort will generate new insights about the fundamental characteristics of our cells.”
These predictive models will be trained on datasets such as those integrated into the Chan Zuckerberg CELL by GENE (CZ CELLxGENE) software tool, which comprises the largest corpus of standardized single-cell datasets, with more than 50 million cells. Other data sources include resources generated by CZ Science research institutes, such as the protein location and interaction atlas OpenCell and the cell atlas Tabula Sapiens, built by the Chan Zuckerberg Biohub San Francisco. Large imaging datasets from the Chan Zuckerberg Institute for Advanced Biological Imaging (CZ Imaging Institute) will also be included, as well as publicly available datasets.
“Developing a virtual biology simulator is a natural evolution of our work in science over the past seven years,” said CZI Head of Science Stephen Quake. “We have supported researchers to generate and annotate standardized, representative datasets; built tools to integrate these datasets and make them widely available — and, through our scientific institutes, we’ve built a new model for the kind of collaboration required to undertake this ambitious vision of building predictive cell models. CZ Science has employed many AI tools in its research for years, and this focus will unify our collective efforts to create a field-wide resource for better understanding cells and cell systems.”
Current applications of AI developed by CZI’s science technology team include CellGuide, a free, interactive encyclopedia with definitions generated by ChatGPT. CellGuide quickly gives researchers key information about over 700 cell types and sub-cell types, including definitions, canonical and computational marker genes, an expandable ontology tree visualization of a cell’s lineage, and relevant datasets. The CZ Imaging Institute, in partnership with CZI’s science technology team, is prototyping a cloud-based, open-source Cryo-ET Data Portal aimed at driving the development of automated annotations of cryo-ET datasets.
“CZI’s science technology team brings a wealth of knowledge and experience in partnering with researchers to understand their challenges and build technology that makes new science possible,” said CZI Vice President of Science Technology Patricia Brennan. “Projects like CELLxGENE have already proven to be widely useful for the field in accelerating single-cell research, with about 75% of the data originating from researchers beyond CZI who are helping us grow this data corpus. With these new AI-driven cellular models, we hope to build shared, collaborative resources that drive future breakthroughs.”
CZ Science institutes bring together interdisciplinary researchers to pursue ambitious scientific challenges that couldn’t be accomplished in conventional environments. In 2022, science, technology and AI leaders launched the Kempner Institute for the Study of Natural & Artificial Intelligence at Harvard University, where researchers are studying the basis of intelligence in natural and artificial systems. CZI is supporting the CZ Biohub Network in purchasing the equipment. The CZ Biohub Network has a team dedicated to HPC that supports its research, and that team will bring its expertise to designing and standing up this new computing system.
Read more from CZI Co-founders and Co-CEOs about creating AI tools to accelerate biological research in MIT Technology Review.
About the Chan Zuckerberg Initiative
The Chan Zuckerberg Initiative was founded in 2015 to help solve some of society’s toughest challenges — from eradicating disease and improving education, to addressing the needs of our communities. Through collaboration, providing resources and building technology, our mission is to help build a more inclusive, just and healthy future for everyone. For more information, please visit chanzuckerberg.com.
About the Chan Zuckerberg Biohub Network
The Chan Zuckerberg Biohub Network is a group of nonprofit research institutes that bring together physicians, scientists, and engineers with the goal of pursuing grand scientific challenges on a 10- to 15-year time horizon. These institutes partner with Chan Zuckerberg Science in its goal to understand the mysteries of the cell and how cells interact within systems. This collaboration will bring us closer to our mission to cure, prevent, or manage all disease by the end of the century. To learn more, visit www.czbiohub.org.
Chan Zuckerberg Initiative, CZ Biohub Network
Jeff MacGregor, +1 650-304-9728