Statistical Analysis and Comprehension of the Human Cell Atlas in R / Bioconductor: Access and Scalable Infrastructure

Focus: Bioconductor

Project Goal

To provide R / Bioconductor software that offers a coherent programmatic interface to the Human Cell Atlas (HCA) and enables scalable, interactive statistical analysis of single-cell data.

Results & Resources

The main aim of this project was to implement fast, efficient algorithms that scale to billions of cells, with a particular focus on k-means clustering. To this end, the team developed an open-source implementation of the mini-batch k-means algorithm as the R / Bioconductor package mbkmeans. They also added HDF5 support to the clusterExperiment package and helped create an open-source online e-book and a review paper that help researchers analyze large single-cell datasets with Bioconductor.
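
For readers unfamiliar with the technique, the following is a minimal sketch of the general mini-batch k-means idea: each iteration draws a small random batch, assigns its points to the nearest centroid, and nudges that centroid with a per-centroid learning rate of 1/count. It is written in plain Python for illustration only and is not the mbkmeans implementation or its API. The function names (`mini_batch_kmeans`, `nearest`) and all parameters are hypothetical.

```python
import random


def nearest(point, centroids):
    """Return the index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centroids[j])),
    )


def mini_batch_kmeans(points, k, batch_size=32, n_iter=100, seed=0):
    """Illustrative mini-batch k-means; `points` is a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct randomly chosen data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k  # number of points each centroid has absorbed so far

    for _ in range(n_iter):
        # Only a small random batch is touched per iteration, so the full
        # dataset never has to be processed at once.
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch first, then update the centroids.
        assignments = [nearest(p, centroids) for p in batch]
        for p, j in zip(batch, assignments):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-centroid learning rate
            centroids[j] = [(1 - eta) * c + eta * x for c, x in zip(centroids[j], p)]
    return centroids


if __name__ == "__main__":
    data_rng = random.Random(1)
    # Two synthetic, well-separated groups of 2-D points (made-up data).
    data = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(200)]
    data += [(data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)) for _ in range(200)]
    for centroid in mini_batch_kmeans(data, k=2):
        print(centroid)
```

Because each update reads only one batch, the memory needed per step depends on the batch size rather than on the total number of cells. That property is what makes the approach a natural fit for data kept on disk, for example in HDF5 files.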


Lead Investigator

Davide Risso