Statistical Analysis and Comprehension of the Human Cell Atlas in R / Bioconductor: Access and Scalable Infrastructure
To provide R / Bioconductor software that offers a coherent programmatic interface to the HCA and enables scalable, interactive statistical analysis of single-cell data.
Results & Resources
The main aim of this project was to implement fast, efficient algorithms that scale to billions of cells, with a particular focus on k-means clustering. To this end, the project team developed an open-source implementation of the mini-batch k-means algorithm, released as the R / Bioconductor package mbkmeans. They also added HDF5 support to the clusterExperiment package and helped create an open-source online e-book and a review paper to help researchers analyze large single-cell datasets with Bioconductor.
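To illustrate why mini-batch k-means suits very large datasets, the following is a minimal, language-independent sketch of the general algorithm, written in plain Python. It is not the mbkmeans implementation and does not use its API; the function names, parameters, and default values (mini_batch_kmeans, nearest_center, batch_size, n_iter, seed) are assumptions made for this example. The key idea is that each iteration touches only a small random batch of cells, so the full dataset never has to be processed at once.

```python
import random


def nearest_center(point, centers):
    """Return the index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centers[j])),
    )


def mini_batch_kmeans(points, k, batch_size=32, n_iter=100, seed=0):
    """Illustrative mini-batch k-means; all parameter names are hypothetical.

    points: list of equal-length numeric tuples (one per cell).
    Returns a list of k centers, each a list of floats.
    """
    rng = random.Random(seed)
    # Initialise the centers from k distinct randomly chosen points.
    centers = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k  # number of points assigned to each center so far

    for _ in range(n_iter):
        # Draw a small random batch instead of scanning the whole dataset.
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign every batch point using the centers as they stand now.
        assignments = [nearest_center(p, centers) for p in batch]
        # Move each assigned center toward its point with a per-center
        # learning rate of 1 / count, which keeps a running mean.
        for p, j in zip(batch, assignments):
            counts[j] += 1
            eta = 1.0 / counts[j]
            centers[j] = [(1 - eta) * c + eta * x for c, x in zip(centers[j], p)]
    return centers


if __name__ == "__main__":
    # Toy data: two well-separated groups of 2-D points.
    data_rng = random.Random(1)
    data = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(100)]
    data += [(data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)) for _ in range(100)]
    print(mini_batch_kmeans(data, k=2))
```

Because each update depends only on the current batch, the batch could in principle be read from disk on demand, which is the kind of access pattern that on-disk formats such as HDF5 are meant to support.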