Compute a score based on the adjusted Rand index (ARI) between a clustering result and one or more cell type label variables. A score of 0 corresponds to a random clustering and a score of 1 to a clustering that perfectly matches the cell type labels.
Usage
ScoreARI(object, cell.var, clust.var = "seurat_clusters")
AddScoreARI(object, integration, cell.var, clust.var = "seurat_clusters")
Arguments
- object
A Seurat object
- cell.var
The name(s) of the column(s) holding the cell type labels (must be in the object metadata). Multiple column names are accepted.
- clust.var
The name of the column holding the cluster assignment of each cell (must be in the object metadata). Only one column name is accepted.
- integration
Name of the integration to score
Value
ScoreARI
: a named array with as many values as there are common strings between cell.var and the column names of the object's metadata. Names are cell.var and values are ARI.
AddScoreARI
: the updated Seurat object with the ARI score(s) set for the integration.
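To illustrate the shape of the value returned by ScoreARI, here is a minimal base R sketch that mimics the described behaviour on a plain data frame standing in for the object's metadata. The toy data, the `ari` helper and the column names `cell_type`, `cell_subtype` and `not_a_column` are assumptions made for illustration, not the package's internal code.

```r
# Toy stand-in for a Seurat object's metadata (hypothetical data)
meta <- data.frame(
  seurat_clusters = c(1, 1, 2, 2, 3, 3),
  cell_type       = c("B", "B", "T", "T", "NK", "NK"),
  cell_subtype    = c("B1", "B2", "T", "T", "NK", "NK")
)

# Compact ARI helper (hypothetical, written from the formula in Details)
ari <- function(x, y) {
  tab <- table(x, y)                          # contingency table
  pairs <- function(v) sum(choose(v, 2))      # sum of "choose 2" terms
  sum_a <- pairs(rowSums(tab))
  sum_b <- pairs(colSums(tab))
  expected <- sum_a * sum_b / choose(length(x), 2)
  (pairs(tab) - expected) / ((sum_a + sum_b) / 2 - expected)
}

cell.var  <- c("cell_type", "cell_subtype", "not_a_column")
clust.var <- "seurat_clusters"

# Keep only the requested names that are present in the metadata
found <- intersect(cell.var, colnames(meta))

# One ARI per retained column, named after the column
scores <- vapply(found, function(v) ari(meta[[v]], meta[[clust.var]]),
                 numeric(1))
scores
```

In this sketch `not_a_column` is silently dropped, so `scores` has two named entries, one per column of `cell.var` found in the metadata.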
Details
The ARI is the Rand index (RI) corrected for chance: $$\displaystyle ARI = \frac{RI - RI_{expected}}{\max(RI) - RI_{expected}}$$
More precisely, a contingency table is computed from the two variables \(L\) and \(C\), which partition the samples into \(r\) and \(s\) groups respectively. For \(i \in [\![1,r]\!]\) and \(j \in [\![1,s]\!]\), \(n_{ij}\) is the number of samples (i.e. cells) common to \(L_i\) and \(C_j\), \(a_i\) is the number of samples in \(L_i\), \(b_j\) is the number of samples in \(C_j\), and \(n = \sum_{ij} n_{ij}\) is the total number of samples. The ARI is: $$\displaystyle ARI = \frac{\left. \sum_{ij} \binom{n_{ij}}{2} - \left(\sum_i \binom{a_i}{2} \sum_j \binom{b_j}{2}\right) \right/ \binom{n}{2} }{ \left. \frac{1}{2} \left(\sum_i \binom{a_i}{2} + \sum_j \binom{b_j}{2}\right) - \left(\sum_i \binom{a_i}{2} \sum_j \binom{b_j}{2}\right) \right/ \binom{n}{2}}$$
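The contingency-table formula can be followed step by step in a few lines of base R. This is a minimal sketch under stated assumptions, not the package's implementation: the function name `ari_sketch` and the toy label vectors are hypothetical, and the sketch returns `NaN` when the denominator is zero (for example when both variables have a single group).

```r
# Adjusted Rand index between two label vectors (illustrative sketch)
ari_sketch <- function(labels, clusters) {
  stopifnot(length(labels) == length(clusters))
  tab <- table(labels, clusters)               # n_ij
  pairs <- function(x) sum(choose(x, 2))       # sum of binomial(x, 2)

  sum_ij <- pairs(tab)                         # sum_ij choose(n_ij, 2)
  sum_a  <- pairs(rowSums(tab))                # sum_i choose(a_i, 2)
  sum_b  <- pairs(colSums(tab))                # sum_j choose(b_j, 2)

  expected  <- sum_a * sum_b / choose(length(labels), 2)
  max_index <- (sum_a + sum_b) / 2

  (sum_ij - expected) / (max_index - expected)
}

# Identical partitions (cluster ids may be permuted)
perfect <- ari_sketch(c("a", "a", "b", "b"), c(2, 2, 1, 1))

# One label group split across two clusters
partial <- ari_sketch(c("a", "a", "b", "b"), c(1, 1, 2, 3))

# Every cluster mixes both labels
mixed <- ari_sketch(c("a", "a", "b", "b"), c(1, 2, 1, 2))
```

Here `perfect` equals 1, `partial` equals 4/7 (about 0.571) and `mixed` equals -0.5, which shows that the unscaled formula can fall below 0 when the agreement is worse than expected by chance.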
References
Luecken, M. D., Büttner, M., Chaichoompu, K., Danese, A., Interlandi, M., Mueller, M. F., Strobl, D. C., Zappia, L., Dugas, M., Colomé-Tatché, M. & Theis, F. J. Benchmarking atlas-level data integration in single-cell genomics. Nat Methods 19, 41–50 (2021).