Score a dimensionality reduction embedding or knn graph using the Local Inverse Simpson Index

Compute the Local Inverse Simpson's Index (LISI) to estimate batch mixing or cell type mixing (iLISI and cLISI, respectively, following Luecken M.D. et al., 2022).

AddScoreLISI returns an updated Seurat object, while ScoreLISI outputs the raw LISI scores for each cell.
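
To make the quantity concrete, here is a minimal base R sketch of the plain inverse Simpson index computed on the labels of one cell's neighbours. It is a simplification, not the functions' implementation: every neighbour gets the same weight, and the names inverse_simpson, well_mixed and unmixed are hypothetical.

```r
# Inverse Simpson index of a vector of labels: 1 / sum(p_i^2),
# where p_i is the proportion of neighbours carrying label i.
inverse_simpson <- function(labels) {
  p <- table(labels) / length(labels)
  1 / sum(p^2)
}

# Hypothetical neighbourhoods of one cell, described by the batch of each neighbour
well_mixed <- c("A", "B", "A", "B", "A", "B")
unmixed    <- c("A", "A", "A", "A", "A", "A")

inverse_simpson(well_mixed)  # 2: two batches equally represented
inverse_simpson(unmixed)     # 1: a single batch
```

The index ranges from 1 (all neighbours share one label) to the number of distinct labels (all labels equally represented). Applied to batch labels, high values indicate good mixing; applied to cell type labels, low values indicate that cell types stay separated.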

Usage

AddScoreLISI(
  object,
  integration,
  batch.var = NULL,
  cell.var = NULL,
  reduction,
  dims = NULL,
  graph.name,
  graph.type = c("distances", "connectivities"),
  do.symmetrize = TRUE,
  save.graph = TRUE,
  new.graph = NULL,
  perplexity = 30,
  tol = 1e-05,
  do.scale = TRUE,
  largest.component.only = FALSE,
  assay = NULL,
  verbose = TRUE,
  ...
)

ScoreLISI(
  object,
  batch.var = NULL,
  cell.var = NULL,
  reduction,
  dims = NULL,
  graph.name,
  graph.type = c("distances", "connectivities"),
  do.symmetrize = TRUE,
  return.graph = FALSE,
  perplexity = 30,
  tol = 1e-05,
  largest.component.only = FALSE,
  assay = NULL,
  verbose = TRUE,
  ...
)

Arguments

object: A Seurat object
integration: name of the integration to score
batch.var: The name of the batch variable (must be in the object metadata). Can be omitted if cell.var is not NULL
cell.var: The name of the cell variable (must be in the object metadata). Can be omitted if batch.var is not NULL
reduction: The name of the reduction to score. Arguments reduction and graph.name are mutually exclusive
dims: The dimensions of reduction to consider. All dimensions are used by default. Ignored when scoring a graph
graph.name: The name of the knn graph to score. Arguments reduction and graph.name are mutually exclusive
graph.type: one of 'distances' or 'connectivities' (not supported yet). Ignored when scoring a cell embedding
do.symmetrize: whether to symmetrize the knn graphs. Setting it to FALSE is not recommended, especially when scoring a knn graph directly (see Details)
save.graph: whether to save the graph used to compute the LISI score(s) in the Seurat object
new.graph: name of the graph used to compute the LISI score(s). When new.graph = NULL (default), a name is constructed depending on input and arguments. Ignored when save.graph = FALSE
perplexity: a third of the number of neighbours of each cell. When the number of neighbours in the graph differs from 3 * perplexity, the graph is adapted. Scores computed with different values of perplexity are not comparable, so it is recommended to use the same value for every integration to score.
tol: Stop the computation of the local Simpson's index when it converges to this tolerance.
do.scale: whether to scale the output LISI values between 0 and 1.
largest.component.only: whether to compute the scores on the largest component or all sufficiently large components (default)
assay: the name of the assay to reference in the output Graph object (when save.graph = TRUE)
verbose: whether to print progress messages
...: Additional parameters to pass to other methods (see Details).
return.graph: whether to return the graph used to compute the LISI score(s)
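
The role of perplexity and tol can be illustrated with a base R sketch of perplexity-based neighbour weighting, as used in the original LISI of Korsunsky et al. This assumes the adaptation documented here weights neighbours in a similar way, which the text above does not confirm. All function and variable names are hypothetical.

```r
# Normalised Gaussian-like weights and their Shannon entropy (natural log)
# for one cell, given distances d to its neighbours and a precision beta.
entropy_and_weights <- function(d, beta) {
  w <- exp(-(d - min(d)) * beta)   # shift by min(d) to avoid underflow
  p <- w / sum(w)
  h <- -sum(p[p > 0] * log(p[p > 0]))
  list(h = h, p = p)
}

# Bisection on beta until the entropy matches log(perplexity) within tol
# (assumption: tol plays this role, as in the original LISI)
weights_for_perplexity <- function(d, perplexity, tol = 1e-5, max_iter = 50) {
  target <- log(perplexity)
  beta <- 1
  lo <- 0
  hi <- Inf
  res <- entropy_and_weights(d, beta)
  for (i in seq_len(max_iter)) {
    if (abs(res$h - target) < tol) break
    if (res$h > target) {
      # weights too flat: increase beta
      lo <- beta
      beta <- if (is.finite(hi)) (beta + hi) / 2 else beta * 2
    } else {
      # weights too peaked: decrease beta
      hi <- beta
      beta <- (beta + lo) / 2
    }
    res <- entropy_and_weights(d, beta)
  }
  res$p
}

# Toy cell with 9 neighbours (3 * perplexity) from three batches
d <- seq(0.1, 3, length.out = 9)
batch <- rep(c("A", "B", "C"), times = 3)
p <- weights_for_perplexity(d, perplexity = 3)

# Weighted inverse Simpson index over batch labels
lisi <- 1 / sum(tapply(p, batch, sum)^2)
```

Because the weights depend on the target perplexity, the same neighbourhood yields different LISI values for different perplexities, which is why scores should only be compared at a fixed value.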

Value

ScoreLISI: a data frame with each cell's raw LISI score, or a list containing the aforementioned data frame and the graph used to compute it (return.graph = TRUE).

AddScoreLISI: the updated Seurat object, with cell-wise LISI scores in the metadata (identical to ScoreLISI's output), global scores in misc and a new Graph object when save.graph = TRUE.

Details

When scoring a reduction, a knn graph with enough neighbours per cell is computed. If do.symmetrize = TRUE, the graph is symmetrized and the k best neighbours are kept.

When scoring a knn graph, the graph is expanded with Dijkstra's algorithm to reach enough neighbours per cell. If do.symmetrize = TRUE, the graph is symmetrized beforehand. When do.symmetrize = FALSE, Dijkstra's algorithm is first applied to the asymmetric distance matrix and the components are computed. Each component is then symmetrized and Dijkstra's algorithm is run again on each of them, because there is no guarantee that each cell keeps its number of neighbours once the directed graph is decomposed. Hence, when do.symmetrize = FALSE and a graph is input, the graph is treated as directed only to find components.

In either case, it is recommended to keep do.symmetrize = TRUE.
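
The effect of symmetrization on graph expansion can be sketched in base R on a toy graph. This is an illustration under stated assumptions, not the package's code: the dijkstra helper is hypothetical, and symmetrizing by keeping an edge present in either direction (pmax) is only one possible rule.

```r
# Dijkstra's algorithm on a weighted adjacency matrix (0 = no edge).
# Returns the shortest-path distance from `source` to every node.
dijkstra <- function(adj, source) {
  n <- nrow(adj)
  dist <- rep(Inf, n)
  dist[source] <- 0
  visited <- rep(FALSE, n)
  for (step in seq_len(n)) {
    candidates <- which(!visited)
    u <- candidates[which.min(dist[candidates])]
    if (is.infinite(dist[u])) break   # remaining nodes are unreachable
    visited[u] <- TRUE
    for (v in which(adj[u, ] > 0)) {
      alt <- dist[u] + adj[u, v]
      if (alt < dist[v]) dist[v] <- alt
    }
  }
  dist
}

# Toy directed knn graph (k = 1) on 4 cells: 1 -> 2, 2 -> 3, 3 -> 2, 4 -> 3
knn <- matrix(0, 4, 4)
knn[1, 2] <- 1
knn[2, 3] <- 2
knn[3, 2] <- 2
knn[4, 3] <- 1.5

# Assumed symmetrization rule: keep an edge if it exists in either direction
sym <- pmax(knn, t(knn))

d_directed <- dijkstra(knn, source = 2)  # Inf 0 2 Inf
d_sym      <- dijkstra(sym, source = 2)  # 1 0 2 3.5
```

On the directed graph, cell 2 can only reach cell 3, so expansion cannot find more neighbours for it. After symmetrization every cell is reachable, which is one reason to keep do.symmetrize = TRUE.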

For possible additional parameters, see FindNeighbors (when inputting a reduction) or ExpandNeighbours (when inputting a knn graph).

Note

This score is an adaptation of the LISI score as described in Korsunsky I. et al., 2019 and also used in Luecken M.D. et al., 2022.

References

Korsunsky, I., Millard, N., Fan, J., Slowikowski, K., Zhang, F., Wei, K., Baglaenko, Y., Brenner, M., Loh, P. & Raychaudhuri, S. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat Methods 16, 1289–1296 (2019).

Luecken, M. D., Büttner, M., Chaichoompu, K., Danese, A., Interlandi, M., Mueller, M. F., Strobl, D. C., Zappia, L., Dugas, M., Colomé-Tatché, M. & Theis, F. J. Benchmarking atlas-level data integration in single-cell genomics. Nat Methods 19, 41–50 (2022).