Linearly regresses each principal component on the batch variable, as a proxy for batch mixing. The resulting R2 values are then weighted by each dimension's contribution to the variance.

Usage

ScoreRegressPC(
  object,
  batch.var = NULL,
  reduction = "pca",
  dims = NULL,
  adj.r2 = FALSE,
  weight.by = c("var", "stdev"),
  assay = NULL,
  layer = NULL
)

AddScoreRegressPC(
  object,
  integration,
  batch.var = NULL,
  reduction = "pca",
  dims = NULL,
  adj.r2 = FALSE,
  weight.by = c("var", "stdev"),
  assay = NULL,
  layer = NULL
)

Arguments

object

A Seurat object

batch.var

The name of the batch variable (must be in the object metadata)

reduction

The name of the reduction to score

dims

The dimensions to consider. All dimensions are used by default

adj.r2

Whether to use the adjusted R2 instead of the raw R2

weight.by

One of 'var' (default) or 'stdev', standing for variance and standard deviation respectively. Selects whether the variance or the standard deviation explained by the principal components is used to weight each PC's score.

assay

Assay to use. Passed to Seurat to automatically construct the batch.var when it is not provided. Ignored otherwise

layer

Layer to use. Passed to Seurat to automatically construct the batch.var when it is not provided. Ignored otherwise

integration

name of the integration to score

Value

ScoreRegressPC: A single float corresponding to the score of the given reduction

AddScoreRegressPC: the updated Seurat object with the regression PCA score set for the integration.

Details

For each dimension i, the principal component is linearly regressed on the batch variable: $$PC_i = Batch$$

The score is computed as follows: $$\sum_{i=1}^{p} \left ( R^2_i \times V_i \right )$$

For a PCA with p dimensions, \(PC_i\) is the i-th principal component, \(R^2_i\) is the R squared coefficient of the linear regression for dimension i, and \(V_i\) is the proportion of variance explained by \(PC_i\).
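
The formula above can be illustrated with a minimal base R sketch. It is not the package's implementation: it runs on synthetic data with prcomp() and lm(), and the helper names r2_per_pc and pc_weights are hypothetical. It also assumes that the weights are normalised over the selected dimensions only, which the text above does not specify.

```r
# Illustrative sketch only: synthetic data, not the package's implementation.
set.seed(42)

n <- 200
batch <- factor(rep(c("A", "B"), each = n / 2))  # hypothetical batch labels

# Toy matrix with 10 features; the first two are shifted in batch "B"
x <- matrix(rnorm(n * 10), nrow = n)
x[batch == "B", 1:2] <- x[batch == "B", 1:2] + 2

pca <- prcomp(x, center = TRUE, scale. = TRUE)
dims <- 1:5

# R2 (raw or adjusted) of the regression PC_i ~ batch, one value per dimension
r2_per_pc <- function(embedding, batch, adjusted = FALSE) {
  apply(embedding, 2, function(pc) {
    fit <- summary(lm(pc ~ batch))
    if (adjusted) fit$adj.r.squared else fit$r.squared
  })
}

# Weights: share of variance or standard deviation among the selected PCs
# (assumption: normalised so that the weights sum to 1 over `dims`)
pc_weights <- function(sdev, by = c("var", "stdev")) {
  by <- match.arg(by)
  w <- if (by == "var") sdev^2 else sdev
  w / sum(w)
}

r2 <- r2_per_pc(pca$x[, dims], batch)
adj_r2 <- r2_per_pc(pca$x[, dims], batch, adjusted = TRUE)

score_var <- sum(r2 * pc_weights(pca$sdev[dims], "var"))
score_sd  <- sum(r2 * pc_weights(pca$sdev[dims], "stdev"))
score_adj <- sum(adj_r2 * pc_weights(pca$sdev[dims], "var"))
```

In this sketch a larger value means that the batch variable explains more of the variance captured by the selected components. Because the adjusted R2 of a model with an intercept never exceeds the raw R2, score_adj cannot exceed score_var.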

Note

This score is an adaptation of the principal component regression (PCR) score from Luecken M.D. et al., 2022.

References

Luecken, M. D., Büttner, M., Chaichoompu, K., Danese, A., Interlandi, M., Mueller, M. F., Strobl, D. C., Zappia, L., Dugas, M., Colomé-Tatché, M. & Theis, F. J. Benchmarking atlas-level data integration in single-cell genomics. Nat Methods 19, 41–50 (2022).

See also

ScoreDensityPC for an alternative, and ScoreRegressPC.CellCycle to regress PCs on cell cycle scores.

Examples

if (FALSE) { # \dontrun{
obj <- SeuratData::LoadData("pbmcsca")
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$Method)
obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj)
obj <- ScaleData(obj)
obj <- RunPCA(obj)

# Score the first 30 PCs, with the raw and then the adjusted R2
score.r2 <- ScoreRegressPC(obj, "Method", "pca", dims = 1:30)
score.adj.r2 <- ScoreRegressPC(obj, "Method", "pca", dims = 1:30, adj.r2 = TRUE)

score.r2         # ~ 0.1147
score.adj.r2     # ~ 0.1145
} # }