Score a corrected or uncorrected PCA to estimate batch mixing
Source: R/metrics_pca.R
score-regressPC.Rd
Linearly regresses each principal component on the batch variable as a proxy for batch mixing. The resulting R2 values are then weighted by each dimension's contribution to variance.
Usage
ScoreRegressPC(
object,
batch.var = NULL,
reduction = "pca",
dims = NULL,
adj.r2 = FALSE,
weight.by = c("var", "stdev"),
assay = NULL,
layer = NULL
)
AddScoreRegressPC(
object,
integration,
batch.var = NULL,
reduction = "pca",
dims = NULL,
adj.r2 = FALSE,
weight.by = c("var", "stdev"),
assay = NULL,
layer = NULL
)
Arguments
- object
A Seurat object
- batch.var
The name of the batch variable (must be in the object metadata)
- reduction
The name of the reduction to score
- dims
The dimensions to consider. All dimensions are used by default
- adj.r2
Whether to use the adjusted R2 instead of the raw R2
- weight.by
One of 'var' (default) or 'stdev' (variance and standard deviation, respectively). Selects whether each PC's score is weighted by the variance or by the standard deviation explained by that principal component.
- assay
assay to use. Passed to Seurat to automatically construct the batch.var when not provided. Ignored otherwise
- layer
layer to use. Passed to Seurat to automatically construct the batch.var when not provided. Ignored otherwise
- integration
name of the integration to score
Value
ScoreRegressPC: A single float corresponding to the score of the given reduction
AddScoreRegressPC: the updated Seurat object with the regression PCA score set for the integration.
Details
The linear regression is $$PC_i = Batch$$
The score is computed as follows: $$\sum_{i=1}^{p} \left ( R^2_i \times V_i \right )$$
For a PCA with p dimensions, \(PC_i\) is the i-th principal component, \(R^2_i\) is the coefficient of determination (R squared) of the linear regression for dimension i, and \(V_i\) is the proportion of variance explained by \(PC_i\).
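The weighted sum can be illustrated with a minimal base R sketch. It is not the package's implementation: `regress_pc_score` is a hypothetical helper, the data are synthetic, and the weights are assumed to be normalised over the selected dimensions so that they sum to 1.

```r
# Hypothetical helper illustrating the score; not the package's own code.
# embedding: cells x dimensions matrix of PC coordinates
# batch:     batch label of each cell
# stdev:     standard deviation of each dimension
regress_pc_score <- function(embedding, batch, stdev, adj.r2 = FALSE,
                             weight.by = c("var", "stdev")) {
  weight.by <- match.arg(weight.by)
  batch <- as.factor(batch)
  # R2 (or adjusted R2) of the regression of each PC on the batch variable
  r2 <- apply(embedding, 2, function(pc) {
    fit <- summary(lm(pc ~ batch))
    if (adj.r2) fit$adj.r.squared else fit$r.squared
  })
  # Assumption: weights are proportions computed over the selected dimensions
  w <- if (weight.by == "var") stdev^2 else stdev
  w <- w / sum(w)
  sum(r2 * w)
}

# Synthetic data: 200 cells, 5 features, 2 batches
set.seed(1)
n <- 200
batch <- rep(c("a", "b"), each = n / 2)
x <- matrix(rnorm(n * 5), nrow = n)
x[, 1] <- x[, 1] + ifelse(batch == "a", 3, -3)  # strong batch effect on one feature

pca <- prcomp(x)
score <- regress_pc_score(pca$x, batch, pca$sdev)
score.adj <- regress_pc_score(pca$x, batch, pca$sdev, adj.r2 = TRUE)
```

Under these assumptions the score lies between 0 and 1, and a higher value means that more of the variance captured by the PCA is explained by batch, that is, poorer batch mixing. The adjusted variant can never exceed the raw one, because the adjusted R2 is at most the raw R2 for every dimension.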
Note
This score is an adaptation of the principal component regression (PCR) score from Luecken M.D. et al., 2022.
References
Luecken, M. D., Büttner, M., Chaichoompu, K., Danese, A., Interlandi, M., Mueller, M. F., Strobl, D. C., Zappia, L., Dugas, M., Colomé-Tatché, M. & Theis, F. J. Benchmarking atlas-level data integration in single-cell genomics. Nat Methods 19, 41–50 (2022).
See also
ScoreDensityPC for an alternative and ScoreRegressPC.CellCycle to regress PCs on cell cycle scores.
Examples
if (FALSE) { # \dontrun{
obj <- SeuratData::LoadData("pbmcsca")
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$Method)
obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj)
obj <- ScaleData(obj)
obj <- RunPCA(obj)
# 'dims' is spelled out in full rather than relying on R's partial argument matching
score.r2 <- ScoreRegressPC(obj, "Method", "pca", dims = 1:30)
score.adj.r2 <- ScoreRegressPC(obj, "Method", "pca", dims = 1:30, adj.r2 = TRUE)
score.r2 # ~ 0.1147
score.adj.r2 # ~ 0.1145
} # }