A wrapper to run bbknn on a multi-layered Seurat V5 object.
Requires a conda environment with bbknn and its necessary dependencies.
Usage
bbknnIntegration(
  object,
  orig,
  groups = NULL,
  groups.name = NULL,
  layers = "data",
  scale.layer = "scale.data",
  conda_env = NULL,
  new.graph = "bbknn",
  new.reduction = "pca.bbknn",
  reduction.key = "bbknnPCA_",
  reconstructed.assay = "bbknn.ridge",
  ndims = 50L,
  ndims.use = 30L,
  ridge_regression = T,
  graph.use = c("connectivities", "distances"),
  verbose = TRUE,
  seed.use = 42L,
  ...
)
Arguments
- object
A Seurat object (or an Assay5 object if not called by IntegrateLayers)
- orig
DimReduc object. Not to be set directly when called with IntegrateLayers; use the orig.reduction argument instead
- groups
A named data frame with grouping information. Preferably one-column when groups.name = NULL
- groups.name
Column name from the groups data frame that stores grouping information. If groups.name = NULL, the first column is used
- layers
Name of the layers to use in the integration
- scale.layer
Name of the scaled layer in Assay
- conda_env
Path to the conda environment used to run bbknn (it should also contain the scipy python module). By default, uses the conda environment registered for bbknn in the conda environment manager
- new.graph
Name of the Graph object
- new.reduction
Name of the new integrated dimensional reduction
- reduction.key
Key for the new integrated dimensional reduction
- reconstructed.assay
Name for the assay containing the corrected expression matrix
- ndims
Number of dimensions for the new PCA computed on the first output of bbknn. 50 by default. Ignored when ridge_regression = FALSE
- ndims.use
Number of dimensions from orig to use for bbknn, and from the newly computed PCA when ridge_regression = TRUE
- ridge_regression
When set to TRUE (default), new clusters are computed on the output of bbknn, then a ridge regression is performed to remove technical variables while preserving biological variables. Then, a new bbknn run is performed
- graph.use
Which graph(s) of bbknn to output. At least one of "connectivities" or "distances". If both are provided (default) and ridge_regression = TRUE, the first one ("connectivities" by default, recommended) is used as input for computing clusters
- verbose
Print messages. Set to FALSE to disable
- seed.use
An integer to generate reproducible outputs. Set seed.use = NULL to disable
- ...
Additional arguments to be passed to bbknn.bbknn(). When ridge_regression = TRUE, also accepts arguments to pass to Seurat::FindClusters(), Seurat::RunPCA() and bbknn.ridge_regression(). See Details section
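The rule for groups and groups.name can be illustrated with a small base R sketch. The helper resolve_groups() below is hypothetical and not part of the package; it only mimics the documented behaviour (use the named column, or fall back to the first column when groups.name = NULL) on toy metadata.

```r
# Hypothetical helper (not part of the package): mimics the documented rule
# for picking the grouping column from `groups`.
resolve_groups <- function(groups, groups.name = NULL) {
  stopifnot(is.data.frame(groups), ncol(groups) >= 1)
  if (is.null(groups.name)) {
    groups.name <- colnames(groups)[1]  # documented fallback: first column
  }
  if (!groups.name %in% colnames(groups)) {
    stop("Column '", groups.name, "' not found in `groups`")
  }
  # Keep the row names so each value can be matched back to its cell
  setNames(as.character(groups[[groups.name]]), rownames(groups))
}

# Toy metadata: 4 cells, two candidate grouping columns
meta <- data.frame(
  Method = c("10x", "10x", "Smart-seq2", "Smart-seq2"),
  Donor  = c("d1", "d2", "d1", "d2"),
  row.names = paste0("cell", 1:4)
)

by_default <- resolve_groups(meta)           # falls back to "Method"
by_name    <- resolve_groups(meta, "Donor")  # explicit column
```

Passing a one-column data frame, or always setting groups.name explicitly, avoids depending on column order in the metadata.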
Value
A list containing at least one of:
- 1 or 2 new Graph(s) of name [new.graph]_scale.data_[graph.use] corresponding to the output(s) of the first run of bbknn
- a new Assay of name reconstructed.assay with corrected counts for each feature from scale.layer
- a new DimReduc (PCA) of name new.reduction (key set to reduction.key)
- 1 or 2 new Graph(s) of name [new.graph]_ridge.residuals_[graph.use] corresponding to the output(s) of the second run of bbknn
[graph.use] can take two values (either "connectivities" or "distances"), depending on the graph.use parameter.
When called via IntegrateLayers, a Seurat object with the new reduction and/or assay is returned
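As a quick way to anticipate which graphs to look for, the naming pattern described in this section can be reproduced with plain string operations. The helper expected_graph_names() below is hypothetical (it is not exported by the package) and assumes the names follow the bracketed pattern literally, with underscores as separators.

```r
# Hypothetical helper (not part of the package): builds graph names from the
# pattern [new.graph]_scale.data_[graph.use] and
# [new.graph]_ridge.residuals_[graph.use].
expected_graph_names <- function(new.graph = "bbknn",
                                 graph.use = c("connectivities", "distances"),
                                 ridge_regression = TRUE) {
  # Accept one or both of the allowed values
  graph.use <- match.arg(graph.use, several.ok = TRUE)
  first_run <- paste(new.graph, "scale.data", graph.use, sep = "_")
  if (!ridge_regression) {
    return(first_run)  # only the first bbknn run produces graphs
  }
  second_run <- paste(new.graph, "ridge.residuals", graph.use, sep = "_")
  c(first_run, second_run)
}

all_graphs  <- expected_graph_names()
single_pass <- expected_graph_names("bbknn", "distances", ridge_regression = FALSE)
```

The resulting strings could then be compared against the graph names actually stored in the returned object.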
Details
This wrapper calls three Python functions through reticulate. Their bbknn-specific arguments are documented there:
- bbknn function: bbknn.bbknn, which relies on bbknn.matrix.bbknn
- ridge regression: bbknn.ridge_regression, which relies on sklearn.linear_model.Ridge
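To give an intuition for the ridge regression step, here is a minimal base R sketch on simulated data. It is not the package's or bbknn's implementation (the real computation is delegated to the Python functions listed above); the closed-form solver, the penalty alpha = 1 and the simulated effect sizes are all assumptions made for illustration. The idea shown is: regress expression on one-hot batch and cluster indicators, then subtract only the part explained by batch.

```r
set.seed(42)
n <- 200
batch   <- factor(rep(c("A", "B"), each = n / 2))    # technical variable
cluster <- factor(rep(c("c1", "c2"), times = n / 2)) # biological variable

# One simulated feature: cluster effect + batch effect + noise
y <- 2 * (cluster == "c2") + 3 * (batch == "B") + rnorm(n, sd = 0.1)
Y <- matrix(y, ncol = 1)

# One-hot design matrices without intercept
X_batch   <- model.matrix(~ 0 + batch)
X_cluster <- model.matrix(~ 0 + cluster)
X <- cbind(X_batch, X_cluster)

# Closed-form ridge solution: (X'X + alpha * I)^-1 X'Y (alpha is an assumption)
alpha <- 1
beta <- solve(crossprod(X) + alpha * diag(ncol(X)), crossprod(X, Y))

# Remove only the batch-explained part; the cluster signal is left in place
beta_batch  <- beta[seq_len(ncol(X_batch)), , drop = FALSE]
Y_corrected <- Y - X_batch %*% beta_batch

# Gap between batch means before and after correction
gap_before <- mean(Y[batch == "B"]) - mean(Y[batch == "A"])
gap_after  <- mean(Y_corrected[batch == "B"]) - mean(Y_corrected[batch == "A"])
```

In this toy setting gap_after is expected to be much smaller than gap_before, while the difference between cluster means is left essentially unchanged.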
Note
This function requires the bbknn package to be installed (along with scipy)
References
Polański, K., Young, M. D., Miao, Z., Meyer, K. B., Teichmann, S. A. & Park, J.-E. BBKNN: fast batch alignment of single cell transcriptomes. Bioinformatics 36, 964–965 (2019).
Examples
if (FALSE) { # \dontrun{
library(Seurat)  # provides NormalizeData(), RunPCA(), IntegrateLayers(), etc.

# Preprocessing: split the RNA assay into one layer per sequencing method
obj <- SeuratData::LoadData("pbmcsca")
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$Method)
obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj)
obj <- ScaleData(obj)
obj <- RunPCA(obj)

# After preprocessing, we integrate layers:
obj <- IntegrateLayers(object = obj, method = bbknnIntegration,
                       conda_env = 'bbknn', groups = obj[[]],
                       groups.name = 'Method')

# To disable the ridge regression and subsequent steps:
obj <- IntegrateLayers(object = obj, method = bbknnIntegration,
                       conda_env = 'bbknn', groups = obj[[]],
                       groups.name = 'Method', ridge_regression = FALSE)
} # }