A wrapper to run bbknn on a multi-layered Seurat V5 object.
Requires a conda environment with bbknn and its dependencies.
Usage
bbknnIntegration(
  object,
  orig,
  groups = NULL,
  groups.name = NULL,
  layers = "data",
  scale.layer = "scale.data",
  conda_env = NULL,
  new.graph = "bbknn",
  new.reduction = "pca.bbknn",
  reduction.key = "bbknnPCA_",
  reconstructed.assay = "bbknn.ridge",
  ndims = 50L,
  ndims.use = 30L,
  ridge_regression = T,
  graph.use = c("connectivities", "distances"),
  verbose = TRUE,
  seed.use = 42L,
  ...
)
Arguments
- object
A Seurat object (or an Assay5 object if not called by IntegrateLayers)
- orig
DimReduc object. Not to be set directly when called with IntegrateLayers, use the orig.reduction argument instead
- groups
A named data frame with grouping information. Preferably one-column when groups.name = NULL
- groups.name
Column name from the groups data frame that stores grouping information. If groups.name = NULL, the first column is used
- layers
Name of the layers to use in the integration
- scale.layer
Name of the scaled layer in Assay
- conda_env
Path to conda environment to run bbknn (should also contain the scipy python module). By default, uses the conda environment registered for bbknn in the conda environment manager
- new.graph
Name of the Graph object
- new.reduction
Name of the new integrated dimensional reduction
- reduction.key
Key for the new integrated dimensional reduction
- reconstructed.assay
Name for the assay containing the corrected expression matrix
- ndims
Number of dimensions for the new PCA computed on first output of bbknn. 50 by default. Ignored when ridge_regression = FALSE
- ndims.use
Number of dimensions from orig to use for bbknn, and from newly computed PCA when ridge_regression = TRUE.
- ridge_regression
When set to TRUE (default), new clusters are computed on the output of bbknn, then a ridge regression is performed to remove technical variables while preserving biological variables. Then, a new bbknn run is performed.
- graph.use
Which graph(s) of bbknn to output. At least one of "connectivities" or "distances". If both are provided (default) and ridge_regression = TRUE, the first one ("connectivities" by default, recommended) is used as input for computing clusters.
- verbose
Print messages. Set to FALSE to disable
- seed.use
An integer to generate reproducible outputs. Set seed.use = NULL to disable
- ...
Additional arguments to be passed to bbknn.bbknn(). When ridge_regression = TRUE, also accepts arguments to pass to Seurat::FindClusters(), Seurat::RunPCA() and bbknn.ridge_regression(). See Details section
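The fallback described for groups and groups.name can be illustrated with a small base R sketch. The helper resolve_groups() and the toy meta data frame are hypothetical and not part of the package; the sketch also assumes that "named" means the data frame rows are named by cell.

```r
# Hypothetical helper (not part of the package): picks the grouping column
# the way the groups / groups.name arguments are described above.
resolve_groups <- function(groups, groups.name = NULL) {
  stopifnot(is.data.frame(groups))
  # Fall back to the first column when no column name is given
  if (is.null(groups.name)) {
    groups.name <- colnames(groups)[1]
  }
  stopifnot(groups.name %in% colnames(groups))
  # Return one group label per cell, named by the data frame row names
  setNames(as.character(groups[[groups.name]]), rownames(groups))
}

# Toy metadata standing in for obj[[]] (assumed values, for illustration)
meta <- data.frame(
  Method = c("10x", "10x", "Smart-seq2"),
  Sample = c("s1", "s2", "s3"),
  row.names = c("cell1", "cell2", "cell3")
)

resolve_groups(meta)            # uses the first column, Method
resolve_groups(meta, "Sample")  # uses the named column
```

Passing a one-column data frame avoids any ambiguity about which column is picked when groups.name is left as NULL.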
Value
A list containing at least one of:
- 1 or 2 new Graph(s) of name [new_graph]_scale.data_[graph.use] corresponding to the output(s) of the first run of bbknn
- a new Assay of name reconstructed.assay with corrected counts for each feature from scale.layer.
- a new DimReduc (PCA) of name new.reduction (key set to reduction.key)
- 1 or 2 new Graph(s) of name [new_graph]_ridge.residuals_[graph.use] corresponding to the output(s) of the second run of bbknn
[graph.use] can take two values (either "connectivities" or "distances"), depending on the graph.use parameter.
When called via IntegrateLayers, a Seurat object with the new reduction and/or assay is returned
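The naming pattern for the returned graphs can be spelled out with a short base R sketch. The helper expected_graph_names() is hypothetical; it only pastes together the pattern written above and assumes that "scale.data" and "ridge.residuals" appear literally in the names, which may not hold if a different scale.layer is used.

```r
# Hypothetical helper (not part of the package): builds graph names following
# the [new_graph]_scale.data_[graph.use] and
# [new_graph]_ridge.residuals_[graph.use] patterns described above.
expected_graph_names <- function(new.graph = "bbknn",
                                 graph.use = c("connectivities", "distances"),
                                 ridge_regression = TRUE) {
  stopifnot(
    length(graph.use) >= 1,
    all(graph.use %in% c("connectivities", "distances"))
  )
  # The second bbknn run only happens when ridge regression is enabled
  stages <- c("scale.data", if (ridge_regression) "ridge.residuals")
  unlist(lapply(stages, function(stage) {
    paste(new.graph, stage, graph.use, sep = "_")
  }))
}

expected_graph_names()
expected_graph_names(graph.use = "connectivities", ridge_regression = FALSE)
```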
Details
This wrapper calls three Python functions through reticulate. Find the bbknn-specific arguments there:
- bbknn function: bbknn.bbknn, which relies on bbknn.matrix.bbknn
- ridge regression: bbknn.ridge_regression, which relies on sklearn.linear_model.Ridge
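To give an intuition for the ridge regression step, here is a conceptual base R sketch. It is not the code run by bbknn.ridge_regression: the function ridge_remove_batch(), the penalty lambda = 1 and the intercept-free model are all assumptions made for illustration. The idea is to fit batch (technical) and cluster (biological) indicators jointly, then subtract only the part explained by batch.

```r
# Conceptual sketch only (assumed model, not the package's implementation).
# expr: cells x features numeric matrix; batch, cluster: one label per cell.
ridge_remove_batch <- function(expr, batch, cluster, lambda = 1) {
  batch <- factor(batch)
  cluster <- factor(cluster)
  # One-hot encode the technical (batch) and biological (cluster) variables
  X_batch <- model.matrix(~ 0 + batch)
  X_cluster <- model.matrix(~ 0 + cluster)
  X <- cbind(X_batch, X_cluster)
  # Ridge solution without intercept: (X'X + lambda * I)^-1 X'Y
  beta <- solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, expr))
  # Subtract only the contribution of the batch columns
  batch_idx <- seq_len(ncol(X_batch))
  expr - X_batch %*% beta[batch_idx, , drop = FALSE]
}

# Toy data: 40 cells, 3 features, with a shift of 5 added to batch "b"
set.seed(1)
batch <- rep(c("a", "b"), each = 20)
cluster <- rep(c("x", "y"), times = 20)
expr <- matrix(rnorm(40 * 3), nrow = 40)
expr[batch == "b", ] <- expr[batch == "b", ] + 5

corrected <- ridge_remove_batch(expr, batch, cluster)
```

Including the cluster indicators in the fit is what lets the batch term be removed without also removing differences between clusters, which matches the description of preserving biological variables.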
Note
This function requires the bbknn Python package to be installed (along with scipy).
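Before running the integration, the requirement above can be checked from R. The sketch below is a hypothetical helper, not part of the package; it assumes the reticulate package is installed and that conda_env names (or points to) an existing conda environment.

```r
# Hypothetical helper (not part of the package): activates a conda
# environment and checks that the Python modules named in the Note exist.
check_bbknn_env <- function(conda_env) {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    stop("The reticulate package is required")
  }
  # Fail early if the environment cannot be found or activated
  reticulate::use_condaenv(conda_env, required = TRUE)
  modules <- c("bbknn", "scipy")
  available <- vapply(modules, reticulate::py_module_available, logical(1))
  if (!all(available)) {
    stop("Missing Python module(s): ",
         paste(modules[!available], collapse = ", "))
  }
  invisible(TRUE)
}

# Usage, assuming an environment called 'bbknn' exists:
# check_bbknn_env("bbknn")
```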
References
Polański, K., Young, M. D., Miao, Z., Meyer, K. B., Teichmann, S. A. & Park, J.-E. BBKNN: fast batch alignment of single cell transcriptomes. Bioinformatics 36, 964–965 (2019).
Examples
if (FALSE) { # \dontrun{
# Preprocessing
obj <- SeuratData::LoadData("pbmcsca")
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$Method)
obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj)
obj <- ScaleData(obj)
obj <- RunPCA(obj)
# After preprocessing, we integrate layers:
obj <- IntegrateLayers(object = obj, method = bbknnIntegration,
                       conda_env = 'bbknn', groups = obj[[]],
                       groups.name = 'Method')
# To disable the ridge regression and subsequent steps:
obj <- IntegrateLayers(object = obj, method = bbknnIntegration,
                       conda_env = 'bbknn', groups = obj[[]],
                       groups.name = 'Method', ridge_regression = FALSE)
} # }
