A wrapper to run bbknn on a multi-layered Seurat V5 object. Requires a conda environment with bbknn and its dependencies.

Usage

bbknnIntegration(
  object,
  orig,
  groups = NULL,
  groups.name = NULL,
  layers = "data",
  scale.layer = "scale.data",
  conda_env = NULL,
  new.graph = "bbknn",
  new.reduction = "pca.bbknn",
  reduction.key = "bbknnPCA_",
  reconstructed.assay = "bbknn.ridge",
  ndims = 50L,
  ndims.use = 30L,
  ridge_regression = T,
  graph.use = c("connectivities", "distances"),
  verbose = TRUE,
  seed.use = 42L,
  ...
)

Arguments

object

A Seurat object (or an Assay5 object if not called by IntegrateLayers)

orig

DimReduc object. Do not set it directly when calling through IntegrateLayers; use the orig.reduction argument instead

groups

A named data frame with grouping information. Preferably one-column when groups.name = NULL

groups.name

Column name from groups data frame that stores grouping information. If groups.name = NULL, the first column is used
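The column selection rule for groups and groups.name can be illustrated with a small base R sketch. It assumes that "named" means the data frame's row names are the cell names; the metadata values and the helper pick_groups() are hypothetical and are not part of the package.

```r
# Hypothetical per-cell metadata; row names stand in for cell names
groups <- data.frame(
  Method = c("10x", "10x", "Smart-seq2", "Smart-seq2"),
  Donor = c("d1", "d2", "d1", "d2"),
  row.names = paste0("cell", 1:4),
  stringsAsFactors = FALSE
)

# Mimic the documented rule: use groups.name when given,
# otherwise fall back to the first column
pick_groups <- function(groups, groups.name = NULL) {
  col <- if (is.null(groups.name)) colnames(groups)[1] else groups.name
  stopifnot(col %in% colnames(groups))
  setNames(groups[[col]], rownames(groups))
}

batches_default <- pick_groups(groups)        # falls back to "Method"
batches_donor <- pick_groups(groups, "Donor") # explicit column
```

Passing a one-column data frame avoids any ambiguity when groups.name is left as NULL.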

layers

Name of the layers to use in the integration

scale.layer

Name of the scaled layer in Assay

conda_env

Path to the conda environment used to run bbknn (it must also contain the scipy Python module). By default, uses the conda environment registered for bbknn in the conda environment manager

new.graph

Name of the Graph object

new.reduction

Name of the new integrated dimensional reduction

reduction.key

Key for the new integrated dimensional reduction

reconstructed.assay

Name for the assay containing the corrected expression matrix

ndims

Number of dimensions for the new PCA computed on first output of bbknn. 50 by default. Ignored when ridge_regression = FALSE

ndims.use

Number of dimensions from orig to use for bbknn, and from newly computed PCA when ridge_regression = TRUE.

ridge_regression

When set to TRUE (default), new clusters are computed on the output of bbknn, then a ridge regression is performed to remove technical variables while preserving biological variables. Then, a new bbknn run is performed.
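The idea behind the ridge regression step can be sketched in base R. This is a simplified illustration and not the code run by bbknn.ridge_regression(): it assumes one technical variable (batch) and one biological variable (cluster), uses a closed-form ridge fit without an intercept, and picks an arbitrary penalty. All data and variable names below are made up.

```r
# Toy scaled expression matrix: 6 cells x 2 features (random values)
set.seed(1)
expr <- matrix(
  rnorm(12), nrow = 6,
  dimnames = list(paste0("cell", 1:6), c("geneA", "geneB"))
)
batch <- factor(c("b1", "b1", "b1", "b2", "b2", "b2"))   # technical variable
cluster <- factor(c("c1", "c2", "c1", "c2", "c1", "c2")) # biological variable

# One-hot encode both variables and fit them jointly
X_batch <- model.matrix(~ 0 + batch)
X_cluster <- model.matrix(~ 0 + cluster)
X <- cbind(X_batch, X_cluster)

lambda <- 1 # ridge penalty (arbitrary choice for this sketch)

# Closed-form ridge solution: (X'X + lambda * I)^-1 X'Y
coefs <- solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, expr))

# Subtract only the contribution of the technical columns,
# leaving the part explained by the biological variable in place
tech_idx <- seq_len(ncol(X_batch))
tech_effect <- X_batch %*% coefs[tech_idx, , drop = FALSE]
corrected <- expr - tech_effect
```

Fitting both variables together is what lets the model attribute variance to the clusters instead of removing it along with the batch effect.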

graph.use

Which graph(s) of bbknn to output. At least one of "connectivities" or "distances". If both are provided (default) and ridge_regression = TRUE, the first one ("connectivities" by default, recommended) is used as input for computing clusters.

verbose

Print messages. Set to FALSE to disable

seed.use

An integer to generate reproducible outputs. Set seed.use = NULL to disable

...

Additional arguments to be passed to bbknn.bbknn(). When ridge_regression = TRUE, also accepts arguments to pass to Seurat::FindClusters(), Seurat::RunPCA() and bbknn.ridge_regression(). See Details section

Value

A list containing at least one of:

  • 1 or 2 new Graph(s) of name [new.graph]_scale.data_[graph.use] corresponding to the output(s) of the first run of bbknn

  • a new Assay of name reconstructed.assay with corrected counts for each feature from scale.layer

  • a new DimReduc (PCA) of name new.reduction (key set to reduction.key)

  • 1 or 2 new Graph(s) of name [new.graph]_ridge.residuals_[graph.use] corresponding to the output(s) of the second run of bbknn

[graph.use] can take two values (either "connectivities" or "distances"), depending on the graph.use parameter.

When called via IntegrateLayers, a Seurat object with the new reduction and/or assay is returned
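The graph naming pattern described above can be spelled out with a short base R sketch. It only composes the names from the default argument values; it assumes that both runs contribute graphs when ridge_regression = TRUE, which the list above suggests but does not guarantee.

```r
# Default argument values taken from the Usage section
new.graph <- "bbknn"
graph.use <- c("connectivities", "distances")
ridge_regression <- TRUE

# Graph names for the first bbknn run
first_run <- paste0(new.graph, "_scale.data_", graph.use)

# Graph names for the second run, only relevant with ridge regression
second_run <- if (ridge_regression) {
  paste0(new.graph, "_ridge.residuals_", graph.use)
} else {
  character(0)
}

expected_graphs <- c(first_run, second_run)
print(expected_graphs)
```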

Details

This wrapper calls three Python functions through reticulate. The bbknn-specific arguments, notably those of bbknn.bbknn() and bbknn.ridge_regression(), are described in the bbknn documentation.

Note

This function requires the bbknn package to be installed (along with scipy)
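One way to check this requirement before integrating is to test whether both modules can be imported from the chosen environment. The sketch below uses reticulate directly and is independent of the package's own conda environment manager; the helper bbknn_env_ready() is hypothetical.

```r
# Hypothetical helper: returns TRUE only when both Python modules
# can be found in the given conda environment
bbknn_env_ready <- function(conda_env) {
  # Without reticulate there is no way to reach Python from R
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    return(FALSE)
  }
  ok <- tryCatch({
    # Errors if the environment cannot be found or activated
    reticulate::use_condaenv(conda_env, required = TRUE)
    reticulate::py_module_available("bbknn") &&
      reticulate::py_module_available("scipy")
  }, error = function(e) FALSE)
  isTRUE(ok)
}
```

A call such as bbknn_env_ready("bbknn") is best made in a fresh R session, because reticulate binds to a single Python installation per session.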

References

Polański, K., Young, M. D., Miao, Z., Meyer, K. B., Teichmann, S. A. & Park, J.-E. BBKNN: fast batch alignment of single cell transcriptomes. Bioinformatics 36, 964–965 (2019).

Examples

if (FALSE) { # \dontrun{
# Preprocessing
obj <- SeuratData::LoadData("pbmcsca")
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$Method)
obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj)
obj <- ScaleData(obj)
obj <- RunPCA(obj)

# After preprocessing, we integrate layers:
obj <- IntegrateLayers(object = obj, method = bbknnIntegration,
                       conda_env = 'bbknn', groups = obj[[]],
                       groups.name = 'Method')

# To disable the ridge regression and subsequent steps:
obj <- IntegrateLayers(object = obj, method = bbknnIntegration,
                       conda_env = 'bbknn', groups = obj[[]],
                       groups.name = 'Method', ridge_regression = FALSE)
} # }