Run scANVI on Seurat's Assay5 object through IntegrateLayers
Source: R/scANVI.R
scANVIIntegration.Rd
A wrapper to run scANVI
on multi-layered Seurat V5 object.
Requires a conda environment with scvi-tools
and necessary dependencies
Recommendations: use raw counts and all features
(features = Features(object), layers = "counts"
)
Usage
scANVIIntegration(
object,
groups = NULL,
groups.name = NULL,
labels.name = NULL,
labels.null = NULL,
features = NULL,
layers = "counts",
scale.layer = "scale.data",
conda_env = NULL,
new.reduction = "integrated.scANVI",
reduction.key = "scANVIlatent_",
torch.intraop.threads = 4L,
torch.interop.threads = NULL,
model.save.dir = NULL,
ndims.out = 10,
n_hidden = 128L,
n_layers = 1L,
dropout_rate = 0.1,
dispersion = c("gene", "gene-batch", "gene-label", "gene-cell"),
gene_likelihood = c("zinb", "nb", "poisson"),
linear_classifier = FALSE,
max_epochs = NULL,
train_size = 0.9,
batch_size = 128L,
seed.use = 42L,
verbose = TRUE,
verbose.scvi = c("INFO", "NOTSET", "DEBUG", "WARNING", "ERROR", "CRITICAL"),
...
)
Arguments
- object
A
Seurat
object (or anAssay5
object if not called byIntegrateLayers
)- groups
A named data frame with grouping information. Can also contain cell labels to guide scANVI.
- groups.name
Column name from
groups
data frame that stores grouping information. Ifgroups.name = NULL
, the first column is used- labels.name
Column name from
groups
data frame that stores cell label information. Iflabels.name = NULL
, all cells are assigned the same label.- labels.null
One value of
groups$labels.name
that indicates unlabeled observations.labels.null = NULL
means all labels are valid. Only applies whenlabels.name != NULL
.- features
Vector of feature names to input to the integration method. When
features = NULL
(default), theVariableFeatures
are used. To pass all features, use the output ofFeatures()
- layers
Name of the layers to use in the integration. 'counts' is highly recommended
- scale.layer
Name of the scaled layer in
Assay
- conda_env
Path to conda environment to run scANVI (should also contain the scipy python module). By default, uses the conda environment registered for scANVI in the conda environment manager
- new.reduction
Name of the new integrated dimensional reduction
- reduction.key
Key for the new integrated dimensional reduction
- torch.intraop.threads
Number of intra-op threads available to torch when training on CPU instead of GPU. Set via
torch.set_num_threads()
.- torch.interop.threads
Number of intra-op threads available to torch when training on CPU instead of GPU. Set via
torch.set_num_interop_threads()
. Can only be changed once, on first call.- model.save.dir
Path to a directory to save the model to. Uses
SCANVI.save()
. Does not save anndata. Note that neither the trainer optimizer state nor the trainer history are saved.model.save.dir = NULL
(default) disables saving the model.- ndims.out
Number of dimensions for
new.reduction
output. Corresponds ton_latent
argument in the original API of SCANVINumber of nodes per hidden layer.
- n_layers
Number of hidden layers used for encoder and decoder NNs.
- dropout_rate
Dropout rate for neural networks.
- dispersion
One of the following:
gene
: dispersion parameter of NB is constant per gene across cells (default)gene-batch
: dispersion can differ between different batchesgene-label
: dispersion can differ between different labelsgene-cell
: dispersion can differ for every gene in every cell
- gene_likelihood
One of the following:
zinb
: Zero-inflated negative binomial distribution (default)nb
: Negative binomial distributionpoisson
: Poisson distribution
- linear_classifier
When switched to
TRUE
, uses a single linear layer for classification instead of a multi-layer perceptron.- max_epochs
Number of passes through the dataset for semisupervised training.
- train_size
Size of training set in the range
[0.0, 1.0]
- batch_size
Minibatch size to use during training.
- seed.use
An integer to generate reproducible outputs. Set
seed.use = NULL
to disable- verbose
Print messages. Set to
FALSE
to disable- verbose.scvi
Verbosity level of scANVI. From quietest to talkiest: CRITICAL, ERROR, WARNING, INFO (default), DEBUG, NOTSET
- ...
Additional arguments to be passed to
scvi.model.SCANVI
,SCANVI.setup_anndata
orSCANVI.train
(see Details section)
Value
A list containing:
a new DimReduc of name
new.reduction
(key set toreduction.key
) consisting of the latent space of the model withndims.out
dimensions.
When called via IntegrateLayers
, a Seurat object with
the new reduction and/or assay is returned
Details
This wrappers calls three python functions through reticulate. Find the scVANVI-specific arguments there:
model initiation: scvi.model.SCANVI, which relies on scvi.module.SCANVAE which in turn relies on scvi.module.VAE
anndata setup: SCANVI.setup_anndata
training: SCANVI.train
Note
This function requires the scvi-tools package to be installed (along with scipy)
References
Kingma, D. P., Rezende, D. J., Mohamed, S. & Welling, M. Semi- Supervised Learning with Deep Generative Models. Preprint at arXiv (2014). DOI
Xu, C., Lopez, R., Mehlman, E., Regier, J., Jordan, M. I. & Yosef, N. Probabilistic harmonization and annotation of single‐cell transcriptomics data with deep generative models. Molecular Systems Biology 17, (2021). DOI
Examples
if (FALSE) { # \dontrun{
# Preprocessing
obj <- SeuratData::LoadData("pbmcsca")
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$Method)
obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj)
obj <- ScaleData(obj)
obj <- RunPCA(obj)
# After preprocessing, we integrate layers:
obj <- IntegrateLayers(object = obj, method = scANVIIntegration,
features = Features(obj), conda_env = 'scvi-tools',
layers = 'counts', groups = obj[[]], groups.name = 'Method',
labels.name = 'CellType', labels.null = 'Unassigned')
# To enable saving the model, add other 'nuisance' factors and increase number of threads used:
obj <- IntegrateLayers(object = obj, method = scANVIIntegration,
features = Features(obj), conda_env = 'scvi-tools',
layers = 'counts', groups = obj[[]], groups.name = "Method",
labels.name = "CellType", labels.null = "Unassigned",
categorical_covariate_keys = "Experiment",
continuous_covariate_keys = "percent.mito",
ncores = 8, model.save.dir = '~/Documents/scANVI.model')
} # }