Run Harmony on Seurat's Assay5 object through IntegrateLayers
Source: R/Harmony.R
A wrapper to run Harmony on a multi-layered Seurat V5 object.
Can be called via SeuratIntegrate::HarmonyIntegration() or HarmonyIntegration.fix().
Usage
HarmonyIntegration(
  object,
  orig,
  groups = NULL,
  groups.name = NULL,
  layers = NULL,
  scale.layer = "scale.data",
  features = NULL,
  new.reduction = "harmony",
  dims = NULL,
  key = "harmony_",
  seed.use = 42L,
  theta = NULL,
  sigma = 0.1,
  lambda = NULL,
  nclust = NULL,
  ncores = 1L,
  max_iter = 10,
  early_stop = TRUE,
  plot_convergence = FALSE,
  .options = harmony_options(),
  verbose = TRUE,
  ...
)
HarmonyIntegration.fix(...)
Arguments
- object
A Seurat object (or an Assay5 object if not called by IntegrateLayers)
- orig
DimReduc object. Not to be set directly when called with IntegrateLayers; use the orig.reduction argument instead
- groups
A named data frame with grouping information. Preferably one-column when groups.name = NULL
- groups.name
Column name from the groups data frame that stores grouping information. If groups.name = NULL, the first column is used
- layers
Ignored unless groups = NULL, then used to create the grouping variable to correct batch effects.
- scale.layer
Ignored
- features
Ignored
- new.reduction
Name of the new integrated dimensional reduction
- dims
Dimensions of dimensional reduction to use for integration. All used by default
- key
Prefix for the dimension names computed by harmony.
- seed.use
An integer to generate reproducible outputs. Set seed.use = NULL to disable
- theta
Diversity clustering penalty parameter. Specify for each variable in vars_use. Default theta=2. theta=0 does not encourage any diversity. Larger values of theta result in more diverse clusters.
- sigma
Width of soft k-means clusters. Default sigma=0.1. Sigma scales the distance from a cell to cluster centroids. Larger values of sigma result in cells assigned to more clusters. Smaller values of sigma make soft k-means clustering approach hard clustering.
- lambda
Ridge regression penalty. Default lambda=1. Bigger values protect against over correction. If several covariates are specified, then lambda can also be a vector which needs to be equal length with the number of variables to be corrected. In this scenario, each covariate level group will be assigned the scalars specified by the user. If set to NULL, harmony will start lambda estimation mode to determine lambdas automatically and try to minimize overcorrection (Use with caution still in beta testing).
- nclust
Number of clusters in the model. nclust=1 is equivalent to simple linear regression.
- ncores
Number of processors to be used for math operations when an optimized BLAS is available. If the BLAS does not support multithreading, this option has no effect. By default, ncores=1, which runs as a single-threaded process. Although Harmony supports multiple cores, it is not optimized for multithreading. Increase this number for large datasets only if single-core performance is not adequate.
- max_iter
Maximum number of rounds to run Harmony. One round of Harmony involves one clustering and one correction step.
- early_stop
Enable early stopping for harmony. The harmonization process will stop when the change in the objective function between corrections drops below 1e-4.
- plot_convergence
Whether to print the convergence plot of the clustering objective function. TRUE to plot, FALSE to suppress. This can be useful for debugging.
- .options
Setting advanced parameters of RunHarmony. This must be the result from a call to `harmony_options`. See ?`harmony_options` for parameters not listed above and more details.
- verbose
Print messages. Set to FALSE to disable
- ...
Ignored for HarmonyIntegration(), or all of the above for HarmonyIntegration.fix()
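The effect of sigma described above can be illustrated with a small base R sketch. This is a simplified illustration and not Harmony's actual implementation: it assumes that soft assignment weights are proportional to exp(-distance / sigma), and the helper `soft_assign` and the distances in `d` are hypothetical.

```r
# Simplified illustration only (assumption: weights proportional to exp(-distance / sigma)).
# soft_assign() is a hypothetical helper, not part of harmony or SeuratIntegrate.
soft_assign <- function(dist_to_centroids, sigma) {
  w <- exp(-dist_to_centroids / sigma)
  w / sum(w)  # normalise so the weights for one cell sum to 1
}

# Hypothetical distances from one cell to three cluster centroids
d <- c(0.10, 0.15, 0.60)

small_sigma <- soft_assign(d, sigma = 0.01)  # close to a hard assignment
large_sigma <- soft_assign(d, sigma = 1)     # membership spread over clusters

print(round(small_sigma, 3))
print(round(large_sigma, 3))
```

With the small sigma nearly all of the weight falls on the nearest centroid, while the large sigma spreads membership across the three clusters.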
Value
The function itself returns a list containing:
a new DimReduc named new.reduction (key set to key) with a corrected cell embeddings matrix of length(dims) columns.
When called via IntegrateLayers, a Seurat object with the new reduction is returned.
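As a rough picture of the shape of the returned embeddings, the base R sketch below builds a stand-in matrix with length(dims) columns. It assumes that columns are named by pasting the key prefix to a running index, which is how Seurat usually names DimReduc columns; the values are random placeholders, not Harmony output.

```r
# Hypothetical stand-in for a corrected embedding matrix (random values, not Harmony output).
# Assumption: columns are named paste0(key, index), as Seurat usually does for DimReduc objects.
set.seed(42)
dims <- 1:5
key <- "harmony_"
cells <- paste0("cell", 1:4)  # hypothetical cell names

embeddings <- matrix(
  rnorm(length(cells) * length(dims)),
  nrow = length(cells),
  dimnames = list(cells, paste0(key, seq_along(dims)))
)

print(dim(embeddings))       # cells x length(dims)
print(colnames(embeddings))
```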
Note
This function requires the harmony package to be installed.
References
Korsunsky, I., Millard, N., Fan, J., Slowikowski, K., Zhang, F., Wei, K., Baglaenko, Y., Brenner, M., Loh, P. & Raychaudhuri, S. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat Methods 16, 1289–1296 (2019).
Examples
if (FALSE) { # \dontrun{
# Load Seurat for the preprocessing functions used below
library(Seurat)
# Preprocessing
obj <- UpdateSeuratObject(SeuratData::LoadData("pbmcsca"))
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$Method)
obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj)
obj <- ScaleData(obj)
obj <- RunPCA(obj)
# After preprocessing, we integrate layers based on the "Method" variable:
obj <- IntegrateLayers(object = obj, method = SeuratIntegrate::HarmonyIntegration,
                       verbose = TRUE)
# We can also change parameters such as the batch-effect variable.
# Here we change the groups variable, the number of dimensions used from the original
# PCA and minor options from `harmony_options()`:
harmonyOptions <- harmony::harmony_options()
harmonyOptions$max.iter.cluster <- 10 # 20 by default
harmonyOptions$block.size <- .1 # .05 by default
obj <- IntegrateLayers(object = obj, method = SeuratIntegrate::HarmonyIntegration,
                       dims = 1:30, plot_convergence = TRUE,
                       groups = obj[[]]$Experiment,
                       new.reduction = "harmony_custom",
                       .options = harmonyOptions, verbose = TRUE)
} # }
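The groups and groups.name arguments can also be pictured without any Seurat object. The base R sketch below builds a small data frame with cells as row names and mimics the documented rule that the first column is used when groups.name = NULL. The cell names, the columns `batch` and `donor`, and the variable `column_used` are hypothetical, and the selection line only mirrors the documented behaviour rather than reproducing the package's internal code.

```r
# Hypothetical grouping data frame: one row per cell, row names are cell names.
cells <- paste0("cell", 1:6)
groups <- data.frame(
  batch = rep(c("A", "B"), each = 3),
  donor = rep(c("d1", "d2", "d3"), times = 2),
  row.names = cells
)

# Mirrors the documented rule: with groups.name = NULL, the first column is used.
groups.name <- NULL
column_used <- if (is.null(groups.name)) colnames(groups)[1] else groups.name

print(column_used)
print(table(groups[[column_used]]))
```

Setting groups.name to "donor" in this sketch would select the second column instead.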