
A wrapper to run Harmony on multi-layered Seurat V5 objects

Can be called via SeuratIntegrate::HarmonyIntegration() or HarmonyIntegration.fix()

Usage

HarmonyIntegration(
  object,
  orig,
  groups = NULL,
  groups.name = NULL,
  layers = NULL,
  scale.layer = "scale.data",
  features = NULL,
  new.reduction = "harmony",
  dims = NULL,
  key = "harmony_",
  seed.use = 42L,
  theta = NULL,
  sigma = 0.1,
  lambda = NULL,
  nclust = NULL,
  ncores = 1L,
  max_iter = 10,
  early_stop = TRUE,
  plot_convergence = FALSE,
  .options = harmony_options(),
  verbose = TRUE,
  ...
)

HarmonyIntegration.fix(...)

Arguments

object

A Seurat object (or an Assay5 object if not called by IntegrateLayers)

orig

DimReduc object. Do not set it directly when calling through IntegrateLayers; use the orig.reduction argument instead.

groups

A named data frame with grouping information. Preferably one-column when groups.name = NULL

groups.name

Column name from groups data frame that stores grouping information. If groups.name = NULL, the first column is used

layers

Ignored unless groups = NULL, in which case the layers are used to create the grouping variable for batch-effect correction.

scale.layer

Ignored

features

Ignored

new.reduction

Name of the new integrated dimensional reduction

dims

Dimensions of dimensional reduction to use for integration. All used by default

key

Prefix for the dimension names computed by harmony.

seed.use

An integer to generate reproducible outputs. Set seed.use = NULL to disable

theta

Diversity clustering penalty parameter. Specify for each variable in vars_use. Default theta=2. theta=0 does not encourage any diversity. Larger values of theta result in more diverse clusters.

sigma

Width of soft k-means clusters. Default sigma=0.1. Sigma scales the distance from a cell to cluster centroids. Larger values of sigma result in cells being assigned to more clusters. Smaller values of sigma make soft k-means clustering approach hard clustering.

lambda

Ridge regression penalty. Default lambda=1. Larger values protect against overcorrection. If several covariates are specified, lambda can also be a vector whose length equals the number of variables to be corrected. In this scenario, each covariate level group will be assigned the scalars specified by the user. If set to NULL, harmony will start lambda estimation mode to determine lambdas automatically and try to minimize overcorrection (use with caution; still in beta testing).

nclust

Number of clusters in model. nclust=1 equivalent to simple linear regression.

ncores

Number of processors to be used for math operations when an optimized BLAS is available. If the BLAS does not support multithreading, this option has no effect. By default, ncores=1, which runs as a single-threaded process. Although Harmony supports multiple cores, it is not optimized for multithreading. Increase this number for large datasets only if single-core performance is not adequate.

max_iter

Maximum number of rounds to run Harmony. One round of Harmony involves one clustering and one correction step.

early_stop

Enable early stopping for harmony. The harmonization process will stop when the change in the objective function between corrections drops below 1e-4.

plot_convergence

Whether to print the convergence plot of the clustering objective function. TRUE to plot, FALSE to suppress. This can be useful for debugging.

.options

Setting advanced parameters of RunHarmony. This must be the result from a call to `harmony_options`. See ?`harmony_options` for parameters not listed above and more details.

verbose

Print messages. Set to FALSE to disable

...

Ignored for HarmonyIntegration(), or all of the above for HarmonyIntegration.fix()
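
To make the effect of the sigma argument more concrete, here is a minimal base R sketch of a soft k-means assignment. It assumes assignment probabilities proportional to exp(-d2 / sigma), where d2 is a squared distance between a cell and a cluster centroid. The helper soft_assign() and the toy distances are hypothetical illustrations and are not part of harmony or SeuratIntegrate.

```r
# Hypothetical helper: soft cluster assignment from squared distances.
# Assumes probabilities proportional to exp(-d2 / sigma); this is an
# illustration, not harmony's actual implementation.
soft_assign <- function(d2, sigma) {
  w <- exp(-(d2 - min(d2)) / sigma)  # subtract the minimum for numerical stability
  w / sum(w)                         # normalise so the memberships sum to 1
}

# Toy squared distances from one cell to three cluster centroids (made up)
d2 <- c(0.10, 0.15, 0.60)

p_narrow  <- soft_assign(d2, sigma = 0.01)  # small sigma: close to a hard assignment
p_default <- soft_assign(d2, sigma = 0.1)   # the documented default
p_wide    <- soft_assign(d2, sigma = 1)     # large sigma: membership spread out

round(rbind(p_narrow, p_default, p_wide), 3)
```

In this toy setting, the largest membership shrinks as sigma grows, which matches the description of sigma above: smaller values approach hard clustering, larger values spread each cell over more clusters.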

Value

The function itself returns a list containing:

  • a new DimReduc named after new.reduction (key set to key), holding the corrected cell embeddings matrix with length(dims) columns.

When called via IntegrateLayers, a Seurat object with the new reduction is returned
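
As a sketch of how the returned reduction might be used downstream, the hypothetical helper below clusters and embeds cells from an integrated reduction with standard Seurat functions. It assumes the object was integrated through IntegrateLayers with new.reduction = "harmony"; the helper name cluster_on_reduction() is made up for illustration.

```r
# Hypothetical helper: cluster and embed cells from an integrated reduction.
# Assumes `obj` is a Seurat object that already holds the named reduction.
cluster_on_reduction <- function(obj, reduction = "harmony") {
  # Use every dimension stored in the integrated reduction
  n_dims <- ncol(Seurat::Embeddings(obj, reduction = reduction))
  obj <- Seurat::FindNeighbors(obj, reduction = reduction, dims = 1:n_dims)
  obj <- Seurat::FindClusters(obj)
  Seurat::RunUMAP(obj, reduction = reduction, dims = 1:n_dims)
}

# Usage, assuming `obj` was integrated as in the Examples section:
# obj <- cluster_on_reduction(obj, reduction = "harmony")
# Seurat::DimPlot(obj, group.by = "Method")
```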

Note

This function requires the harmony package to be installed
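
A quick way to check this requirement before running the integration is sketched below. It assumes that installing harmony from CRAN suits your setup.

```r
# Check that harmony is installed before calling HarmonyIntegration()
has_harmony <- requireNamespace("harmony", quietly = TRUE)
if (!has_harmony) {
  message('harmony is not installed; try install.packages("harmony")')
}
```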

References

Korsunsky, I., Millard, N., Fan, J., Slowikowski, K., Zhang, F., Wei, K., Baglaenko, Y., Brenner, M., Loh, P. & Raychaudhuri, S. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat Methods 16, 1289–1296 (2019).

Examples

if (FALSE) { # \dontrun{
# Preprocessing
obj <- UpdateSeuratObject(SeuratData::LoadData("pbmcsca"))
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$Method)
obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj)
obj <- ScaleData(obj)
obj <- RunPCA(obj)

# After preprocessing, we integrate layers based on the "Method" variable:
obj <- IntegrateLayers(object = obj, method = SeuratIntegrate::HarmonyIntegration,
                       verbose = TRUE)

# We can also change parameters such as the batch-effect variable.
# Here we change the groups variable, the number of dimensions used from the original
# PCA, and minor options from `harmony_options()`:
harmonyOptions <- harmony::harmony_options()
harmonyOptions$max.iter.cluster <- 10   #  20 by default
harmonyOptions$block.size <- .1         # .05 by default
obj <- IntegrateLayers(object = obj, method = SeuratIntegrate::HarmonyIntegration,
                       dims = 1:30, plot_convergence = TRUE,
                       groups = obj[[]], groups.name = "Experiment",
                       new.reduction = "harmony_custom",
                       .options = harmonyOptions, verbose = TRUE)
} # }
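
The groups and groups.name arguments are documented as a data frame plus a column name. The base R sketch below shows two ways such a data frame might be prepared. The toy metadata stands in for the cell-level metadata returned by obj[[]]; the cell names and values are made up, and keeping cell names as row names is an assumption rather than a documented requirement.

```r
# Toy metadata standing in for obj[[]]; cell names and values are made up
meta <- data.frame(
  Method = c("10x", "10x", "Smart-seq2", "Smart-seq2"),
  Experiment = c("pbmc1", "pbmc2", "pbmc1", "pbmc2"),
  row.names = paste0("cell_", 1:4)
)

# Option 1: a one-column data frame, so groups.name can stay NULL
# (drop = FALSE keeps the data frame structure and the row names)
groups_one <- meta[, "Experiment", drop = FALSE]

# Option 2: a multi-column data frame, with the column chosen via groups.name
groups_multi <- meta
groups_name <- "Experiment"

str(groups_one)
```

These objects would then be passed as groups = groups_one, or as groups = groups_multi together with groups.name = groups_name.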