Install and load the data
We will use the pbmcsca
dataset available from the
package SeuratData
.
# install `SeuratData` package (if not yet)
if (! requireNamespace("SeuratData", quietly = TRUE)) {
devtools::install_github('satijalab/seurat-data')
}
# increase download timeout and install the dataset
options(timeout = 300)
SeuratData::InstallData('pbmcsca')
# load the dataset (take 1,000 first cells to speed-up execution)
seu <- SeuratData::LoadData('pbmcsca')[,1:1e3]
## Warning: Assay RNA changing from Assay to Assay
## Warning: Assay RNA changing from Assay to Assay5
Inspect the dataset
Have a look at the metadata:
# rmarkdown::paged_table -> prints data frames in a less ugly way than default
rmarkdown::paged_table(head(seu[[]], n = 10))
The column Method
provides information about the
single-cell technology used to acquire each cell’s trancriptome. We
consider it as the batch of origin.
table(seu$Method)
##
## 10x Chromium (v2) A CEL-Seq2 Smart-seq2
## 494 253 253
The column CellType
contains cell-types annotation that
we will use later on.
rmarkdown::paged_table(as.data.frame(table(seu$CellType)))
Let’s now define the batch variable and the cell-type variable(s):
batch.var <- 'Method' # available in the metadata of the object
cell.var <- 'CellType' # available in the metadata of the object
Seurat data processing - Standard workflow
The current Seurat object is not split, meaning that each layer contains cells from all batches.
## Layers before split: counts, data
We select the column Method
to separate the cells into
batch-specific layers. This step is indispensable to run integration
methods and enables to normalise each batch (i.e. layer)
independently.
seu[['RNA']] <- split(seu[['RNA']], f = seu$Method)
cat('Layers after split:', paste(Layers(seu), collapse = ", "), '\n')
## Layers after split: counts.Smart-seq2, counts.CEL-Seq2, counts.10x_Chromium_v2_A, data.Smart-seq2, data.CEL-Seq2, data.10x_Chromium_v2_A
Then, we proceed to the standard Seurat
workflow until
we obtain the PCA reduction.
seu <- SCTransform(seu)
seu <- RunPCA(seu, verbose = F)
seu <- FindNeighbors(seu, reduction = "pca", dims = 1:30, k.param = 20L)
We now further process the object until we can visualise the dispersion of cells on a UMAP dimension reduction.
seu <- FindClusters(seu, graph.name = 'SCT_snn', resolution = .5)
seu <- RunUMAP(seu, dims = 1:30, reduction = 'pca')
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
##
## Number of nodes: 1000
## Number of edges: 43767
##
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.8715
## Number of communities: 8
## Elapsed time: 0 seconds
DimPlot(seu, label = T) + NoLegend() + ggplot2::coord_fixed(ratio = .7)
Let’s colour cells according to batch and cell-type label.
DimPlot(seu, group.by = batch.var) +
DimPlot(seu, group.by = cell.var) & ggplot2::coord_fixed(ratio = .7)
Albeit moderately, the dispersion of cells on the UMAP seems to be influenced by a batch effect. According to the “CellType” variable, B cells from 10x and CEL-Seq2 are not gathered together and monocyte cells do not fully overlap.
Now, we want to correct these technical differences.
Integrate batches - SeuratIntegrate workflow
The integration commands are to some extent similar to the Seurat
V5 vignette. The purpose of this package is to extend the set of
available integration methods. See the bottom of
?IntegrateLayers
to have a comprehensive list of relevant
methods.
Many methods supported by SeuratIntegrate are implemented in Python,
and the wrappers provided rely on the reticulate
package
and conda
environments. If you are not familiar with the
CondaEnvManager
, have a look at the
vignette("setup_and_tips")
.
Here are some important considerations before performing an integration:
- use the interface
DoIntegrate()
from SeuratIntegrate to integrate the cell batches (see the next section) - check the method-specific arguments
(e.g.
?bbknnIntegration
) - set up conda environments to use Python-based methods:
- with the
CondaEnvManager
viaUpdateEnvCache()
(check thevignette("setup_and_tips")
) - with a custom conda environment by passing its path - or its name -
to the
conda_env
parameter. This overrides the default behaviour by loading the specified environment instead of fetching theCondaEnvManager
’s cache.
- with the
Some methods expect a raw count matrix, others expect the scaled counts of the variable features, etc. To help you with the choice, look at the table below
Method | Layer | Features |
---|---|---|
ComBat | data | any but better with restricted number of features (e.g. variable) |
Harmony | N/A (PCA reduction) | N/A (PCA reduction) |
MNN | data | any but better with restricted number of features (e.g. variable) |
bbknn | N/A (PCA reduction) | N/A (PCA reduction) |
scVI / scANVI | counts | all |
Scanorama | counts or data | any |
trVAE |
data (recon loss “mse”) counts (otherwise) |
all |
Layers:
- counts: raw counts
- data: normalised counts
- scale.data: scaled normalised counts of variable features
/!\ IMPORTANT /!\ To use
all features when calling an integration method with
IntegrateLayers()
:
IntegrateLayers(object, features = Features(object), scale.layer = NULL)
.
Does not work for a SCTAssay
.
DoIntegrate philosophy (#do_integrate)
The function DoIntegrate()
is a very handy way to run
multiple integrations in a single command. Note that:
-
...
is the place to specify the integration commands to run. - integration commands are expected to be function
calls, i.e. of the form
FooIntegration()
,BarIntegration()
, etc. - Calls accept method-specific argument (e.g.
FooIntegration(layers = "data")
) -
use.hvg = TRUE
results in using variable features -
use.future = TRUE
is useful to run Python-based methods (a normal R session cannot load more than one conda instance, whilefuture
enables to launch background sessions, preventing the main on to load any conda environment.). It is highly recommended to set it toFALSE
for R-based methods.
Most integration methods can be used without modifying default parameters. In this vignette, we will change some arguments to meet our needs. Notably, we will change the number of cores allocated to each method (where possible).
ncores <- parallel::detectCores() - 2
In this vignette, we are going to use 3 Python-based methods, namely BBKNN, Scanorama and scANVI from the scvi-tools suite. Let’s make sure they are available straight away:
# BBKNN
if (! isValid(getCache()$bbknn)) {
UpdateEnvCache("bbknn")
}
# Scanorama
if (! isValid(getCache()$scanorama)) {
UpdateEnvCache("scanorama")
}
# scvi-tools
if (! isValid(getCache()$scanvi)) {
UpdateEnvCache("scanvi")
}
Let’s proceed to a few batch-effect corrections:
Integration part 1) Using R-based methods
seu <- DoIntegrate(
object = seu,
SeuratIntegrate::HarmonyIntegration(orig = "pca", dims = 1:30,
ncores = ncores),
CCAIntegration(orig = "pca", dims = 1:30 , new.reduction = "cca.integrated",
normalization.method = "SCT"),
RPCAIntegration(orig = "pca", dims = 1:30, new.reduction = "rpca.integrated",
normalization.method = "SCT"),
use.future = FALSE # R-based
)
## Integration 1 in 3: integrating using 'SeuratIntegrate::HarmonyIntegration'
## Integration 2 in 3: integrating using 'CCAIntegration'
## Integration 3 in 3: integrating using 'RPCAIntegration'
Note: We use SeuratIntegrate::
before
HarmonyIntegration
to avoid any confusion with
Seurat::HarmonyIntegration()
.
Integration part 2) Using python-based methods
seu <- DoIntegrate(
object = seu,
bbknnIntegration(orig = "pca", layers = "data", ndims = 30),
ScanoramaIntegration(orig = "pca", ncores = ncores),
scANVIIntegration(groups = seu[[]], groups.name = "Method",
labels.name = "CellType", layers = "counts",
torch.intraop.threads = ncores,
torch.interop.threads = ncores,
max_epochs = 20L),
use.future = TRUE, # Python-based
use.hvg = c(TRUE, TRUE, FALSE)
)
## Integration 1 in 3: integrating using 'bbknnIntegration'
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
##
## Number of nodes: 1000
## Number of edges: 6866
##
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.6812
## Number of communities: 10
## Elapsed time: 0 seconds
## Integration 2 in 3: integrating using 'ScanoramaIntegration'
## Integration 3 in 3: integrating using 'scANVIIntegration'
Note: set
max_epochs = 20L
forscANVIIntegration
is only to save time ! The default number of epochs (400) results in a superior integration.
If we take a look at our Seurat object, we can note that it has been enriched with many objects:
print(seu)
## An object of class Seurat
## 69594 features across 1000 samples within 4 assays
## Active assay: SCT (16450 features, 3000 variable features)
## 3 layers present: counts, data, scale.data
## 3 other assays present: RNA, bbknn.ridge, scanorama.reconstructed
## 8 dimensional reductions calculated: pca, umap, harmony, cca.integrated, rpca.integrated, pca.bbknn, integrated.scanorama, integrated.scANVI
cat("Graph objects:", paste(Graphs(seu), collapse = ", "), "\n")
cat("Neighbor objects:", paste(Neighbors(seu), collapse = ", "), "\n")
cat("Reduction dimensions:", paste(Reductions(seu), collapse = ", "), "\n")
cat("Assays:", paste(Assays(seu), collapse = ", "), "\n")
## Graph objects: SCT_nn, SCT_snn, bbknn_scale.data_connectivities, bbknn_scale.data_distances, bbknn_ridge.residuals_connectivities, bbknn_ridge.residuals_distances
## Neighbor objects:
## Reduction dimensions: pca, umap, harmony, cca.integrated, rpca.integrated, pca.bbknn, integrated.scanorama, integrated.scANVI
## Assays: RNA, SCT, bbknn.ridge, scanorama.reconstructed
Great! We have successfully performed several integrations! However, stopping here would be unsatisfactory because we still need to process each integration’s output(s) to obtain at least one UMAP projection for each. Here, we will also aim at generating assessable representations to score.
Score integrations
Process outputs
Several objects can be produced by each integration algorithm, namely a layer in a new assay (i.e. corrected counts), a dimension reduction (corrected embedding), or a knn network. Some even produce more than one output (for instance Scanorama produces corrected counts and a dimension reduction).
The type of output is important to consider, because scoring metrics are not compatible with all output types. The simplest strategy is to process each output separately in order to obtain at least a PCA out of it, or even a knn graph (essential to compute clusters). Note that most scores cannot be computed on knn graphs, hence knn graph outputs (e.g. BBKNN) can only be evaluated by a reduced set of metrics.
Below is a summary of post-processing steps for each output type (bracketed steps are not always necessary):
- corrected counts: [
ScaleData()
] ->RunPCA()
-> [FindNeighbors()
->FindOptimalClusters()
] - dimension reduction: [
RunPCA()
] -> [FindNeighbors()
->FindOptimalClusters()
] - knn graph: [
FindOptimalClusters()
]
RunPCA()
is sometimes run even on dimension reduction
objects (within scoring functions) because some scores require a
variance associated with each dimension.
Let’s process all the outputs. Here, we will go through all the steps
for a more exhaustive demonstration. However, it is to be noted that
skipping the final step FindOptimalClusters()
makes the
neighbour graph computation step (FindNeighbors()
)
unnecessary. In such a case however, one will forgo two scoring metrics,
namely ARI and NMI.
Here, we will use SymmetrizeKnn()
between
FindNeighbors()
and FindOptimalClusters()
because we set return.neighbor = TRUE
in
FindNeighbors()
. This is useful to keep the distances
between cells in the KNN
graph rather than what
FindNeighbors()
does by default, which is converting the
KNN
graph to an adjacency matrix with 0/1s and to a
SNN
network with values bounded between 0 and 1. This not
compulsory, but this is used to stay in line with BBKNN’s output. To
prevent the community detection algorithm to output a high fraction of
singletons, we “symmetrize” the matrix which makes the graph
“undirected”.
# corrected counts outputs
DefaultAssay(seu) <- "scanorama.reconstructed"
seu <- ScaleData(seu)
seu <- RunPCA(seu, npcs = 50L, reduction.name = "pca.scanorama_counts")
seu <- FindNeighbors(seu, reduction = "pca.scanorama_counts", dims = 1:30, return.neighbor = TRUE,
graph.name = "knn.scanorama_counts")
seu <- SymmetrizeKnn(seu, graph.name = "knn.scanorama_counts")
seu <- FindOptimalClusters(seu, graph.name = "knn.scanorama_counts_symmetric",
cluster.name = "scanorama_counts_{cell.var}_{metric}",
cell.var = cell.var,
optimisation.metric = c("nmi", "ari")) # default, compute both
# dimension reduction outputs
DefaultAssay(seu) <- "SCT"
seu <- FindNeighbors(seu, reduction = "pca", dims = 1:30, k.param = 20L,
return.neighbor = TRUE, graph.name = "knn.unintegrated")
seu <- SymmetrizeKnn(seu, graph.name = "knn.unintegrated")
seu <- FindOptimalClusters(seu, graph.name = "knn.unintegrated_symmetric",
cluster.name = "unintegrated_{cell.var}_{metric}",
cell.var = cell.var)
seu <- FindNeighbors(seu, reduction = "integrated.scanorama", dims = 1:30,
return.neighbor = TRUE, graph.name = "knn.scanorama_reduction")
seu <- SymmetrizeKnn(seu, graph.name = "knn.scanorama_reduction")
seu <- FindOptimalClusters(seu, graph.name = "knn.scanorama_reduction_symmetric",
cluster.name = "scanorama_reduction_{cell.var}_{metric}",
cell.var = cell.var)
seu <- FindNeighbors(seu, reduction = "harmony", dims = 1:30,
return.neighbor = TRUE, graph.name = "knn.harmony")
seu <- SymmetrizeKnn(seu, graph.name = "knn.harmony")
seu <- FindOptimalClusters(seu, graph.name = "knn.harmony_symmetric", cell.var = cell.var,
cluster.name = "harmony_{cell.var}_{metric}")
seu <- FindNeighbors(seu, reduction = "cca.integrated", dims = 1:30,
return.neighbor = TRUE, graph.name = "knn.cca")
seu <- SymmetrizeKnn(seu, graph.name = "knn.cca")
seu <- FindOptimalClusters(seu, graph.name = "knn.cca_symmetric", cell.var = cell.var,
cluster.name = "cca_{cell.var}_{metric}")
seu <- FindNeighbors(seu, reduction = "rpca.integrated", dims = 1:30,
return.neighbor = TRUE, graph.name = "knn.rpca")
seu <- SymmetrizeKnn(seu, graph.name = "knn.rpca")
seu <- FindOptimalClusters(seu, graph.name = "knn.rpca_symmetric", cell.var = cell.var,
cluster.name = "rpca_{cell.var}_{metric}")
seu <- FindNeighbors(seu, reduction = "integrated.scANVI",
return.neighbor = TRUE, graph.name = "knn.scanvi")
seu <- SymmetrizeKnn(seu, graph.name = "knn.scanvi")
seu <- FindOptimalClusters(seu, graph.name = "knn.scanvi_symmetric", cell.var = cell.var,
cluster.name = "scanvi_{cell.var}_{metric}")
# graph outputs
seu <- SymmetrizeKnn(seu, graph.name = "bbknn_ridge.residuals_distances")
seu <- FindOptimalClusters(seu, graph.name = "bbknn_ridge.residuals_distances_symmetric",
cell.var = cell.var, cluster.name = "bbknn_{cell.var}_{metric}")
Note: Instead of sticking with the default
FindNeighbors(return.neighbors = FALSE)
at the beginning, we could have switched it toTRUE
right away, process theKNN
graph withSymmetrizeKnn()
and use it for subsequent steps (umap, clustering, etc.)
FindOptimalClusters()
adds metadata columns with the
clustering results that maximized a metric (NMI or ARI) to the Seurat
object:
rmarkdown::paged_table(seu[[]][1:10, grep("CellType_[arinm]{3}$", colnames(seu[[]]))])
Add scores to the Seurat object
Now that we have computed objects ready to be scored, we will proceed
with assessing each integration output. For this step, one can either
use Score[score_name]()
and save scores in separate
variables or use AddScore[score_name]()
to directly store
scores within the Seurat object. The latter is far more convenient and
allows to compare scores graphically. This is the strategy we are going
to adopt here.
Last but not least, a cell-type variable is used in most scores, hence it is highly recommended to have an estimate of each cell’s type (or produce such an estimate) stored in the Seurat object as a column in the metadata.
Note that if one doesn’t have a variable with cell-labels (used as ground truth in multiple scores) he will have to produce it or do without several scores. Alternatively, one can use automatic cell annotation algorithms (e.g. the Azimuth package). One can also have multiple cell label variables (e.g. Azimuth typically returns as many cell label variables as levels of annotations contained in the reference). Scores requiring cell type annotations always accept multiple column names.
There is a risk of confusion between cell annotations when using automatic cell annotation tools. Furthermore, in the case of using Azimuth to annotate cells, there is a specific risk of biasing results by favouring RPCA integration because Azimuth uses RPCA to integrate the query dataset onto the reference.
First of all, let’s organise outputs in lists:
reductions <- list(
unintegrated = "pca",
scanorama_counts = "pca.scanorama_counts",
scanorama_reduction = "integrated.scanorama",
harmony = "harmony",
cca = "cca.integrated",
rpca = "rpca.integrated",
scanvi = "integrated.scANVI",
bbknn = NULL
)
graphs <- list(
unintegrated = "knn.unintegrated_symmetric",
scanorama_counts = "knn.scanorama_counts_symmetric",
scanorama_reduction = "knn.scanorama_reduction_symmetric",
harmony = "knn.harmony_symmetric",
cca = "knn.cca_symmetric",
rpca = "knn.rpca_symmetric",
scanvi = "knn.scanvi_symmetric",
bbknn = "bbknn_ridge.residuals_distances_symmetric"
)
integrations <- names(reductions)
Let’s finalise our preparations: :
- Make sure you have the kBET package installed to
run
ScoreKBET()
- To benefit from a faster implementation of the lisi algorithm, install the lisi package
- To benefit from a faster implementation of the euclidean distance computation, install the distances package
- To run
ScoreRegressPC.CellCycle()
(score of cell cycle conservation), runCellCycleScoringPerBatch()
beforehand (next chunk)
seu <- CellCycleScoringPerBatch(seu, batch.var = batch.var,
s.features = cc.genes$s.genes,
g2m.features = cc.genes$g2m.genes)
Let’s now loop through the integration outputs:
for (integ in integrations) {
reduc <- reductions[[integ]]
graph <- graphs[[integ]]
clust.var.ari <- paste(integ, cell.var, "ari", sep = "_") # produced by `FindOptimalClusters()`
clust.var.nmi <- paste(integ, cell.var, "nmi", sep = "_") # produced by `FindOptimalClusters()`
if (! is.null(reduc)) { # all TRUE except bbknn
seu <- AddScoreASW(seu, integration = integ, cell.var = cell.var,
what = reduc, dist.package = "distances")
seu <- AddScoreASWBatch(seu, integration = integ, batch.var = batch.var,
cell.var = cell.var, what = reduc,
dist.package = "distances")
seu <- AddScoreDensityPC(seu, integration = integ, batch.var = batch.var,
reduction = reduc)
seu <- AddScoreRegressPC(seu, integration = integ, batch.var = batch.var,
reduction = reduc)
seu <- AddScoreRegressPC.CellCycle(seu, integration = integ,
batch.var = batch.var, what = reduc,
compute.cc = FALSE) # because CellCycleScoringPerBatch was ran before
}
seu <- AddScoreARI(seu, integration = integ, cell.var = cell.var,
clust.var = clust.var.ari)
seu <- AddScoreNMI(seu, integration = integ, cell.var = cell.var,
clust.var = clust.var.nmi)
seu <- AddScoreConnectivity(seu, integration = integ, graph.name = graph,
cell.var = cell.var)
seu <- AddScoreKBET(seu, integration = integ, batch.var = batch.var,
cell.var = cell.var, what = reduc %||% sub("_symmetric$", "", graph),
graph.type = "distances", verbose = FALSE)
seu <- AddScoreLISI(seu, integration = integ, batch.var = batch.var,
cell.var = cell.var, reduction = reduc,
graph.name = if (is.null(reduc)) sub("_symmetric$", "", graph) else NULL,
graph.type = "distances", ncores = ncores)
}
Now that we have computed multiple scores, we can look at them using
IntegrationScores()
:
rmarkdown::paged_table(IntegrationScores(seu))
Compare the integrations
Note that those scores are raw. Let’s scale them (and make them comparable):
seu <- ScaleScores(seu)
Once again, we can print them:
rmarkdown::paged_table(IntegrationScores(seu, scaled = TRUE))
To readily compare the integrations, let’s plot the scores:
PlotScores(seu)
One might notice a difference in scale for some of the scores, if comparing the plot with the table just before. This is the case for the PCA density score for instance.
print(sort(IntegrationScores(seu, scaled = TRUE)$PCA.density))
## [1] 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.05713768 0.12268377
Indeed, PlotScores()
rescales the scores using min-max
normalisation by default (rescale = TRUE
). One might chose
to disable it:
PlotScores(seu, rescale = FALSE)
We notice that PCA based methods output very low scores in this case. Since they cannot be computed on knn graphs, scores are biased in favour of BBKNN. You can exclude some scores (and recompute overall scores on the fly)
PlotScores(seu, rescale = FALSE, exclude.score = c("pca.density", "pca.regression"))
You can chose a different type of plot (radar
or
lollipop
):
library(ggplot2)
PlotScores(seu, plot.type = "radar") +
# reduce overlap between axis names and figures
theme(legend.position = "top", panel.spacing = unit(3, "cm"),
plot.margin = margin(r = 3, l = 3, unit = "cm"),
axis.text.x = element_text(size = 10))
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_point()`).
In this last plot, we also exclude the non-integrated dataset. Since
the rescale
argument is true by default, the scores are
rescaled without the excluded integration’s scores.
PlotScores(seu, plot.type = "lollipop",
exclude.integration = "unintegrated")
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_errorbarh()`).
We want to compare the UMAP embeddings. For this, we first compute the dimension reductions:
for (integ in integrations) {
reduc <- reductions[[integ]]
if (! is.null(reduc)) { # all except BBKNN
seu <- RunUMAP(seu, dims = 1:min(30, ncol(seu[[reduc]])), reduction = reduc,
reduction.name = paste0(integ, ".umap"))
}
}
As BBKNN’s output is a graph, we need to use the umap Python package:
if (! reticulate::condaenv_exists('umap_0.5.4')) {
reticulate::conda_create('umap_0.5.4', packages = 'umap=0.5.4')
}
library(future)
plan(multisession)
seu %<-% {
reticulate::use_condaenv('umap_0.5.4')
RunUMAP(seu, graph = "bbknn_ridge.residuals_connectivities", umap.method = "umap-learn",
n.epochs = 200L, reduction.name = "bbknn.umap") }
seu
plan(sequential)
library(ggplot2)
plot_list <- sapply(integrations, function(integ) {
DimPlot(seu, reduction = paste0(integ, ".umap"), group.by = batch.var) +
ggtitle(integ) +
theme(axis.title = element_blank())
}, simplify = FALSE)
patchwork::wrap_plots(plot_list, guides = "collect")
library(ggplot2)
plot_list <- sapply(integrations, function(integ) {
DimPlot(seu, reduction = paste0(integ, ".umap"), group.by = cell.var) +
ggtitle(integ) +
theme(axis.title = element_blank())
}, simplify = FALSE)
patchwork::wrap_plots(plot_list, guides = "collect")
There are several observations to be made here, that require further explanations:
- some cells seem to be assigned a wrong label in
CellType
, highlighting the importance of having cell annotations of sufficient quality to be considered suitable as ground truth (which is actually not the case here) - All scaled PCA regression scores are set to zero. This is because
the unintegrated dataset has the lowest raw PCA regression score. It is
very likely the consequence of the
SCT
normalisation, which is much more efficient at masking batch-specific differences than the classicalLogNorm
. Thus, the inter-batch differences are not driving the principal components. -
ScaleScores()
produce scores that can have different scales (as long asrescale = FALSE
) . Thus, min-max rescaling is used by default inPlotScores()
, to balance each score’s contribution to the overall scores. This is especially suited for comparing a large number of integrations. However, it has some drawbacks: it can heavily distort the scale of scores when their maximum or minimum are far from 1 or 0 respectively (e.g. all cLISI scores are above 0.9). Hence, the final decision on whether to use min-max rescaling is left to the user’s discretion.
Session info
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Linux Mint 21
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Europe/Paris
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] future_1.34.0 ggplot2_3.5.1 SeuratIntegrate_0.4.0
## [4] Seurat_5.1.0 SeuratObject_5.0.2 sp_2.1-4
##
## loaded via a namespace (and not attached):
## [1] fs_1.6.5 matrixStats_1.4.1
## [3] spatstat.sparse_3.1-0 httr_1.4.7
## [5] RColorBrewer_1.1-3 tools_4.4.2
## [7] sctransform_0.4.1 backports_1.5.0
## [9] R6_2.5.1 ResidualMatrix_1.14.1
## [11] lazyeval_0.2.2 uwot_0.2.2
## [13] mgcv_1.9-1 withr_3.0.2
## [15] gridExtra_2.3 progressr_0.15.1
## [17] cli_3.6.3 Biobase_2.64.0
## [19] textshaping_0.4.1 spatstat.explore_3.3-3
## [21] fastDummies_1.7.4 labeling_0.4.3
## [23] sass_0.4.9 spatstat.data_3.1-4
## [25] genefilter_1.86.0 ggridges_0.5.6
## [27] pbapply_1.7-2 pkgdown_2.1.1
## [29] systemfonts_1.1.0 hcabm40k.SeuratData_3.0.0
## [31] harmony_1.2.3 parallelly_1.41.0
## [33] limma_3.60.6 rstudioapi_0.17.1
## [35] RSQLite_2.3.9 FNN_1.1.4.1
## [37] generics_0.1.3 ica_1.0-3
## [39] spatstat.random_3.3-2 dplyr_1.1.4
## [41] Matrix_1.7-1 S4Vectors_0.42.1
## [43] abind_1.4-8 lifecycle_1.0.4
## [45] yaml_2.3.10 edgeR_4.2.2
## [47] SummarizedExperiment_1.34.0 SparseArray_1.4.8
## [49] Rtsne_0.17 glmGamPoi_1.16.0
## [51] grid_4.4.2 blob_1.2.4
## [53] ifnb.SeuratData_3.1.0 promises_1.3.2
## [55] crayon_1.5.3 miniUI_0.1.1.1
## [57] lattice_0.22-6 beachmat_2.20.0
## [59] cowplot_1.1.3 annotate_1.82.0
## [61] KEGGREST_1.44.1 pillar_1.10.0
## [63] knitr_1.49 GenomicRanges_1.56.2
## [65] kBET_0.99.6 future.apply_1.11.3
## [67] codetools_0.2-20 leiden_0.4.3.1
## [69] glue_1.8.0 spatstat.univar_3.1-1
## [71] data.table_1.16.4 vctrs_0.6.5
## [73] png_0.1-8 spam_2.11-0
## [75] gtable_0.3.6 cachem_1.1.0
## [77] xfun_0.49 S4Arrays_1.4.1
## [79] mime_0.12 survival_3.8-3
## [81] pbmcref.SeuratData_1.0.0 SingleCellExperiment_1.26.0
## [83] statmod_1.5.0 fitdistrplus_1.2-1
## [85] ROCR_1.0-11 nlme_3.1-166
## [87] bit64_4.5.2 RcppAnnoy_0.0.22
## [89] GenomeInfoDb_1.40.1 bslib_0.8.0
## [91] irlba_2.3.5.1 KernSmooth_2.23-24
## [93] colorspace_2.1-1 BiocGenerics_0.50.0
## [95] DBI_1.2.3 tidyselect_1.2.1
## [97] bit_4.5.0.1 compiler_4.4.2
## [99] BiocNeighbors_1.22.0 desc_1.4.3
## [101] DelayedArray_0.30.1 plotly_4.10.4
## [103] scales_1.3.0 lmtest_0.9-40
## [105] distances_0.1.11 rappdirs_0.3.3
## [107] stringr_1.5.1 digest_0.6.37
## [109] goftest_1.2-3 spatstat.utils_3.1-1
## [111] rmarkdown_2.29 XVector_0.44.0
## [113] RhpcBLASctl_0.23-42 htmltools_0.5.8.1
## [115] pkgconfig_2.0.3 sparseMatrixStats_1.16.0
## [117] MatrixGenerics_1.16.0 fastmap_1.2.0
## [119] rlang_1.1.4 htmlwidgets_1.6.4
## [121] UCSC.utils_1.0.0 shiny_1.10.0
## [123] DelayedMatrixStats_1.26.0 farver_2.1.2
## [125] jquerylib_0.1.4 zoo_1.8-12
## [127] jsonlite_1.8.9 BiocParallel_1.38.0
## [129] BiocSingular_1.20.0 magrittr_2.0.3
## [131] scuttle_1.14.0 GenomeInfoDbData_1.2.12
## [133] dotCall64_1.2 patchwork_1.3.0
## [135] pbmcsca.SeuratData_3.0.0 munsell_0.5.1
## [137] Rcpp_1.0.13-1 reticulate_1.40.0
## [139] stringi_1.8.4 zlibbioc_1.50.0
## [141] MASS_7.3-61 plyr_1.8.9
## [143] parallel_4.4.2 listenv_0.9.1
## [145] ggrepel_0.9.6 forcats_1.0.0
## [147] deldir_2.0-4 Biostrings_2.72.1
## [149] splines_4.4.2 tensor_1.5
## [151] locfit_1.5-9.10 lisi_1.0
## [153] bonemarrowref.SeuratData_1.0.0 igraph_2.1.2
## [155] spatstat.geom_3.3-4 heartref.SeuratData_1.0.0
## [157] RcppHNSW_0.6.0 reshape2_1.4.4
## [159] stats4_4.4.2 ScaledMatrix_1.12.0
## [161] XML_3.99-0.17 evaluate_1.0.1
## [163] tweenr_2.0.3 httpuv_1.6.15
## [165] batchelor_1.20.0 bmcite.SeuratData_0.3.0
## [167] RANN_2.6.2 tidyr_1.3.1
## [169] purrr_1.0.2 polyclip_1.10-7
## [171] SeuratData_0.2.2.9001 scattermore_1.2
## [173] ggforce_0.4.2 rsvd_1.0.5
## [175] broom_1.0.7 xtable_1.8-4
## [177] RSpectra_0.16-2 later_1.4.1
## [179] viridisLite_0.4.2 ragg_1.3.3
## [181] tibble_3.2.1 memoise_2.0.1
## [183] AnnotationDbi_1.66.0 IRanges_2.38.1
## [185] cluster_2.1.8 sva_3.52.0
## [187] globals_0.16.3