% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/integration.R
\name{TransferData}
\alias{TransferData}
\title{Transfer data}
\usage{
TransferData(
  anchorset,
  refdata,
  reference = NULL,
  query = NULL,
  weight.reduction = "pcaproject",
  l2.norm = FALSE,
  dims = NULL,
  k.weight = 50,
  sd.weight = 1,
  eps = 0,
  n.trees = 50,
  verbose = TRUE,
  slot = "data",
  prediction.assay = FALSE,
  store.weights = TRUE
)
}
\arguments{
\item{anchorset}{An \code{\link{AnchorSet}} object generated by
\code{\link{FindTransferAnchors}}}

\item{refdata}{Data to transfer. This can be specified in one of two ways:
\itemize{
  \item{The reference data itself as either a vector where the names
  correspond to the reference cells, or a matrix, where the column names
  correspond to the reference cells.}
  \item{The name of the metadata field or assay from the reference object
  provided. This requires the reference parameter to be specified. If pulling
  assay data in this manner, it will pull the data from the data slot. To
  transfer data from other slots, please pull the data explicitly with
  \code{\link{GetAssayData}} and provide that matrix here.}
}}

\item{reference}{Reference object from which to pull data to transfer}

\item{query}{Query object into which the data will be transferred.}

\item{weight.reduction}{Dimensional reduction to use for weighting
anchors. Options are:
\itemize{
   \item{pcaproject: Use the projected PCA used for anchor building}
   \item{lsiproject: Use the projected LSI used for anchor building}
   \item{pca: Use an internal PCA on the query only}
   \item{cca: Use the CCA used for anchor building}
   \item{custom DimReduc: User provided \code{\link{DimReduc}} object
   computed on the query cells}
}}

\item{l2.norm}{Perform L2 normalization on the cell embeddings after
dimensional reduction}

\item{dims}{Set of dimensions to use in the anchor weighting procedure. If
NULL, the same dimensions that were used to find anchors will be used for
weighting.}

\item{k.weight}{Number of neighbors to consider when weighting anchors}

\item{sd.weight}{Controls the bandwidth of the Gaussian kernel for weighting}

\item{eps}{Error bound on the neighbor finding algorithm (from
\code{\link{RANN}})}

\item{n.trees}{More trees give higher precision when using annoy approximate
nearest neighbor search}

\item{verbose}{Print progress bars and output}

\item{slot}{Slot to store the imputed data. Must be either "data" (default)
or "counts"}

\item{prediction.assay}{Return an \code{Assay} object with the prediction
scores for each class stored in the \code{data} slot.}

\item{store.weights}{Optionally store the weights matrix used for predictions
in the returned query object.}
}
\value{
If \code{query} is not provided, for the categorical data in \code{refdata},
returns a data.frame with label predictions. If \code{refdata} is a matrix,
returns an Assay object where the imputed data has been stored in the
provided slot.

If \code{query} is provided, a modified query object is returned. For
categorical data in \code{refdata}, prediction scores are stored as Assays
(prediction.score.NAME), and two metadata fields are added: predicted.NAME,
which contains the class prediction, and predicted.NAME.score, which contains
the score for that predicted class. For continuous data, an Assay called NAME
is returned. NAME here corresponds to the name of the element in the
\code{refdata} list.
}
\description{
Transfer categorical or continuous data across single-cell datasets. For
transferring categorical information, pass a vector from the reference
dataset (e.g. \code{refdata = reference$celltype}). For transferring
continuous information, pass a matrix from the reference dataset (e.g.
\code{refdata = GetAssayData(reference[['RNA']])}).
}
\details{
The main steps of this procedure are outlined below. For a more detailed
description of the methodology, please see Stuart, Butler, et al., Cell 2019.
\doi{10.1016/j.cell.2019.05.031}; \doi{10.1101/460147}

For both label transfer and feature imputation, we first compute the weights
matrix.

\itemize{
  \item{Construct a weights matrix that defines the association between each
  query cell and each anchor. Each weight starts as 1 minus the ratio of the
  distance between the query cell and the anchor to the distance between the
  query cell and its \code{k.weight}th anchor; this quantity is then
  multiplied by the anchor score computed in
  \code{\link{FindTransferAnchors}}. We then apply a Gaussian
  kernel with a bandwidth defined by \code{sd.weight} and normalize across
  all \code{k.weight} anchors.}
}
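
As a sketch of the weighting step in symbols, following our reading of the
methods in the paper cited above (the notation is assumed here for
illustration and is not used by the function itself): for a query cell
\eqn{c}, its \eqn{i}th nearest anchor \eqn{a_i} with anchor score
\eqn{S_{a_i}}, \eqn{k} equal to \code{k.weight} and \eqn{sd} equal to
\code{sd.weight},

\deqn{D_{c,i} = \left(1 - \frac{dist(c, a_i)}{dist(c, a_k)}\right) S_{a_i}}{D[c,i] = (1 - dist(c, a[i]) / dist(c, a[k])) * S[a[i]]}

\deqn{\tilde{D}_{c,i} = 1 - e^{-D_{c,i} / (2 / sd)^2}}{D~[c,i] = 1 - exp(-D[c,i] / (2 / sd)^2)}

\deqn{W_{c,i} = \frac{\tilde{D}_{c,i}}{\sum_{j=1}^{k} \tilde{D}_{c,j}}}{W[c,i] = D~[c,i] / sum(D~[c,j], j = 1..k)}

The exact form of the kernel is taken from that paper and should be checked
against it before being relied upon.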

The main difference between label transfer (classification) and feature
imputation is what gets multiplied by the weights matrix. For label transfer,
we perform the following steps:

\itemize{
  \item{Create a binary classification matrix, the rows corresponding to each
  possible class and the columns corresponding to the anchors. If the
  reference cell in the anchor pair is a member of a certain class, that
  matrix entry is filled with a 1, otherwise 0.}
  \item{Multiply this classification matrix by the transpose of the weights
  matrix to compute a prediction score for each class for each cell in the
  query dataset.}
}
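
The label transfer steps above can be illustrated with a small base R sketch.
All object names and numbers below are hypothetical toy values, the weights
are assumed to be stored as query cells by anchors, and the code is not the
internal implementation:

\preformatted{
# Toy setup (made-up values): 3 anchors, 2 query cells, 2 classes.
# Class of the reference cell in each anchor pair.
anchor.labels <- c("B", "T", "T")
classes <- sort(unique(anchor.labels))

# Binary classification matrix: classes in rows, anchors in columns.
classification <- 1 * outer(classes, anchor.labels, "==")
rownames(classification) <- classes

# Toy weights matrix: query cells in rows, anchors in columns (assumed
# orientation); each row sums to 1.
weights <- rbind(q1 = c(0.7, 0.2, 0.1), q2 = c(0.1, 0.3, 0.6))

# classification multiplied by t(weights): one score per class and query cell.
scores <- tcrossprod(classification, weights)

# Predicted class for each query cell is the class with the highest score.
predicted <- rownames(scores)[apply(scores, 2, which.max)]
}

Because each row of the toy weights sums to 1, the scores for each query cell
also sum to 1 across classes.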

For feature imputation, we perform the following step:
\itemize{
  \item{Multiply the expression matrix for the reference anchor cells by the
  weights matrix. This returns a predicted expression matrix for the
  specified features for each cell in the query dataset.}
}
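
A similar base R sketch for feature imputation, again with hypothetical names
and made-up values rather than the internal implementation. The toy weights
are assumed to be stored as query cells by anchors, so they are transposed in
the product to make the dimensions conform:

\preformatted{
# Expression of 2 features in the reference cells of 3 anchors (made-up).
expression <- rbind(geneA = c(2, 0, 4), geneB = c(1, 3, 1))

# Toy weights matrix: query cells in rows, anchors in columns (assumed
# orientation); each row sums to 1.
weights <- rbind(q1 = c(0.5, 0.5, 0), q2 = c(0, 0.25, 0.75))

# expression multiplied by t(weights): features in rows, query cells in
# columns. Each imputed value is a weighted average over the anchors.
imputed <- tcrossprod(expression, weights)
}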
}
\examples{
\dontrun{
# to install the SeuratData package see https://github.com/satijalab/seurat-data
library(SeuratData)
data("pbmc3k")

# for demonstration, split the object into reference and query
pbmc.reference <- pbmc3k[, 1:1350]
pbmc.query <- pbmc3k[, 1351:2700]

# perform standard preprocessing on each object
pbmc.reference <- NormalizeData(pbmc.reference)
pbmc.reference <- FindVariableFeatures(pbmc.reference)
pbmc.reference <- ScaleData(pbmc.reference)

pbmc.query <- NormalizeData(pbmc.query)
pbmc.query <- FindVariableFeatures(pbmc.query)
pbmc.query <- ScaleData(pbmc.query)

# find anchors
anchors <- FindTransferAnchors(reference = pbmc.reference, query = pbmc.query)

# transfer labels
predictions <- TransferData(anchorset = anchors, refdata = pbmc.reference$seurat_annotations)
pbmc.query <- AddMetaData(object = pbmc.query, metadata = predictions)
}

}
\references{
Stuart T, Butler A, et al. Comprehensive Integration of
Single-Cell Data. Cell. 2019;177:1888-1902 \doi{10.1016/j.cell.2019.05.031}
}
\concept{integration}
