% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ApotcClonalNetwork.R
\name{getSharedClones}
\alias{getSharedClones}
\title{Compute a list of clonotypes that are shared between seurat clusters}
\usage{
getSharedClones(
  seurat_obj,
  reduction_base = "umap",
  clonecall = "strict",
  ...,
  extra_filter = NULL,
  alt_ident = NULL,
  run_id = NULL,
  top = NULL,
  top_per_cl = NULL,
  intop = NULL,
  intop_per_cl = NULL,
  publicity = c(2L, Inf)
)
}
\arguments{
\item{seurat_obj}{Seurat object with one or more dimension reductions that
has already been integrated with a TCR/BCR library via
\code{scRepertoire::combineExpression}.}

\item{reduction_base}{character. The seurat reduction to base the clonal
expansion plotting on. Defaults to \code{'umap'} but can be any reduction present
within the reductions slot of the input seurat object, including custom ones.
If \code{'pca'}, the cluster coordinates will be based on PC1 and PC2.
However, generally APackOfTheClones is used for displaying UMAP and
occasionally t-SNE versions to intuitively highlight clonal expansion.}

\item{clonecall}{character. The column name in the seurat object metadata to
use. See \code{scRepertoire} documentation for more information about this
parameter that is central to both packages.}

\item{...}{additional "subsetting" keyword arguments indicating the rows
corresponding to elements in the seurat object metadata that should be
filtered by. E.g., \code{seurat_clusters = c(1, 9, 10)} will filter the cells to
those in the \code{seurat_clusters} column with any of the values 1, 9, and 10.
Note that column names in the seurat object metadata must not
conflict with the names of this function's other arguments. \emph{\strong{MAJOR NOTE}} if any subsetting
keyword argument is a \emph{prefix} of any preceding argument name (e.g. a
column named \code{reduction} is a prefix of the \code{reduction_base} argument),
R will interpret it as the same argument unless \emph{both} arguments
are named. Additionally, this means any subsequent arguments \emph{must} be named.}

\item{extra_filter}{character. An additional string that should be formatted
\emph{exactly} like a statement one would pass into \link[dplyr:filter]{dplyr::filter} that does
\emph{additional} filtering to cells in the seurat object - on top of the other
keyword arguments - based on the metadata. This means that it will be
logically AND'ed with any keyword argument filters. This is a more flexible
alternative / addition to the filtering keyword arguments. For example, if
one wanted to filter by the length of the amino acid sequence of TCRs, one
could pass in something like \code{extra_filter = "nchar(CTaa) - 1 > 10"}. When
the filter involves character values, enclose them in single quotes.}

\item{alt_ident}{character. By default, cluster identity is assumed to be
whatever is in \code{Idents(seurat_obj)}, and clones will be grouped by the active
ident. However, \code{alt_ident} could be set as the name of some column in the
meta data of the seurat object to be grouped by. This column is meant to have
been a product of \code{Seurat::StashIdent} or manually added.}

\item{run_id}{character. This will be the ID associated with the data of a
run, and will be used by other important functions like \code{\link[=APOTCPlot]{APOTCPlot()}} and
\link{AdjustAPOTC}. Defaults to \code{NULL}, in which case the ID will be generated
in the following format:

\verb{reduction_base;clonecall;keyword_arguments;extra_filter}

where keyword_arguments and extra_filter are each replaced by an underscore
character if no input was given for the \code{...} and \code{extra_filter} parameters.}

\item{top}{integer or numeric in (0, 1) - if not null, filters the output
clones so that only the shared clonotypes whose counts are among the top
\code{top} count / proportion (for numeric in (0, 1) input) of shared clones are
kept. For cases where several clonotypes tie in size, which clonotype(s) are
kept is not guaranteed, but is deterministic as long as the other arguments
are identical.}

\item{top_per_cl}{integer or numeric in (0, 1) - if not null, filters the
output clones so that for each seurat cluster, only the clonotypes with the
top \code{top_per_cl} frequency/count are preserved when aggregating shared
clones, in the same way as for \code{top}. Note that if inputted in conjunction
with \code{top}, it will get the \emph{intersection} of the clonotypes filtered
each way. For cases where several clonotypes tie in size, which clonotype(s)
are kept is not guaranteed, but is deterministic as long as the other
arguments are identical.}

\item{intop}{integer or numeric in (0, 1) - if not null, filters the raw
clone sizes before computing the shared clonotypes so that only the
clonotypes that have their overall size in the top \code{intop} largest sizes
(if it is integer, else the \code{intop} proportion) are kept. To emphasize,
this argument \emph{\strong{does not necessarily}} return the \code{top} shared clones
and will likely return somewhat fewer, because it filters the raw clone
sizes, and it is very likely that not all of those clones end up being shared.}

\item{intop_per_cl}{integer or numeric in (0, 1) - if not null, filters
the raw \emph{clustered} clone sizes before computing shared clones, so that
within each seurat cluster, only the top \code{intop_per_cl} count /
proportion (for numeric in (0, 1) input) of clones are kept.}

\item{publicity}{numeric pair. A simple filter range of
\code{c(lowerbound, upperbound)} to retain only shared clones with their
"publicity" - number of clusters they are present in - within this
range.}
}
\value{
a named list where each name is a clonotype and each element is a
numeric vector indicating which seurat cluster(s) it is in, in no particular
order. If no shared clones are present, the output is an empty list.
}
\description{
\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#stable}{\figure{lifecycle-stable.svg}{options: alt='[Stable]'}}}{\strong{[Stable]}}

This function allows users to get a list of clonotypes that are shared
between clusters based on the levels of the active cell identities / some
custom identity based on the \code{alt_ident}. A list is returned with its
\emph{\strong{names}} being the shared clonotypes, and the values are numeric vectors
indicating the index of the clusters that clonotype is found in. The index
corresponds to the index in the default levels of the factored identities.

If \code{run_id} is inputted, then the function will attempt to get the shared
clonotypes from the corresponding APackOfTheClones run generated from
\code{\link[=RunAPOTC]{RunAPOTC()}}. Otherwise, it will use the filtering / subsetting parameters
to generate the shared clones.
}
\examples{
data("combined_pbmc")

getSharedClones(combined_pbmc)

getSharedClones(
    combined_pbmc,
    orig.ident = c("P17B", "P18B"), # a named subsetting parameter
    clonecall = "aa"
)

# extract shared clones from a past RunAPOTC run
combined_pbmc <- RunAPOTC(
    combined_pbmc, run_id = "foo", verbose = FALSE
)

getSharedClones(
    combined_pbmc, run_id = "foo", top = 5
)

# doing a run and then getting the clones works too
combined_pbmc <- RunAPOTC(combined_pbmc, run_id = "run1", verbose = FALSE)
getSharedClones(combined_pbmc, run_id = "run1")
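
# A minimal sketch of the `publicity` filter described above: keep only
# the clonotypes present in exactly two clusters. That `c(2, 2)` is
# accepted as a closed range is an assumption based on the argument
# description, not a documented guarantee.
shared_in_two <- getSharedClones(combined_pbmc, publicity = c(2, 2))

# each element lists the cluster indices a clonotype is found in, so
# `lengths` gives the publicity of every returned clonotype
lengths(shared_in_two)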

}
