% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/salso.R
\name{salso}
\alias{salso}
\title{Sequentially-Allocated Latent Structure Optimization}
\usage{
salso(
  psm,
  loss = c("VI.lb", "binder")[1],
  maxSize = 0,
  maxScans = 5,
  nPermutations = 5000,
  probExploration = 0.005,
  seconds = 10,
  parallel = TRUE
)
}
\arguments{
\item{psm}{A pairwise similarity matrix, i.e., an \code{n}-by-\code{n}
symmetric matrix whose \code{(i,j)} element gives the (estimated) probability
that items \code{i} and \code{j} are in the same subset (i.e., cluster) of a
partition (i.e., clustering).}

\item{loss}{Either \code{"VI.lb"} or \code{"binder"}, to indicate the desired
loss function.}

\item{maxSize}{Either zero or a positive integer. If a positive integer, the
optimization is constrained to produce solutions whose number of subsets
(i.e., clusters) is no more than the supplied value. If zero, the size is
not constrained.}

\item{maxScans}{The maximum number of reallocation scans after the initial
allocation. The actual number of scans may be less than \code{maxScans}
since the method stops if the result does not change between scans.}

\item{nPermutations}{The desired number of permutations to consider when
searching for the minimizer.}

\item{probExploration}{The approximate expected probability of picking the
second-best micro-optimization (instead of the best). For a given
permutation, the probability is sampled from a beta distribution with shape
parameters \code{1} and \code{1/probExploration}, whose mean is
\code{probExploration / (1 + probExploration)}.}

\item{seconds}{A time threshold in seconds after which the function will
return early (with a warning) instead of finishing all the desired
permutations. The function may run considerably longer than this threshold,
however, because the threshold is checked only after each permutation is
completed.}

\item{parallel}{Should the search use all CPU cores?}
}
\value{
A list with the following elements: \describe{
  \item{estimate}{An integer vector giving a partition encoded using cluster
  labels.}
  \item{loss}{A character vector equal to the \code{loss} argument.}
  \item{expectedLoss}{A numeric vector of length one giving the expected
  loss.}
  \item{nScans}{An integer vector giving the number of scans used to arrive
  at the supplied estimate.}
  \item{nPermutations}{An integer vector giving the number of permutations
  actually performed.}
}
}
\description{
This function provides a point estimate for a partition distribution using
the sequentially-allocated latent structure optimization (SALSO) method. The
method seeks to minimize the expectation of the Binder loss or the lower
bound of the expectation of the variation of information loss. The SALSO
method was presented at the workshop "Bayesian Nonparametric Inference:
Dependence Structures and their Applications" in Oaxaca, Mexico on December
6, 2017. See
\url{https://www.birs.ca/events/2017/5-day-workshops/17w5060/schedule}.
}
\examples{
probs <- psm(iris.clusterings, parallel=FALSE)
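# A minimal sketch, using base R only, of how a pairwise similarity matrix
# can be formed from sampled clusterings. The names toyDraws and psmByHand
# are hypothetical and used for illustration only; psm() above is the
# routine provided by the package.
toyDraws <- rbind(c(1, 1, 2), c(1, 2, 2), c(1, 1, 1))  # 3 samples of 3 items
psmByHand <- matrix(0, ncol(toyDraws), ncol(toyDraws))
for (k in seq_len(nrow(toyDraws))) {
  # Add 1 to element (i,j) whenever items i and j share a label in sample k.
  psmByHand <- psmByHand + outer(toyDraws[k, ], toyDraws[k, ], "==")
}
psmByHand <- psmByHand / nrow(toyDraws)  # relative frequency of co-clustering
psmByHand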
salso(probs, nPermutations=50, parallel=FALSE)
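
# A rough illustration, using base R only, of the probExploration argument.
# It assumes the per-permutation probability is drawn as documented, i.e.,
# from a beta distribution with shape parameters 1 and 1/probExploration.
# The variable names below are hypothetical.
exploration <- 0.005
set.seed(1)
explorationDraws <- rbeta(100000, 1, 1 / exploration)
exactMean <- 1 / (1 + 1 / exploration)  # mean of a Beta(1, 1/exploration)
c(monteCarlo = mean(explorationDraws), exact = exactMean)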

}
\references{
D. A. Binder (1978), Bayesian cluster analysis, \emph{Biometrika} \bold{65},
31-38.

D. B. Dahl (2006), Model-Based Clustering for Expression Data via a Dirichlet
Process Mixture Model, in \emph{Bayesian Inference for Gene Expression and
Proteomics}, Kim-Anh Do, Peter Müller, Marina Vannucci (Eds.), Cambridge
University Press.

J. W. Lau and P. J. Green (2007), Bayesian model based clustering procedures,
\emph{Journal of Computational and Graphical Statistics} \bold{16}, 526-558.

D. B. Dahl and M. A. Newton (2007), Multiple Hypothesis Testing by Clustering
Treatment Effects, \emph{Journal of the American Statistical Association},
\bold{102}, 517-526.

A. Fritsch and K. Ickstadt (2009), An improved criterion for clustering
based on the posterior similarity matrix, \emph{Bayesian Analysis},
\bold{4}, 367-391.

S. Wade and Z. Ghahramani (2018), Bayesian cluster analysis: Point
estimation and credible balls, \emph{Bayesian Analysis}, \bold{13}(2),
559-626.
}
\seealso{
\code{\link{psm}}, \code{\link{confidence}}, \code{\link{dlso}}
}
