% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/documentation.R, R/two_samples.R
\name{two_sample}
\alias{two_sample}
\alias{dts_test}
\alias{dts_stat}
\title{DTS Test}
\usage{
dts_test(a, b, nboots = 2000, p = default.p, keep.boots = T, keep.samples = F)

two_sample(
  a,
  b,
  nboots = 2000,
  p = default.p,
  keep.boots = T,
  keep.samples = F
)

dts_stat(a, b, power = def_power)
}
\arguments{
\item{a}{a vector of numbers (or factors -- see details)}

\item{b}{a vector of numbers (or factors -- see details)}

\item{nboots}{Number of bootstrap iterations}

\item{p}{Power to raise the test statistic to (the exponent p in the formula under details)}

\item{keep.boots}{Should the bootstrap values be saved in the output?}

\item{keep.samples}{Should the samples be saved in the output?}

\item{power}{Same as \code{p}: the power to raise the test statistic to}
}
\value{
Output is a length-2 vector containing the test statistic and the p-value, in that order. The vector has 3 attributes: the size of each of the two samples, and the number of bootstraps performed for the p-value.
}
\description{
A two-sample test based on the DTS test statistic (\code{dts_stat}). This is the recommended two-sample test in this package because of its power. The DTS statistic is the reweighted integral of the distance between the two ECDFs.
}
\details{
The DTS test compares two ECDFs by looking at the reweighted Wasserstein distance between the two. See the companion paper at \href{https://arxiv.org/abs/2007.01360}{arXiv:2007.01360} or \url{https://codowd.com/public/DTS.pdf} for details of this test statistic and for non-standard uses of the package (parallel computation for big N, weighted observations, one-sample tests, etc.).

If \code{\link[=wass_test]{wass_test()}} extends \code{\link[=cvm_test]{cvm_test()}} to interval data, then \code{\link[=dts_test]{dts_test()}} extends \code{\link[=ad_test]{ad_test()}} to interval data. Formally, if E is the ECDF of sample 1, F is the ECDF of sample 2, and G is the ECDF of the combined sample, then \deqn{DTS = \int_{x\in R} \left({|E(x)-F(x)| \over \sqrt{2G(x)(1-G(x))/n}}\right)^p dx}{DTS = Integral (|E(x)-F(x)|/sqrt(2G(x)(1-G(x))/n))^p dx} where the integral is taken over all real x.
The test p-value is calculated by randomly resampling two samples, of the same sizes as the originals, from the combined sample. Intuitively, the DTS test improves on the AD test by allowing more extreme observations to carry more weight. At a higher level, CVM, AD, KS, and similar tests only require ordinal data. DTS (and Wasserstein) gain power because they take advantage of the properties of interval data, i.e. the distances between values have some meaning. However, DTS, like Anderson-Darling (AD), also downweights noisier observations relative to Wasserstein, thus (hopefully) giving it extra power.

In the example plot below, the DTS statistic is the shaded area between the ECDFs, weighted by the variances -- shown by the color of the shading.

\figure{dts.png}{Example DTS stat plot}

Inputs \code{a} and \code{b} can also be vectors of ordered (or unordered) factors, so long as both have the same levels and orderings. When possible, ordering factors will substantially increase power. The DTS test will assume the distance between adjacent factor levels is 1.
}
\section{Functions}{
\itemize{
\item \code{dts_test()}: Permutation-based two-sample test

\item \code{two_sample()}: Recommended two-sample test

\item \code{dts_stat()}: Computes the DTS test statistic

}}
\examples{
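set.seed(314159)
# Two normal samples of size 20 whose means differ by 0.5
vec1 = rnorm(20)
vec2 = rnorm(20,0.5)
# Test statistic only
dts_stat(vec1,vec2)
# Full test: test statistic and resampled p-value
out = dts_test(vec1,vec2)
out
summary(out)
plot(out)
# two_sample() takes the same arguments as dts_test()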
two_sample(vec1,vec2)
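
# A rough base-R reading of the DTS formula in Details, for intuition only.
# Assumptions (not confirmed by the package source): n is the combined sample
# size, the integrand is held constant between adjacent pooled points, and
# points where G(x)(1-G(x)) is 0 contribute nothing. The default p = 2 is an
# arbitrary choice. dts_by_hand is a hypothetical helper, not a package
# function, and dts_stat() may not match it exactly.
dts_by_hand = function(a, b, p = 2) {
  pooled = sort(c(a, b))
  n = length(pooled)
  Ex = ecdf(a)(pooled)        # ECDF of sample 1 at each pooled point
  Fx = ecdf(b)(pooled)        # ECDF of sample 2 at each pooled point
  Gx = ecdf(pooled)(pooled)   # ECDF of the combined sample
  width = c(diff(pooled), 0)  # gap to the next pooled point
  sdev = sqrt(2 * Gx * (1 - Gx) / n)
  integrand = ifelse(sdev > 0, (abs(Ex - Fx) / sdev)^p, 0)
  sum(integrand * width)
}
dts_by_hand(c(1, 2, 3), c(2, 3, 4))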

# Example using ordered factors
vec1 = factor(LETTERS[1:5],levels = LETTERS,ordered = TRUE)
vec2 = factor(LETTERS[c(1,2,2,2,4)],levels = LETTERS, ordered=TRUE)
dts_test(vec1,vec2)
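
# A minimal sketch of the resampling idea from Details, for intuition only.
# Assumptions (not confirmed here): dts_stat() returns a single number, group
# labels are permuted without replacement, and the p-value is the share of
# resampled statistics at least as large as the observed one. dts_test() may
# differ on any of these points.
# n_resamples and perm_stats are illustrative names, not package objects.
set.seed(1)
x = rnorm(20)
y = rnorm(20, 0.5)
observed = as.numeric(dts_stat(x, y))
pooled = c(x, y)
n_resamples = 200  # kept small so the example runs quickly
perm_stats = replicate(n_resamples, {
  idx = sample(length(pooled), length(x))  # positions assigned to sample 1
  as.numeric(dts_stat(pooled[idx], pooled[-idx]))
})
mean(perm_stats >= observed)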
}
\seealso{
\code{\link[=wass_test]{wass_test()}}, \code{\link[=ad_test]{ad_test()}} for the predecessors of this test statistic. See \href{https://arxiv.org/abs/2007.01360}{arXiv:2007.01360} or \url{https://codowd.com/public/DTS.pdf} for details of this test statistic.
}
