% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/SNPstats.R
\name{SnpStats}
\alias{SnpStats}
\title{SNP Summary Statistics}
\usage{
SnpStats(
  GenoM,
  Pedigree = NULL,
  Duplicates = NULL,
  Plot = TRUE,
  quiet = TRUE,
  calc_HWE = TRUE,
  ErrFlavour
)
}
\arguments{
\item{GenoM}{genotype matrix, in sequoia's format: 1 column per SNP, 1 row
per individual, genotypes coded as 0/1/2/-9, and row names giving individual
IDs.}

\item{Pedigree}{dataframe with 3 columns: ID - parent1 - parent2.
Additional columns and non-genotyped individuals are ignored. Used to count
Mendelian errors per SNP and (poorly) estimate the error rate.}

\item{Duplicates}{dataframe with pairs of duplicated samples.}

\item{Plot}{logical, show histograms of the results?}

\item{quiet}{logical, suppress messages?}

\item{calc_HWE}{logical, calculate chi-square test for Hardy-Weinberg
equilibrium? Can be relatively time consuming for large datasets.}

\item{ErrFlavour}{DEPRECATED AND IGNORED. Previously used to estimate
\code{Err.hat}.}
}
\value{
A matrix with a number of rows equal to the number of SNPs
 (= number of columns of GenoM), and when no Pedigree is provided the
 following columns:
\item{AF}{Allele frequency of the 'second allele' (the one for which the
  homozygote is coded 2)}
\item{Mis}{Proportion of missing calls}
\item{HWE.p}{p-value from chi-square test for Hardy-Weinberg equilibrium}

When a Pedigree is provided, there are 8 additional columns:
\item{n.dam, n.sire, n.pair}{Number of dams, sires, parent-pairs successfully
  genotyped for the SNP}
\item{OHdam, OHsire}{Number of opposing homozygous cases (parent and
  offspring homozygous for different alleles)}
\item{MEpair}{Count of Mendelian errors, includes opposing homozygous cases
  when only one parent is genotyped}
\item{n.dups, n.diff}{Number of duplicate pairs successfully genotyped for
  the SNP; number of differences. The latter does not count cases where one
  duplicate is not successfully genotyped at the SNP}
}
\description{
Estimate allele frequency (AF), missingness and Mendelian
errors per SNP.
}
\details{
Calculation of these summary statistics can be done in PLINK, and
  SNPs with low minor allele frequency or high missingness should be filtered
  out prior to pedigree reconstruction. This function is provided as an aid
  to inspect the relationship between AF, missingness and genotyping error to
  find a suitable combination of SNP filtering thresholds to use.

  For pedigree reconstruction, SNPs with zero or one copies of the alternate
  allele in the dataset (MAF \eqn{\le 1/(2N)}) are considered fixed and are
  excluded.
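
  As a rough illustration of the kind of chi-square test for Hardy-Weinberg
  equilibrium summarised in \code{HWE.p}, the sketch below tests a single SNP
  coded 0/1/2/-9. The helper \code{hwe_chisq} is hypothetical and not part of
  sequoia, and it is not necessarily how this function computes the p-value.

\preformatted{
# Hypothetical helper: chi-square HWE test for one SNP (vector of 0/1/2/-9)
hwe_chisq <- function(g) {
  g <- g[!is.na(g) & g >= 0]                  # drop missing calls (coded -9)
  n <- length(g)
  obs <- c(sum(g == 0), sum(g == 1), sum(g == 2))
  q <- mean(g) / 2                            # frequency of the allele coded 2
  expd <- n * c((1 - q)^2, 2 * q * (1 - q), q^2)
  stat <- sum((obs - expd)^2 / expd)          # NaN for monomorphic SNPs
  pchisq(stat, df = 1, lower.tail = FALSE)    # 1 degree of freedom
}
}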
}
\examples{
Genotypes <- SimGeno(Ped_HSg5, nSnp=100, CallRate = runif(100, 0.5, 0.8),
                     SnpError = 0.05)
SnpStats(Genotypes)   # only plots; data is returned invisibly
SNPstats <- SnpStats(Genotypes, Pedigree=Ped_HSg5)
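
# A possible follow-up, sketched under assumptions: the thresholds below are
# purely illustrative, not recommendations. Inspect how missingness relates
# to minor allele frequency, then keep only SNPs passing both filters.
MAF <- pmin(SNPstats[, "AF"], 1 - SNPstats[, "AF"])
plot(MAF, SNPstats[, "Mis"], xlab = "Minor allele frequency",
     ylab = "Proportion missing")
KeepSNP <- MAF >= 0.1 & SNPstats[, "Mis"] <= 0.4
GenoFiltered <- Genotypes[, KeepSNP, drop = FALSE]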

}
\seealso{
\code{\link{GenoConvert}} to convert from various data formats;
  \code{\link{CheckGeno}} to check that the data is in a valid format for
  sequoia and to exclude monomorphic SNPs etc.; \code{\link{CalcOHLLR}} to
  calculate OH and ME per individual.
}
