% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/markerQC.R
\name{check_maf}
\alias{check_maf}
\title{Identification of SNPs with low minor allele frequency}
\usage{
check_maf(
  indir,
  name,
  qcdir = indir,
  macTh = 20,
  mafTh = NULL,
  verbose = FALSE,
  interactive = FALSE,
  path2plink = NULL,
  showPlinkOutput = TRUE
)
}
\arguments{
\item{indir}{[character] /path/to/directory containing the basic PLINK data
files name.bim, name.bed and name.fam.}

\item{name}{[character] Prefix of PLINK files, i.e. name.bed, name.bim,
name.fam.}

\item{qcdir}{[character] /path/to/directory where results will be written to.
If \code{\link{perIndividualQC}} was conducted, this directory should be the
same as qcdir specified in \code{\link{perIndividualQC}}, i.e. it contains
name.fail.IDs with IIDs of individuals that failed QC. The user needs write
permission for qcdir. By default, qcdir=indir.}

\item{macTh}{[double] Threshold for minor allele count cut-off. If both mafTh
and macTh are specified, macTh is used (macTh = mafTh\*2\*NrSamples).}

\item{mafTh}{[double] Threshold for minor allele frequency cut-off.}

\item{verbose}{[logical] If TRUE, progress info is printed to standard out
and the plink log is displayed.}

\item{interactive}{[logical] Should plots be shown interactively? When
choosing this option, make sure you have X-forwarding/graphical interface
available for interactive plotting. Alternatively, set interactive=FALSE and
save the returned plot object (p_maf) via ggplot2::ggsave(plot=p_maf,
other_arguments) or pdf(outfile); print(p_maf); dev.off().}

\item{path2plink}{[character] Absolute path to PLINK executable
(\url{https://www.cog-genomics.org/plink/1.9/}), i.e.
plink should be accessible as path2plink -h. The full name of the executable
should be specified: for Windows, this means path/plink.exe, for Unix
platforms this is path/plink. If not provided, it is assumed that the PATH
set-up works and PLINK will be found by \code{\link[sys]{exec}}('plink').}

\item{showPlinkOutput}{[logical] If TRUE, plink log and error messages are
printed to standard out.}
}
\value{
Named list with i) fail_maf containing a [data.frame] with CHR
(Chromosome code), SNP (Variant identifier), A1 (Allele 1; usually minor), A2
(Allele 2; usually major), MAF (Allele 1 frequency), NCHROBS (Number of
allele observations) for all SNPs that failed the mafTh/macTh threshold and
ii) p_maf, a ggplot2 object containing the MAF distribution histogram, which
can be shown with print(p_maf).
}
\description{
Runs and evaluates results from plink --freq. It calculates the minor allele
frequencies for all variants in the individuals that passed the
\code{\link{perIndividualQC}}. The minor allele frequency distribution is
plotted as a histogram.
}
\details{
\code{check_maf} uses plink --remove name.fail.IDs --freq to calculate the
minor allele frequencies for all variants in the individuals that passed the
\code{\link{perIndividualQC}}. It does so without generating a new dataset;
the failing IDs are simply excluded when the statistics are calculated.

For details on the output data.frame fail_maf, check the original
description on the PLINK output format page:
\url{https://www.cog-genomics.org/plink/1.9/formats#frq}.
}
\examples{
indir <- system.file("extdata", package="plinkQC")
qcdir <- tempdir()
name <- "data"
path2plink <- '/path/to/plink'
# the following code is not run on package build, as the path2plink on the
# user system is not known.
\dontrun{
fail_maf <- check_maf(indir=indir, qcdir=qcdir, name=name, macTh=15,
interactive=FALSE, verbose=TRUE, path2plink=path2plink)
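# A possible follow-up, assuming the call above succeeded: inspect the SNPs
# that failed the threshold and save the MAF histogram to a file. The element
# names fail_maf and p_maf follow the Value section; the file name
# maf_histogram.pdf is a hypothetical choice.
head(fail_maf$fail_maf)
ggplot2::ggsave(file.path(qcdir, "maf_histogram.pdf"), plot=fail_maf$p_maf)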
}
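# A minimal sketch of how the two thresholds relate, following
# macTh = mafTh*2*NrSamples from the macTh argument description. The values
# nr_samples and maf_th are hypothetical and only illustrate the arithmetic;
# no PLINK call is involved.
nr_samples <- 200
maf_th <- 0.01
# minor allele count threshold corresponding to maf_th
mac_th <- maf_th * 2 * nr_samples
# minor allele frequency corresponding to the default macTh of 20
maf_from_default_mac <- 20 / (2 * nr_samples)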
}
