% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/AM.R
\name{AM}
\alias{AM}
\title{Multiple-locus association mapping}
\usage{
AM(trait = NULL, fformula = NULL, availmemGb = 8, geno = NULL,
  pheno = NULL, map = NULL, ncpu = detectCores(), ngpu = 0,
  quiet = TRUE, maxit = 20)
}
\arguments{
\item{trait}{the name of the column in the phenotype data file that contains the trait data. The name is case sensitive and must match exactly the column name in the phenotype data file.}

\item{fformula}{the right hand side formula for the fixed effects. See below for details.
If not specified, only an overall mean will be fitted.}

\item{availmemGb}{a numeric value. It specifies the amount of available memory (in Gigabytes). 
This should be set to the maximum practical value of available memory for the analysis.}

\item{geno}{the R object obtained from running \code{\link{ReadMarker}}. This must be specified.}

\item{pheno}{the R object obtained from running \code{\link{ReadPheno}}. This must be specified.}

\item{map}{the R object obtained from running \code{\link{ReadMap}}. If not specified, a generic map will 
be assumed.}

\item{ncpu}{an integer value for the number of CPUs that are available for distributed computing. The default is to determine the number of CPUs automatically.}

\item{ngpu}{an integer value for the number of GPUs available for computation. The default
is to assume there are no GPUs available. This option has not yet been implemented.}

\item{quiet}{a logical value. If set to \code{FALSE}, additional runtime output is printed.
This is useful for error checking and monitoring the progress of a large analysis.}

\item{maxit}{an integer value for the maximum number of forward steps to be performed.  This will rarely need adjusting.}
}
\value{
A list with the following components:
\describe{
\item{trait}{column name of the trait being used by 'AM'.}
\item{fformula}{right hand side formula of the fixed effects part of the linear mixed model.}
\item{indxNA}{a vector containing the row indexes of those individuals whose trait or fixed effects data contain
missing values and who have been removed from the analysis.}
\item{Mrk}{a vector with the names of the snp in strongest and significant association with the trait. If no loci are found to be
significant, then this component is \code{NA}.}
\item{Chr}{the chromosomes on which the identified snp lie.}
\item{Pos}{the map positions for the identified snp.}
\item{Indx}{the column indexes in the marker file of the identified snp.} 
\item{ncpu}{number of CPUs used for the calculations.}
\item{availmemGb}{amount of RAM in gigabytes that has been set by the user.}
\item{quiet}{the logical value of the \code{quiet} argument.}
\item{extBIC}{numeric vector with the extended BIC values for the loci found to be in significant association with the trait.}
}
}
\description{
\code{AM} performs association mapping within a multiple-locus linear mixed model framework.
\code{AM} finds the set of marker loci in strongest association with a trait while
simultaneously accounting for any fixed effects and the genetic background.
}
\details{
\subsection{How to perform a basic AM analysis}{

Suppose:
\itemize{
\item{}{the snp data are contained in the file geno.txt, which is a plain space-separated
text file with no column headings. The file is located in the current working directory.
It contains the numeric genotype values 0, 1, and 2 for the snp genotypes
AA, AB, and BB, respectively. A missing genotype is coded as the character X.}
\item{}{the phenotype data are contained in the file pheno.txt, which is a plain
space-separated text file containing a single column with the trait data. The first row of the file
has the column heading 'y'.
The file is located in the current working directory.}
\item{}{there is no map data.}
}
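
If no such files are at hand, a small pair of files in this layout could be generated with base R
for experimentation. This is only a sketch: the number of individuals and snp, the random values, and the use of
a temporary directory rather than the working directory are assumptions made for illustration and
are not part of 'Eagle'.
\preformatted{
  set.seed(1)
  n_ind <- 5   # hypothetical number of individuals (rows)
  n_snp <- 4   # hypothetical number of snp (columns)

  # genotypes coded 0, 1, 2; one entry is set to the missing code X
  geno <- matrix(sample(0:2, n_ind * n_snp, replace=TRUE), nrow=n_ind)
  geno[2, 3] <- 'X'

  # space separated, no column headings
  geno_file <- file.path(tempdir(), 'geno.txt')
  write.table(geno, file=geno_file, sep=' ', quote=FALSE,
              row.names=FALSE, col.names=FALSE)

  # single trait column with the heading y in the first row
  pheno <- data.frame(y=rnorm(n_ind))
  pheno_file <- file.path(tempdir(), 'pheno.txt')
  write.table(pheno, file=pheno_file, sep=' ', quote=FALSE, row.names=FALSE)
}
The resulting paths can then be supplied as the \code{filename} arguments in the calls that follow.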

 To analyse these data, we would use the following three functions:
\preformatted{
  geno_obj <- ReadMarker(filename='geno.txt', AA=0, AB=1, BB=2, type='text', missing='X')
  
  pheno_obj <- ReadPheno(filename='pheno.txt')

  res <- AM(trait='y', geno=geno_obj, pheno=pheno_obj)
}
A table of results is printed to the screen and saved in the R object \code{res}. 
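
The components of the returned list can also be inspected directly. The sketch below uses a hand-built
stand-in for \code{res}, with made-up marker names, chromosomes, and positions, purely to show how the
\code{Mrk}, \code{Chr}, \code{Pos}, and \code{Indx} components described in the Value section might be
tabulated. The exact contents returned by \code{AM} may differ.
\preformatted{
  # hypothetical stand-in for the list returned by AM; all values are invented
  res <- list(trait='y',
              Mrk=c('snp12', 'snp87'),
              Chr=c(1, 3),
              Pos=c(1050, 2240),
              Indx=c(12, 87))

  if (all(is.na(res$Mrk))) {
    # Mrk is NA when no loci are found to be significant
    message('no significant marker-trait associations were found')
  } else {
    found <- data.frame(Mrk=res$Mrk, Chr=res$Chr, Pos=res$Pos, Indx=res$Indx)
    print(found)
  }
}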
}

\subsection{How to perform a more complicated AM analysis}{

Suppose:
\itemize{
\item{}{the snp data are contained in the file geno.ped, which is a 'PLINK' ped file. See
\code{\link{ReadMarker}} for details. The file is located in /my/dir. Let's assume
the file is large, say 50 gigabytes, and our computer only has 32 gigabytes of RAM.}
\item{}{the phenotype data are contained in the file pheno.txt, which is a plain
space-separated text file with six columns. The first row of the file contains the column headings.
The first column is a trait and is labeled y1.
The second column is another trait and is labeled y2. The third and fourth columns 
are nuisance variables and are labeled cov1 and cov2. The fifth and sixth columns
are the first two principal components to account for population substructure and are 
labeled pc1 and pc2. The file contains missing data that are coded as 99. 
The file is located in /my/dir.}
\item{}{the map data are contained in the file map.txt, which is also located in
/my/dir and whose first row contains the column headings.}
\item{}{An 'AM' analysis is performed where the trait of interest is y2, 
the fixed effects part of the model is cov1 + cov2 + pc1 + pc2, 
and the available memory is set to 32 gigabytes.}
} 

 To analyse these data, we would run the following:
\preformatted{
  geno_obj <- ReadMarker(filename='/my/dir/geno.ped', type='PLINK', availmemGb=32)
  
  pheno_obj <- ReadPheno(filename='/my/dir/pheno.txt', missing=99)

  map_obj   <- ReadMap(filename='/my/dir/map.txt')

  res <- AM(trait='y2', fformula=c('cov1 + cov2 + pc1 + pc2'), 
            geno=geno_obj, pheno=pheno_obj, map=map_obj, availmemGb=32)
}
A table of results is printed to the screen and saved in the R object \code{res}. 
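
Because \code{trait} and the terms in \code{fformula} must match column names in the phenotype file,
it can help to check the names before starting a long analysis. The following is a minimal sketch in
base R and is not how \code{AM} itself processes the formula. The vector \code{pheno_cols} is typed
in by hand here, as an assumption, to mirror the six column headings described above.
\preformatted{
  fformula <- 'cov1 + cov2 + pc1 + pc2'

  # turn the right hand side string into a one-sided formula
  ff <- as.formula(paste('~', fformula))

  # variable names that the formula refers to
  needed <- all.vars(ff)

  # column headings of the phenotype file (entered by hand for this sketch)
  pheno_cols <- c('y1', 'y2', 'cov1', 'cov2', 'pc1', 'pc2')

  # any formula terms that are not columns in the phenotype file
  missing_cols <- setdiff(needed, pheno_cols)
  if (length(missing_cols) > 0) {
    stop('not found in phenotype file: ', paste(missing_cols, collapse=', '))
  }
}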
}

\subsection{Dealing with missing marker data}{

\code{AM} can tolerate some missing marker data. Ideally, however,
a specialized genotype imputation program such as 'BEAGLE', 'MACH', 'fastPHASE', or 'PHASE2' should be
used to impute the missing marker data before the data are read into 'Eagle'.
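
To judge whether imputation is worthwhile, the proportion of missing genotypes per snp could be
tabulated beforehand. The sketch below works on a small, invented character matrix in which rows are
individuals, columns are snp, and X is the missing code. It is an illustration in base R only and
does not use any 'Eagle' function.
\preformatted{
  # invented genotype matrix: 3 individuals (rows) by 4 snp (columns)
  geno <- matrix(c('0', '1', 'X', '2',
                   '1', 'X', 'X', '0',
                   '2', '2', '1', '0'), nrow=3, byrow=TRUE)

  # proportion of individuals with a missing genotype at each snp
  prop_missing <- colMeans(geno == 'X')

  # column indexes of snp with more than half of their genotypes missing
  high_missing <- which(prop_missing > 0.5)
}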

}

\subsection{Dealing with missing trait data}{

\code{AM} deals automatically with individuals with missing trait data.
These individuals are removed from the analysis and a warning message is generated.
}

\subsection{Dealing with missing explanatory variable values}{

\code{AM} deals automatically with individuals with missing explanatory variable values. 
These individuals are removed from the analysis and a warning message is generated.
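
The row indexes of the removed individuals are reported in the \code{indxNA} component of the
returned list. As a rough illustration of the idea, and not of the actual code inside \code{AM},
rows with a missing trait or explanatory variable value can be located in base R with
\code{complete.cases}. The data frame below is invented for the purpose.
\preformatted{
  # invented phenotype data with missing values in the trait and in a covariate
  pheno <- data.frame(y=c(1.2, NA, 3.4, 2.2),
                      cov1=c(0, 1, NA, 1))

  # TRUE for rows with no missing values in the trait or the fixed effects
  keep <- complete.cases(pheno[, c('y', 'cov1')])

  # row indexes of the individuals that would be dropped
  indxNA <- which(!keep)

  # data that would remain in the analysis
  pheno_used <- pheno[keep, ]
}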
}

\subsection{Error Checking}{

Most errors occur when reading in the data. However, as an extra precaution, if \code{quiet=FALSE}, then additional
output is printed while \code{AM} is running. If \code{AM} is failing, this output can be useful for diagnosing
the problem.
}
}
\examples{
  \dontrun{ 
  # Since the following code takes longer than 5 seconds to run, it has been tagged as dontrun. 
  # However, the code can be run by the user. 
  #

  #-------------------------
  #  Example  
  #------------------------

  # read the map 
  #~~~~~~~~~~~~~~
  
  # File is a plain space separated text file with the first row 
  # the column headings
  complete.name <- system.file('extdata', 'map.txt', 
                                   package='Eagle')
  map_obj <- ReadMap(filename=complete.name) 

  # read marker data
  #~~~~~~~~~~~~~~~~~~~~
  # Reading in a PLINK ped file 
  # and setting the available memory on the machine for the reading of the data to 8  gigabytes
  complete.name <- system.file('extdata', 'geno.ped', 
                                     package='Eagle')
  geno_obj <- ReadMarker(filename=complete.name,  type='PLINK', availmemGb=8) 
 
  # read phenotype data
  #~~~~~~~~~~~~~~~~~~~~~~~

  # Read in a plain text file with data on a single trait and two covariates
  # The first row of the text file contains the column names y, cov1, and cov2. 
  complete.name <- system.file('extdata', 'pheno.txt', package='Eagle')
  
  pheno_obj <- ReadPheno(filename=complete.name)
           

 # Performing multiple-locus genome-wide association mapping with a model
 #    in which cov1 and cov2 are fitted as fixed effects.
 #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
  res <- AM(trait = 'y',
            fformula = c('cov1+cov2'),
            map = map_obj,
            pheno = pheno_obj,
            geno = geno_obj, availmemGb = 8)
}

}
\seealso{
\code{\link{ReadMarker}}, \code{\link{ReadPheno}}, and \code{\link{ReadMap}}
}
