\name{CleanSemAceDataset}
\alias{CleanSemAceDataset}
\title{
Produces a cleaned dataset that works well when using SEM to estimate a univariate ACE model.
}
\description{
This function takes a \code{data.frame} of subject pairs and a 'GroupSummary' \code{data.frame} (which is created by the \code{RGroupSummary} function), and returns a cleaned \code{data.frame} that is used by the \code{Ace} function.
}
\usage{
CleanSemAceDataset(dsDirty, dsGroupSummary, oName_1, oName_2, rName = "R")
}

\arguments{
  \item{dsDirty}{This is the \code{data.frame} to be cleaned.}
  \item{dsGroupSummary}{The \code{data.frame} containing information about which groups should be included in the analyses.  It should be created by the \code{RGroupSummary} function.}
  \item{oName_1}{The name of the manifest variable (in \code{dsDirty}) for the first subject in each pair.}
  \item{oName_2}{The name of the manifest variable (in \code{dsDirty}) for the second subject in each pair.}
  \item{rName}{The name of the variable (in \code{dsDirty}) indicating the pair's relatedness coefficient.}
}
\details{
  The function takes \code{dsDirty} and produces a new \code{data.frame} with the following features:
  \enumerate{
    \item Only three existing columns are retained: those named by \code{oName_1}, \code{oName_2}, and \code{rName}.  They are renamed \code{O1}, \code{O2}, and \code{R}, respectively.
    \item A new column called \code{GroupID} is created to reflect each pair's group membership (which is based on the \code{R} value).  These values are sequential integers, starting at 1.  The group with the weakest \code{R} is 1.  The group with the strongest \code{R} has the largest \code{GroupID} (this is typically the MZ twins).
    \item Any row is excluded if it has a missing data point for \code{O1}, \code{O2}, or \code{R}.
    \item The \code{data.frame} is sorted by the \code{R} value.  This ordering sometimes simplifies code written against a multiple-group SEM API.
  }
}
\value{
A \code{data.frame} with one row per subject pair.  The \code{data.frame} contains the following variables (whose names cannot be changed by the user through optional parameters):
\item{ R }{The pair's \code{R} value.}
%\item{ Included }{Boolean value indicating if the group should be included in an SEM}
%\item{ PairCount }{Boolean value indicating if the group should be included in an SEM}
\item{ O1 }{The outcome variable for the first subject in each pair.}
\item{ O2 }{The outcome variable for the second subject in each pair.}
\item{ GroupID }{ Indicates the pair's group membership.}
}
\author{
Will Beasley
}
\examples{
library(NlsyLinks) #Load the package into the current R session.
dsLinks <- Links79PairExpanded #Start with the built-in data.frame in NlsyLinks
dsLinks <- dsLinks[dsLinks$RelationshipPath=='Gen2Siblings', ] #Use only NLSY79-C siblings

oName_1 <- "MathStandardized_1" #Stands for Outcome1
oName_2 <- "MathStandardized_2" #Stands for Outcome2
dsGroupSummary <- RGroupSummary(dsLinks, oName_1, oName_2)

dsClean <- CleanSemAceDataset( dsDirty=dsLinks, dsGroupSummary, oName_1, oName_2, rName="R" )
summary(dsClean)

dsClean$AbsDifference <- abs(dsClean$O1 - dsClean$O2)
plot(jitter(dsClean$R), dsClean$AbsDifference, col="gray70")
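
# A rough base-R sketch of the cleaning steps listed in Details, applied to a toy data.frame.
# This is NOT the package's implementation.  The names dsToy, dsSketch, Math_1, and Math_2
# and all data values are hypothetical, and the group filtering driven by dsGroupSummary
# is deliberately left out.
dsToy <- data.frame(
  SubjectTag = 1:6,                       #An extra column that should be dropped.
  Math_1     = c(10, 12, NA,  9, 11, 14), #Outcome for the first subject in each pair.
  Math_2     = c(11, 13, 10, NA, 12, 15), #Outcome for the second subject in each pair.
  R          = c(.5, .25, .5, 1, 1, .25)  #Relatedness coefficient of each pair.
)

dsSketch <- dsToy[, c("R", "Math_1", "Math_2")]     #Retain only the three columns.
colnames(dsSketch) <- c("R", "O1", "O2")            #Assign the standard names.
dsSketch <- dsSketch[complete.cases(dsSketch), ]    #Exclude rows with a missing O1, O2, or R.
dsSketch$GroupID <- as.integer(factor(dsSketch$R))  #Weakest R becomes 1; strongest R gets the largest ID.
dsSketch <- dsSketch[order(dsSketch$R), ]           #Sort by the R value.
dsSketch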
}
\keyword{ ACE }
