% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/hyreg2_het.R
\name{hyreg2_het}
\alias{hyreg2_het}
\title{Model estimation for EQ-5D value set data, accounting for heteroscedasticity in continuous data}
\usage{
hyreg2_het(
  formula,
  formula_sigma = NULL,
  data,
  type,
  type_cont,
  type_dich,
  k = 1,
  control = NULL,
  stv = NULL,
  stv_sigma = NULL,
  offset = NULL,
  opt_method = "BFGS",
  optimizer = "optim",
  lower = -Inf,
  upper = Inf,
  latent = "both",
  id_col = NULL,
  classes_only = FALSE,
  variables_both = NULL,
  variables_dich = NULL,
  variables_cont = NULL,
  ...
)
}
\arguments{
\item{formula}{linear model \code{formula}. Using \verb{|x} will include a grouping variable \code{x}. See Details.}

\item{formula_sigma}{linear \code{formula} for sigma estimation. If \code{formula_sigma} is not provided,
\code{formula} (excluding any grouping variables) is used by default. See Details.}

\item{data}{a \code{data.frame} containing the data. See Details.}

\item{type}{either the name of the column in \code{data} containing an indicator of whether an
observation is continuous or dichotomous (as \code{character}), or a \code{vector} containing the indicator.
See Details.}

\item{type_cont}{value of \code{type} referring to continuous data. See Details.}

\item{type_dich}{value of \code{type} referring to dichotomous data. See Details.}

\item{k}{\code{numeric}. Number of latent classes to be estimated via \code{\link[flexmix:flexmix]{flexmix::flexmix()}}.}

\item{control}{control list for \code{\link[flexmix:flexmix]{flexmix::flexmix()}}.}

\item{stv}{\verb{named vector} or \code{list} of named vectors containing start values for all coefficients
in \code{formula} and for theta. In \code{hyreg2_het}, \code{stv} must not include sigma. See Details.}

\item{stv_sigma}{\verb{named vector} with start values for sigma estimation. Names must correspond to the variables
as given in \code{formula_sigma}. See Details.}

\item{offset}{offset as in \code{\link[flexmix:flexmix]{flexmix::flexmix()}}}

\item{opt_method}{\code{character}, optimization method to be used in the optimizer, default \code{"BFGS"}.}

\item{optimizer}{\code{character}, optimizer to be used in \code{\link[bbmle:mle2]{bbmle::mle2()}}, default \code{"optim"}.}

\item{lower}{\code{numeric}, lower bound for censored data, default \code{-Inf}. If this is used, \code{opt_method} must be set to \code{"L-BFGS-B"}.}

\item{upper}{\code{numeric}, upper bound for censored data, default \code{Inf}. If this is used, \code{opt_method} must be set to \code{"L-BFGS-B"}.}

\item{latent}{\code{character}, data type to use in component identification, must be one of \code{"both"}, \code{"cont"} or \code{"dich"},
default \code{"both"}. See Details.}

\item{id_col}{\code{character}, name of the grouping variable, only needed if \code{latent != "both"}. See Details.}

\item{classes_only}{\code{logical}, default \code{FALSE}. Indicates whether the function should perform only
classification, rather than both classification and model estimation. Only possible for \code{latent != "both"}.
See Details.}

\item{variables_both}{\code{character} vector; variables to be fitted on both continuous and dichotomous data.
If not specified, all variables from \code{formula} are used. If provided and not all variables from \code{formula}
are included, \code{variables_cont} and \code{variables_dich} must be provided as well, although one of them can be \code{NULL}.
See Details.}

\item{variables_dich}{\code{character} vector; variables to be fitted only on dichotomous data. If provided,
\code{variables_both} and \code{variables_cont} must be provided as well.}

\item{variables_cont}{\code{character} vector; variables to be fitted only on continuous data. If provided,
\code{variables_both} and \code{variables_dich} must be provided as well.}

\item{...}{additional arguments for \code{\link[flexmix:flexmix]{flexmix::flexmix()}} or \code{\link[bbmle:mle2]{bbmle::mle2()}}}
}
\value{
Model object of type \code{flexmix}. Coefficients whose names end in \code{_h} are the coefficients for heteroscedasticity (the sigma model).
}
\description{
Estimation of hybrid model for EQ-5D data
}
\details{
See the sections below for details on the individual inputs.
}
\section{formula}{

A typical R formula of the form \verb{y ~ x1 + x2 + ...} should be provided.
Additionally, it is possible to include a grouping variable for repeated measures by using
\verb{| xg}, where \code{xg} is the column containing the group memberships. The resulting formula will look
like this: \verb{y ~ x1 + x2 + ... | xg}. In \code{flexmix}, this specifies a grouping variable:
the model is fit conditional on grouping, so that all observations with the same group are treated
as belonging together when computing likelihood contributions. One possible grouping variable is
an id number that identifies answers given by the same participant. We highly recommend using a grouping variable,
since otherwise the algorithm for k = 2 tends to classify all continuous data into one estimated class
and all dichotomous data into the other.
}

\section{data}{

A data frame with the following columns: all independent variables (x)
and the dependent variable y used in \code{formula}; one column for the grouping variable xg if grouping
is used, e.g. id numbers of participants with repeated measurements; and one column indicating
whether an observation belongs to the continuous or the dichotomous data, with the entries \code{type_cont}
and \code{type_dich} (e.g., for a column called \code{"type"} with the entries "TTO" for continuous data points
and "DCE" for dichotomous data points, \code{type_cont} will be "TTO" and \code{type_dich} will be "DCE").
Each row should correspond to one observation (one data point).
}

\section{start values (stv)}{

If the same start values \code{stv} are to be used for all latent classes,
the given start values must be a \verb{named vector}. Otherwise (if different start values are to be used for
each latent class), a \code{list} of named vectors should be used. In this case, there must be one entry
in the list for each latent class. Each start value vector must include a start value for
theta and, in \code{hyreg2}, also for sigma (in \code{hyreg2_het}, sigma must not be included in \code{stv}).
Currently, it is necessary to use the names \code{"sigma"} and \code{"theta"} for these values.
If users are unsure for which variables start values must be provided, this can be checked by
calling \code{colnames(model.matrix(formula,data))}. In this call, the \code{formula} should not include the
grouping variable.
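
A minimal sketch of this check, using a small hypothetical data frame \code{toy} that is not part of the package.
The start values themselves are arbitrary placeholders; only the names matter here:

\preformatted{
# hypothetical toy data, for illustration only
toy <- data.frame(y = c(0.1, 0.5, 0.9), x1 = c(0, 1, 0), x2 = c(1, 0, 1))

# formula without the grouping variable
f <- y ~ -1 + x1 + x2

# the column names of the model matrix are the coefficient names
coef_names <- colnames(model.matrix(f, toy))

# named start value vector: one placeholder value per coefficient, plus theta
stv <- setNames(c(rep(0.1, length(coef_names)), 1), c(coef_names, "theta"))
names(stv)
}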
}

\section{formula_sigma, stv_sigma}{

To account for heteroscedasticity in the data, an additional formula \code{formula_sigma} and an additional
vector of start values for this formula (\code{stv_sigma}) can be specified.
The provided \code{formula_sigma} must be linear and the vector \code{stv_sigma} must contain start values for
all parameters used in the formula. If neither \code{formula_sigma} nor \code{stv_sigma} is provided, the same
inputs as for \code{formula} (without controlling for groups) and \code{stv} (without sigma) are used.
The estimates for \code{sigma} can be identified in the model output by the ending \code{"_h"}. It is important to note
that, when using \code{hyreg2_het}, neither \code{stv} nor \code{stv_sigma} is allowed to include \code{sigma},
because \code{sigma} is estimated with its own formula (in contrast to \code{hyreg2}, where \code{sigma} must always be
specified in \code{stv}).
}

\section{latent, id_col, classes_only}{

In some situations, it can be useful to identify the latent classes on
only one \code{type} of data while estimating the model parameters on both \code{types} of data. In such cases,
the input variable \code{latent} can be used to specify on which type of data the classification should be done.
If \code{"cont"} or \code{"dich"} is used, the input parameter \code{id_col} must be specified. It gives the name,
i.e. a \verb{character string}, of the grouping variable for classification. Groups that have only continuous
or only dichotomous observations may be removed from the data. In a first step,
a model is then estimated only on the continuous/dichotomous data and the resulting classification is stored.
In a second step, model parameters are estimated separately for each identified class on both \code{types} of data
using this classification. The output object in this case is a \code{list}: each of the first k elements
contains the estimated model for one of the latent classes, and the element at position k+1 is a data frame
containing the corresponding classifications from the first step.
When the input variable \code{classes_only} is set to \code{TRUE}, the second step is skipped
and the estimated classes from the first step are returned.
}

\section{variables_both, variables_cont, variables_dich}{

It is possible to specify partial coefficients,
which are used only on continuous or only on dichotomous data.
\itemize{
\item Example: Suppose different models should be specified for continuous and dichotomous data:
\item Model for continuous data: \code{y ~  x1 + x3}
\item Model for dichotomous data: \code{y ~  x1 + x2}
\item The \code{formula} input must then include all parameters that occur in either model:
\code{y ~ x1 + x2 + x3}
\item The assignment of parameters to data types is then achieved via the input arguments \code{variables_both},
\code{variables_cont}, and \code{variables_dich}:
\item \code{variables_both} = \code{"x1"},
\item \code{variables_cont} = \code{"x3"} and
\item \code{variables_dich} = \code{"x2"}.
\item Every variable included in the provided \code{formula} (except the grouping variable) must appear in exactly
one of these vectors. One of the \code{variables_} vectors can also be \code{NULL} if no variables should be used only on that type of data.
}
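
The rule that every variable must appear in exactly one vector can be checked before fitting. The following is
a rough sketch in base R only; it is not part of the package, and it assumes a \code{formula} without a grouping variable:

\preformatted{
formula <- y ~ x1 + x2 + x3
variables_both <- "x1"
variables_cont <- "x3"
variables_dich <- "x2"

# predictor names on the right-hand side of the formula
predictors <- all.vars(formula[[3]])

# c() drops NULL entries, so one of the vectors may be NULL
assigned <- c(variables_both, variables_cont, variables_dich)

# no variable is assigned twice, and every predictor is assigned
stopifnot(!anyDuplicated(assigned), setequal(predictors, assigned))
}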
}

\examples{

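# model formula without intercept; sigma formula with intercept
formula <- y ~  -1 + x1 + x2 + x3
formula_sigma <- y ~  x1 + x2 + x3

k <- 1
# start values for the coefficients (named after columns 3 to 5 of the data) and theta
stv <- setNames(c(0.2,0,1,1),c(colnames(simulated_data_norm)[3:5],c("theta")))
# start values for the sigma formula, including its intercept
stv_sigma <- setNames(c(0.2,0,1,1),c(colnames(simulated_data_norm)[3:5],c("(Intercept)")))
control <- list(iter.max = 1000, verbose = 4)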
rm(counter)
mod <- hyreg2_het(formula = formula,
                  formula_sigma = formula_sigma,
                  data = simulated_data_norm,
                  type = simulated_data_norm$type, # or "type"
                  stv = stv,
                  stv_sigma = stv_sigma,
                  k = k,
                  type_cont = "TTO",
                  type_dich = "DCE_A",
                  opt_method = "L-BFGS-B",
                  control = control,
                  latent = "both",
                  id_col = "id"
)
summary_hyreg2(mod)
}
\author{
Svenja Elkenkamp, Kim Rand and John Grosser
}
