% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/cv.hfr.R
\name{cv.hfr}
\alias{cv.hfr}
\title{Cross validation for a hierarchical feature regression}
\usage{
cv.hfr(
  x,
  y,
  weights = NULL,
  kappa_grid = seq(0, 1, by = 0.1),
  q = NULL,
  intercept = TRUE,
  standardize = TRUE,
  nfolds = 10,
  foldid = NULL,
  partial_method = c("pairwise", "shrinkage"),
  ...
)
}
\arguments{
\item{x}{Input matrix or data.frame, of dimension \eqn{(N\times p)}{(N x p)}; each row is an observation vector.}

\item{y}{Response variable.}

\item{weights}{An optional vector of weights to be used in the fitting process. Should be \code{NULL} or a numeric vector. If non-\code{NULL}, weighted least squares is used for the level-specific regressions.}

\item{kappa_grid}{A vector of target effective degrees of freedom of the regression. Default is \code{kappa_grid=seq(0, 1, by = 0.1)}.}

\item{q}{Thinning parameter representing the quantile cut-off (in terms of contributed variance) above which to consider levels in the hierarchy. This can be used to reduce the number of levels in high-dimensional problems. Default is no thinning.}

\item{intercept}{Logical flag indicating whether an intercept should be fitted. Default is \code{intercept=TRUE}.}

\item{standardize}{Logical flag for \code{x} variable standardization prior to fitting the model. The coefficients are always returned on the original scale. Default is \code{standardize=TRUE}.}

\item{nfolds}{The number of folds for k-fold cross validation. Default is \code{nfolds=10}.}

\item{foldid}{An optional vector of values between \code{1} and \code{nfolds} identifying which fold each observation belongs to. If supplied, \code{nfolds} can be missing.}

\item{partial_method}{Indicates whether to use pairwise partial correlations or shrinkage partial correlations.}

\item{...}{Additional arguments passed to \code{hclust}.}
}
\value{
A 'cv.hfr' regression object.
}
\description{
HFR is a regularized regression estimator that decomposes a least squares
regression along a supervised hierarchical graph, and shrinks the edges of the
estimated graph to regularize parameters. The algorithm leads to group shrinkage in the
regression parameters and a reduction in the effective model degrees of freedom.
}
\details{
This function fits an HFR to a grid of \code{kappa} hyperparameter values. The result is a
matrix of coefficients with one column for each hyperparameter value. Evaluating all
hyperparameter values in a single call substantially speeds up the cross-validation procedure,
since the level-specific regressions are estimated only once.

When \code{nfolds > 1}, cross validation is performed on shuffled data. Alternatively,
test slices can be passed to the function using the \code{foldid} argument. The result
of the cross validation is given by \code{best_kappa} in the output object.
}
\examples{
x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)
fit = cv.hfr(x, y, kappa_grid = seq(0, 1, by = 0.1))
coef(fit)
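
# Illustrative sketch: pass test slices explicitly via the foldid argument.
# The seed and the choice of five folds are assumptions made for this example only.
set.seed(1)
foldid = sample(rep(1:5, length.out = nrow(x)))
fit_folds = cv.hfr(x, y, kappa_grid = seq(0, 1, by = 0.1), foldid = foldid)
# The result of the cross validation is stored in best_kappa
fit_folds$best_kappa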

}
\references{
Pfitzinger, J. (2022).
Cluster Regularization via a Hierarchical Feature Regression.
arXiv:2107.04831 [stat.ML]
}
\seealso{
\code{\link{hfr}}, \code{\link{coef}}, \code{\link{plot}} and \code{\link{predict}} methods
}
\author{
Johann Pfitzinger
}
