% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/03_STEP_FWD.R
\name{stepFWD}
\alias{stepFWD}
\title{Customized stepwise (OLS & fractional logistic) regression with p-value and trend check}
\usage{
stepFWD(
  start.model,
  p.value = 0.05,
  db,
  reg.type = "ols",
  check.start.model = TRUE,
  offset.vals = NULL
)
}
\arguments{
\item{start.model}{Formula object that defines the starting model. It can include some risk factors, or it can be
an intercept-only model (\code{y ~ 1}, where \code{y} is the target variable).}

\item{p.value}{Significance level for the p-values of the estimated coefficients. For \code{WoE} coding, this value is
directly compared to the p-value of the estimated coefficients, while for \code{dummy} coding
a multiple Wald test is employed and its p-value is compared to the selected threshold (\code{p.value}).}

\item{db}{Modeling data with risk factors and target variable. Risk factors can be categorized or continuous.}

\item{reg.type}{Regression type. Available options are: \code{"ols"} for OLS regression and \code{"frac.logit"} for
fractional logistic regression. Default is \code{"ols"}. For the \code{"frac.logit"} option, all values of the target
have to be between 0 and 1.}

\item{check.start.model}{Logical (\code{TRUE} or \code{FALSE}) indicating whether risk factors from the starting model
should be checked for p-value and trend in the stepwise process. Default is \code{TRUE}.}

\item{offset.vals}{This can be used to specify an a priori known component to be included in the linear predictor during fitting.
This should be \code{NULL} or a numeric vector of length equal to the number of cases. Default is \code{NULL}.}
}
\value{
The command \code{stepFWD} returns a list of four objects.\cr
The first object (\code{model}) is the final model, an object of class inheriting from \code{"glm"}.\cr
The second object (\code{steps}) is a data frame with the risk factors selected at each iteration.\cr
The third object (\code{warnings}) is a data frame with warnings, if any are observed.
The warnings refer to the following checks: whether a risk factor has more than 10 modalities and
whether any of the bins (groups) has less than 5\% of observations.\cr
The fourth object (\code{dev.db}) is the model development database.
}
\description{
\code{stepFWD} runs a customized stepwise regression with p-value and trend checks. The trend check compares the
observed trend between the target and the analyzed risk factor with the trend of the estimated coefficients within the
linear regression. Note that the procedure checks the column names of the supplied \code{db} data frame, so some
renaming (replacement of special characters) may occur. For details, see the examples.
}
\examples{
library(monobin)
library(LGDtoolkit)
data(lgd.ds.c)
#stepwise with discretized risk factors
#same procedure can be run on continuous risk factors and mixed risk factor types
num.rf <- sapply(lgd.ds.c, is.numeric)
num.rf <- names(num.rf)[!names(num.rf) \%in\% "lgd" & num.rf]
num.rf
#select subset of numerical risk factors
num.rf <- num.rf[1:10]
for (i in seq_along(num.rf)) {
  num.rf.l <- num.rf[i]
  lgd.ds.c[, num.rf.l] <- sts.bin(x = lgd.ds.c[, num.rf.l], y = lgd.ds.c[, "lgd"])[[2]]
}
str(lgd.ds.c)
res <- LGDtoolkit::stepFWD(start.model = lgd ~ 1,
                           p.value = 0.05,
                           db = lgd.ds.c[, c(num.rf, "lgd")],
                           reg.type = "ols")
names(res)
summary(res$model)$coefficients
res$steps
summary(res$model)$r.squared
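#inspect the remaining elements of the returned list described in the Value section
#warnings observed during the stepwise process (if any)
res$warnings
#first rows of the model development database (assuming it is returned as a data frame)
head(res$dev.db)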
}
