% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/regress.R
\name{regress}
\alias{regress}
\title{Linear Regression Model}
\usage{
regress(data, y, ..., robust = FALSE, plot = FALSE, rnd = 3)
}
\arguments{
\item{data}{Dataset}

\item{y}{Dependent variable}

\item{...}{Independent variable or multiple variables}

\item{robust}{logical: if \code{TRUE}, heteroskedasticity-robust standard
errors are calculated. Use this when heteroskedasticity is detected in the
data. Otherwise, OLS standard errors are estimated.}

\item{plot}{logical: if \code{TRUE}, produces plots for checking model assumptions}

\item{rnd}{number of decimal places used when rounding numbers. See \code{\link{round}}.}
}
\value{
A list containing three \code{data.frame} objects and the fitted \code{model}
}
\description{
\code{regress()} produces regression output that mirrors
the output from \code{STATA}.
}
\details{
\code{regress} is based on \code{\link{lm}}. All statistics presented
in the function's output are derived from \code{\link{lm}},
except the AIC value, which is obtained from \code{\link{AIC}}.

\strong{Outputs}

Outputs are divided into three parts.
\enumerate{
\item Information about the model: number of observations (Obs.), F value,
p-value from the F test, R Squared value, Adjusted R Squared value, square
root of the mean square error (Root MSE) and AIC value.

\item Errors: output from \code{anova(model)} is tabulated here. SS, DF and MS
denote the sum of squares, degrees of freedom and mean square.

\item Regression Output: coefficients from the summary of the model are
tabulated here along with their 95\% confidence intervals.
}
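
As a rough sketch, and not necessarily how \code{regress} assembles them
internally, the three parts described above can be obtained from an
\code{\link{lm}} fit with base R. The object names (\code{fit}, \code{info},
\code{errors}, \code{coefs}) are illustrative assumptions only.

\preformatted{
# airquality ships with base R; rows with missing Ozone are dropped by lm()
fit <- lm(Ozone ~ Wind, data = airquality)
s <- summary(fit)
f <- s$fstatistic                      # named vector: value, numdf, dendf

# 1. information about the model
info <- data.frame(
  Obs     = nobs(fit),
  F.value = f[["value"]],
  p.value = pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE),
  R2      = s$r.squared,
  Adj.R2  = s$adj.r.squared,
  RootMSE = s$sigma,                   # residual standard error
  AIC     = AIC(fit)
)

# 2. errors: sums of squares, degrees of freedom and mean squares
errors <- anova(fit)

# 3. regression output: coefficient table with 95 percent confidence limits
coefs <- cbind(coef(s), confint(fit, level = 0.95))

round(info, 3)
}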

\strong{Using Robust Standard Errors}

If heteroskedasticity is present in the data sample,
the ordinary least squares (OLS) estimator remains unbiased and consistent,
but is no longer efficient. The estimated OLS standard errors
are biased, and this bias is not resolved by a larger sample size.
To remedy this, robust standard errors can be used in place of the OLS standard errors.

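\deqn{\widehat{\mathrm{Var}}(\hat{\beta}) = \frac{N}{N - K} (X'X)^{-1} \left( \sum_{i=1}^{N} x_i x_i' e_i^2 \right) (X'X)^{-1}}{Var(b) = (N / (N - K)) (X'X)^(-1) (sum_i x_i x_i' e_i^2) (X'X)^(-1)}

where N is the number of observations, K is the number of regressors
(including the intercept), x_i is the i-th row of X written as a column
vector and e_i is the i-th residual. This returns a variance-covariance (VCV)
matrix whose diagonal elements are the estimated heteroskedasticity-robust
coefficient variances, which are the quantities of interest. The estimated
coefficient standard errors are the square roots of these diagonal elements.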
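
As a rough sketch of the formula above, and not necessarily how
\code{regress} computes it internally, the robust VCV matrix can be
reproduced from an \code{\link{lm}} fit with base R. The object names
(\code{fit}, \code{bread}, \code{meat}, \code{vcv}, \code{robust_se}) are
illustrative assumptions only.

\preformatted{
fit <- lm(Ozone ~ Wind, data = airquality)
X <- model.matrix(fit)         # N x K design matrix, intercept included
e <- residuals(fit)            # residuals e_i, one per row of X
N <- nrow(X)
K <- ncol(X)

bread <- solve(crossprod(X))   # inverse of X'X
meat <- crossprod(X * e)       # sum over i of x_i x_i' e_i^2

# bread is symmetric, so crossprod(bread, meat) equals bread times meat;
# tcrossprod() then multiplies by bread on the right
vcv <- (N / (N - K)) * tcrossprod(crossprod(bread, meat), bread)

robust_se <- sqrt(diag(vcv))   # robust coefficient standard errors
robust_se
}

Multiplying \code{X} by \code{e} scales row i of the design matrix by e_i,
so the cross product of the result is the weighted sum in the middle of the
formula.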

Note: Credits to Kevin Goulding, The Tarzan Blog.
}
\examples{

## use airquality dataset
data(airquality)
codebook(airquality)

summ(airquality)


## linear model for Ozone
regress(airquality, Ozone, Wind)

## run again with robust standard errors and with plots to check assumptions
regress(airquality, Ozone, Wind, robust = TRUE, plot = TRUE)

## linear model with multiple predictors
regress(airquality, Ozone, Wind, Solar.R, Temp, Month, Day,
        robust = TRUE, plot = TRUE)
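
## a hedged usage sketch: store the returned list and inspect its top level.
## The element names are not documented here, so str() is used rather than
## assuming any; `res` is an illustrative name only.
res <- regress(airquality, Ozone, Wind)
str(res, max.level = 1)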


}
\author{
For any feedback, please contact \code{Myo Minn Oo} via:

Email: \email{dr.myominnoo@gmail.com}

Website: \url{https://myominnoo.github.io/}
}
