% Generated by roxygen2 (4.0.1): do not edit by hand
\name{modStats}
\alias{modStats}
\title{Calculate common model evaluation statistics}
\usage{
modStats(mydata, mod = "mod", obs = "obs", statistic = c("n", "FAC2",
  "MB", "MGE", "NMB", "NMGE", "RMSE", "r", "COE", "IOA"), type = "default",
  rank.name = NULL, ...)
}
\arguments{
\item{mydata}{A data frame.}

\item{mod}{Name of a variable in \code{mydata} that represents modelled
values.}

\item{obs}{Name of a variable in \code{mydata} that represents measured
values.}

\item{statistic}{The statistic to be calculated. See details below
for a description of each.}

\item{type}{\code{type} determines how the data are split,
i.e. conditioned, before the statistics are calculated. The default
produces statistics using the entire data set. \code{type} can be one
of the built-in types as detailed in \code{cutData} e.g.
\dQuote{season}, \dQuote{year}, \dQuote{weekday} and so on. For
example, \code{type = "season"} will produce four sets of statistics,
one for each season.

It is also possible to choose \code{type} as another variable in
the data frame. If that variable is numeric, then the data will be
split into four quantiles (if possible) and labelled
accordingly. If \code{type} is an existing character or factor variable,
then those categories/levels will be used directly. This offers
great flexibility for understanding the variation of different
variables and how they depend on one another.

More than one type can be considered e.g. \code{type = c("season",
"weekday")} will produce statistics split by season and day of the
week.}

\item{rank.name}{Simple model ranking can be carried out if
\code{rank.name} is supplied. \code{rank.name} will generally
refer to a column representing a model name, which is to be
ranked. The ranking is based on the COE performance, as that
indicator is arguably the best single model performance indicator
available.}

\item{...}{Other arguments to be passed to \code{cutData} e.g.
\code{hemisphere = "southern"}.}
}
\value{
Returns a data frame with model evaluation statistics.
}
\description{
Function to calculate common numerical model evaluation statistics with
flexible conditioning.
}
\details{
This function is under development and currently provides some common model
evaluation statistics. These include (to be mathematically defined later):

\itemize{

\item \eqn{n}, the number of complete pairs of data.

\item \eqn{FAC2}, fraction of predictions within a factor of two.

\item \eqn{MB}, the mean bias.

\item \eqn{MGE}, the mean gross error.

\item \eqn{NMB}, the normalised mean bias.

\item \eqn{NMGE}, the normalised mean gross error.

\item \eqn{RMSE}, the root mean squared error.

\item \eqn{r}, the Pearson correlation coefficient. Note that an
argument \code{method} can also be supplied, e.g.
\code{method = "spearman"}.

\item \eqn{COE}, the \emph{Coefficient of Efficiency} based on
Legates and McCabe (1999, 2012). There have been many suggestions
for measuring model performance over the years, but the COE is a
simple formulation which is easy to interpret.

A perfect model has a COE of 1. As noted by Legates and McCabe,
although the COE has no lower bound, a value of COE = 0.0 has a
fundamental meaning. It implies that the model is no more able to
predict the observed values than is the observed
mean. Therefore, since the model can explain no more of the
variation in the observed values than can the observed mean, such
a model can have no predictive advantage.

For negative values of COE, the model is less effective than the
observed mean in predicting the variation in the
observations.

\item \eqn{IOA}, the Index of Agreement based on Willmott et
al. (2011), which spans between -1 and +1 with values approaching
+1 representing better model performance.

An IOA of 0.5, for example, indicates that the sum of the
error-magnitudes is one half of the sum of the perfect
model-deviation and observed-deviation magnitudes. When IOA = 0.0,
it signifies that the sum of the magnitudes of the errors and the
sum of the perfect model-deviation and observed-deviation
magnitudes are equivalent. When IOA = -0.5, it indicates that the
sum of the error-magnitudes is twice the sum of the perfect
model-deviation and observed-deviation magnitudes. Values of IOA
near -1.0 can mean that the model-estimated deviations about the
observed mean are poor estimates of the observed deviations, but
they can also mean that there is simply little observed
variability, so some caution is needed when the IOA approaches -1.
}
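
As a sketch of how the statistics listed above are commonly
written, the definitions below follow the formulations in the cited
papers as understood here. They are not transcribed from the package
source, so the openair manual remains the authoritative reference.
The symbols are assumptions introduced for illustration: \eqn{M_i}
and \eqn{O_i} are the \eqn{i}th modelled and observed values,
\eqn{\bar{O}} is the mean of the observations, and \eqn{n} is the
number of complete pairs.

\deqn{MB = \frac{1}{n} \sum_{i=1}^{n} (M_i - O_i)}

\deqn{MGE = \frac{1}{n} \sum_{i=1}^{n} |M_i - O_i|}

\deqn{NMB = \frac{\sum_{i=1}^{n} (M_i - O_i)}{\sum_{i=1}^{n} O_i}}

\deqn{NMGE = \frac{\sum_{i=1}^{n} |M_i - O_i|}{\sum_{i=1}^{n} O_i}}

\deqn{RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (M_i - O_i)^2}}

\deqn{COE = 1 - \frac{\sum_{i=1}^{n} |M_i - O_i|}{\sum_{i=1}^{n} |O_i - \bar{O}|}}

FAC2 is the fraction of pairs satisfying
\eqn{0.5 \le M_i / O_i \le 2}. For the IOA, with \eqn{c = 2} as in
Willmott et al. (2011), the value depends on which of the two sums
is larger. When
\eqn{\sum |M_i - O_i| \le c \sum |O_i - \bar{O}|}:

\deqn{IOA = 1 - \frac{\sum_{i=1}^{n} |M_i - O_i|}{c \sum_{i=1}^{n} |O_i - \bar{O}|}}

and otherwise:

\deqn{IOA = \frac{c \sum_{i=1}^{n} |O_i - \bar{O}|}{\sum_{i=1}^{n} |M_i - O_i|} - 1}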

All statistics are based on complete pairs of \code{mod} and \code{obs}.

Conditioning is possible through setting \code{type}, which can be
a vector e.g. \code{type = c("weekday", "season")}.

Details of the formulas are given in the openair manual.
}
\examples{
## the example below is somewhat artificial --- assuming the observed
## values are given by NOx and the predicted values by NO2.

modStats(mydata, mod = "no2", obs = "nox")

## evaluation stats by season

modStats(mydata, mod = "no2", obs = "nox", type = "season")
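
## a hand-computed sketch of COE and IOA in base R, to make the
## descriptions in the details section concrete. The toy data and the
## helper names coe_by_hand() and ioa_by_hand() are hypothetical and
## not part of openair; the formulas follow Legates and McCabe (1999)
## and Willmott et al. (2011) as understood here, not the package source.

toy <- data.frame(obs = c(10, 20, 30, 40), mod = c(12, 18, 33, 35))

coe_by_hand <- function(mod, obs) {
  ## one minus the ratio of summed absolute errors to the summed
  ## absolute deviations of the observations about their mean
  1 - sum(abs(mod - obs)) / sum(abs(obs - mean(obs)))
}

ioa_by_hand <- function(mod, obs, scale = 2) {
  err <- sum(abs(mod - obs))
  dev <- scale * sum(abs(obs - mean(obs)))
  ## the two branches keep the index between -1 and +1
  if (err <= dev) 1 - err / dev else dev / err - 1
}

coe_by_hand(toy$mod, toy$obs)  ## 0.7
ioa_by_hand(toy$mod, toy$obs)  ## 0.85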
}
\author{
David Carslaw
}
\references{
Legates DR, McCabe GJ. (1999). Evaluating the use of goodness-of-fit
measures in hydrologic and hydroclimatic model validation. Water
Resources Research 35(1): 233-241.

Legates DR, McCabe GJ. (2012). A refined index of model
performance: a rejoinder. International Journal of Climatology.

Willmott CJ, Robeson SM, Matsuura K. (2011). A refined index of
model performance. International Journal of Climatology.
}
\keyword{methods}

