% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/chi_square_goodness_of_fit.R
\name{chi_square_goodness_of_fit}
\alias{chi_square_goodness_of_fit}
\title{\emph{\strong{Chi-square goodness-of-fit statistic}} for each MCMC sample, with respect to a given dataset.}
\usage{
chi_square_goodness_of_fit(
  StanS4class,
  dig = 3,
  h = StanS4class@dataList$h,
  f = StanS4class@dataList$f
)
}
\arguments{
\item{StanS4class}{An S4 object of class \emph{\code{ \link{stanfitExtended}}}, which inherits from the S4 class \code{\link[rstan]{stanfit}}.
This \R object is a fitted model object
returned by the function \code{\link{fit_Bayesian_FROC}()}.

It can be passed to \code{\link{DrawCurves}()}, \code{\link{ppp}()}, and other functions.}

\item{dig}{A positive integer giving the number of significant digits.
To be passed to the function \code{rstan::}\code{\link[rstan]{sampling}}() in \pkg{rstan}.
Default = 3, as shown in the usage above.}

\item{h}{A vector of positive integers,
representing the number of hits.
This argument exists so that hits data drawn
from the posterior predictive distribution
can be substituted for the observed hits.
Gelman's well-known book explains how
to use test statistics in the Bayesian context;
in that setting, replicated data drawn from
the posterior predictive distribution are passed here.}

\item{f}{A vector of positive integers,
representing the number of false alarms.
This argument exists so that false alarm data drawn
from the posterior predictive distribution
can be substituted for the observed false alarms.
Gelman's well-known book explains how
to use test statistics in the
Bayesian context; in that setting,
replicated data drawn from
the posterior predictive distribution are passed here.}
}
\value{
A chi-square value for each MCMC sample:
\deqn{\chi^2 = \chi^2 (D|\theta_i),i=1,2,...,N}
The return value is therefore a vector of length \eqn{N}, where \eqn{N} denotes
 the number of MCMC iterations
 excluding the warm-up period.
 If the MCMC run has more than one chain,
  the samples of all chains are used to calculate the chi-square values.

  In the sequel, we write
  \eqn{\pi(\theta)} for the prior,
  \eqn{\pi(\theta|D)} for the posterior,
  \eqn{f(D|\theta)} for the likelihood,
  \eqn{\theta} for the parameter, and
  \eqn{D} for the dataset. For example, we can write

 \deqn{ \pi(\theta|D) \propto f(D|\theta) \pi(\theta).}



 Let us denote the \strong{posterior MCMC samples} of
 size  \eqn{N} for a given dataset \eqn{D} by

   \deqn{\theta_1, \theta_2, \theta_3,...,\theta_N,}

   which are drawn from the
   posterior \eqn{\pi(\theta|D)} given the data \eqn{D}.


Recall that the chi-square goodness-of-fit statistic \eqn{\chi^2}
depends on the model parameter \eqn{\theta} and the data \eqn{D},
namely,

\deqn{\chi^2 = \chi^2 (D|\theta).}

The function calculates a vector
of length \eqn{N} whose components are given by

\deqn{\chi^2 (D|\theta_1), \chi^2 (D|\theta_2), \chi^2 (D|\theta_3),...,\chi^2 (D|\theta_N).}

So, the return value is a vector of size  \eqn{N}.

 As an application of this return value  \eqn{(\chi^2(D|\theta_i);i=1,...,N)},
  we can approximate
   the posterior mean of \eqn{\chi^2 = \chi^2 (D|\theta)},
    namely,

\deqn{ \chi^2 (D) =\int \chi^2 (D|\theta)  \pi(\theta|D)     d\theta,}

by its Monte Carlo integral

      \deqn{\frac{1}{N} \sum _{i=1} ^N \chi^2(D|\theta_i).}


For this model, in almost all examples
 the calculation shows that

\deqn{  \int \chi^2 (D|\theta)  \pi(\theta|D)     d\theta > \chi^2 (D| \int \theta   \pi(\theta|D)     d\theta). }

Whether the above inequality holds for all \eqn{D} is an open question; the author conjectures that it does.
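
A remark on this inequality, offered only as a sketch under an assumption that is not verified here: if the map from \eqn{\theta} to \eqn{\chi^2 (D|\theta)} were convex on the support of the posterior, then Jensen's inequality would give the non-strict version

\deqn{  \int \chi^2 (D|\theta)  \pi(\theta|D)     d\theta \geq \chi^2 (D| \int \theta   \pi(\theta|D)     d\theta). }

Whether such convexity holds for the present model is not established in this documentation.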

Revised 2019 August 18
Revised 2019 Sept. 1
Revised 2019 Nov 28


 Our data consist of \strong{2C categories}, that is,

 the number of hits        : h[1], h[2], h[3],...,h[C] and

 the number of false alarms: f[1], f[2], f[3],...,f[C].

   Our model has \strong{C+2 parameters}, that is,

   the thresholds of the binormal assumption z[1],z[2],z[3],...,z[C] and

   the mean and standard deviation of the signal distribution.

 So, the degrees of freedom of this statistic are calculated by

     No. of categories  -  No. of parameters  - 1 = 2C-(C+2)-1 = C-3.

This differs from Chakraborty's result, C-2. The reason for this discrepancy is an open question.
}
\description{
Calculates a vector consisting of \emph{\strong{the goodness of fit (chi-square)}}
for a given dataset \eqn{D} and
 each posterior MCMC sample \eqn{\theta_i=\theta_i(D), i=1,2,3,...}, namely,

   \deqn{ \chi^2 (D|\theta_i)}

for \eqn{i=1,2,3,...}; thus its length is the number of MCMC iterations.

Note that in the MRMC case, it is defined as follows:

\deqn{\chi^2(D|\theta) := \sum_{r=1}^R \sum_{m=1}^M \sum_{c=1}^C \biggl( \frac{[ H_{c,m,r}-N_L\times p_{c,m,r}(\theta)]^2}{N_L\times p_{c,m,r}(\theta)}+\frac{[F_{c,m,r}-(\lambda_{c}(\theta) -\lambda_{c+1}(\theta) )\times N_{L}]^2}{(\lambda_{c}(\theta) -\lambda_{c+1}(\theta) )\times N_{L} }\biggr),}
where a dataset \eqn{D} consists of the
pairs of the number of false positives and the number of true
positives \eqn{ (F_{c,m,r}, H_{c,m,r}) }
together with the number of lesions \eqn{N_L}
and the number of images \eqn{N_I}, and
 \eqn{\theta} denotes the model parameter.
}
\details{
To calculate the chi-square goodness-of-fit test statistic \eqn{\chi^2 (y|\theta)},
two quantities
are required: one is an observed dataset \eqn{y}
 and the other is an estimated parameter \eqn{\theta}.
In the classical chi-square statistic,
the MLE (maximum likelihood estimator) is used
for the estimated parameter \eqn{\theta} in \eqn{\chi^2 (y|\theta)}.
However, in the Bayesian context,
the parameter is not deterministic;
it is treated as a random variable, represented by
samples from the posterior distribution.
Such samples are obtained by Hamiltonian
Monte Carlo simulation.
Thus we can calculate a chi-square value for each MCMC sample.
}
\examples{

\dontrun{
#========================================================================================
#                Synthesize the MCMC samples from a dataset.
#========================================================================================

       fit <- fit_Bayesian_FROC(BayesianFROC::dataList.Chakra.1,
                                ite     = 1111,
                                summary = FALSE,
                                cha     = 2)

#========================================================================================
#   The chi square discrepancies are calculated by the following code
#========================================================================================

         Chi.Square.for.each.MCMC.samples   <-   chi_square_goodness_of_fit(fit)





#========================================================================================
# With Warning
#========================================================================================

         chi_square_goodness_of_fit(fit)

#========================================================================================
# Without warning
#========================================================================================

          chi_square_goodness_of_fit(fit,
                                     h=fit@dataList$h,
                                     f=fit@dataList$f)






#========================================================================================
#  Get posterior mean of the chi square discrepancy.
#========================================================================================


                    m <- mean(Chi.Square.for.each.MCMC.samples)



#========================================================================================
# Calculate the p-value for the posterior mean of the chi square discrepancy.
#========================================================================================

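# A p-value for a chi-square statistic is an upper-tail probability,
# hence lower.tail = FALSE. The value df = 1 is kept from the original
# example and is illustrative only.
                                 stats::pchisq(m, df = 1, lower.tail = FALSE)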

#========================================================================================
# Difference between chi sq. at EAP and EAP of chi sq.
#========================================================================================


   mean( fit@chisquare - chi_square_goodness_of_fit(fit))
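
#========================================================================================
# A minimal base-R sketch of "one chi square per MCMC draw", added for illustration.
# Everything below is hypothetical: the counts are toy data and the draws are
# synthetic stand-ins, not output of fit_Bayesian_FROC(). It only shows the shape
# of the computation (a vector of discrepancies, then its Monte Carlo mean).
#========================================================================================

set.seed(1)
observed <- c(12, 7, 3)   # toy observed counts in 3 categories (hypothetical)
n_draws  <- 1000          # toy number of post-warm-up draws (hypothetical)

# Synthetic stand-in for posterior draws of the expected counts:
# one row per draw, one column per category.
expected_draws <- matrix(
  stats::rgamma(n_draws * 3, shape = rep(c(120, 70, 30), each = n_draws), rate = 10),
  nrow = n_draws
)

# One Pearson-type discrepancy per draw, so the result has length n_draws.
toy_chisq <- rowSums(sweep(expected_draws, 2, observed)^2 / expected_draws)

# Monte Carlo estimate of the posterior mean of the discrepancy.
mean(toy_chisq)

# Discrepancy evaluated at the mean of the draws, for comparison.
# In this toy each summand is convex in the expected count, so the mean above
# cannot be smaller than this value; nothing is claimed here about the FROC model.
expected_mean <- colMeans(expected_draws)
sum((observed - expected_mean)^2 / expected_mean)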


}# dontrun

}
