% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/classiKernel.R
\name{classiKernel}
\alias{classiKernel}
\title{Create a kernel estimator for functional data classification}
\usage{
classiKernel(classes, fdata, grid = 1:ncol(fdata), h = 1, metric = "L2",
  ker = "Ker.norm", nderiv = 0L, derived = FALSE,
  deriv.method = "base.diff", custom.metric = function(x, y, ...) {    
  return(sqrt(sum((x - y)^2))) }, custom.ker = function(u) {    
  return(dnorm(u)) }, ...)
}
\arguments{
\item{classes}{[\code{factor(nrow(fdata))}]\cr
factor of length \code{nrow(fdata)} containing the classes of the observations.}

\item{fdata}{[\code{matrix}]\cr
matrix containing the functional observations as rows.}

\item{grid}{[\code{numeric(ncol(fdata))}]\cr
numeric vector of length \code{ncol(fdata)} containing the grid on which the functional observations were
evaluated.}

\item{h}{[\code{numeric(1)}]\cr
controls the bandwidth of the kernel function. All kernel functions \code{ker} should be
implemented to have bandwidth = 1. The bandwidth is controlled via \code{h}
by using \code{K(x) = ker(x/h)} as the kernel function.}

\item{metric}{[\code{character(1)}]\cr
character string specifying the (semi-)metric to be used.
For an overview of what is available, see the
\code{method} argument in \code{\link{computeDistMat}}. For a full list
execute \code{\link{metricChoices}()}.}

\item{ker}{[\code{character(1)}]\cr
character string describing the kernel function to use. Available are,
among others, all kernel functions from \code{\link[fda.usc]{Kernel}}.
For the full list execute \code{\link{kerChoices}()}.
To use a customized kernel function, set
\code{ker = "custom.ker"} and specify the function in
\code{custom.ker}.}

\item{nderiv}{[\code{integer(1)}]\cr
order of derivation on which the metric shall be computed.
The default is 0L.}

\item{derived}{[\code{logical(1)}]\cr
Is the data given in \code{fdata} already derived? Default is set to \code{FALSE},
which will lead to numerical derivation if \code{nderiv >= 1L} by applying
\code{\link[fda]{deriv.fd}} on a \code{\link[fda]{Data2fd}} representation of
\code{fdata}.}

\item{deriv.method}{[\code{character(1)}]\cr
character string indicating which method should be used for derivation. Currently
implemented are \code{"base.diff"}, the default, and \code{"fda.deriv.fd"}.
\code{"base.diff"} uses the method \code{base::\link[base]{diff}} for equidistant measures
without missing values, which is faster than transforming the data into the
class \code{\link[fda]{fd}} and deriving this using \code{fda::\link[fda]{deriv.fd}}.
The second variant implies smoothing, which can be preferable for calculating
high order derivatives.}

\item{custom.metric}{[\code{function(x, y, ...)}]\cr
only used if \code{metric = "custom.metric"}.
A function of functional observations
\code{x} and \code{y} returning their distance.
The default is the L2 distance.
See how to implement your distance function in \code{\link[proxy]{dist}}.}

\item{custom.ker}{[\code{function(u)}]\cr
customized kernel function. This has to be a function with exactly one parameter
\code{u}, returning the numeric value of the kernel function
\code{ker(u)}. This function is only used if \code{ker == "custom.ker"}.
The function should be implemented with bandwidth 1; the actual bandwidth is controlled via \code{h}.}

\item{...}{further arguments to and from other methods. Hand over additional arguments to
\code{\link{computeDistMat}}, usually additional arguments for the specified
(semi-)metric. Also, if \code{deriv.method == "fda.deriv.fd"} or
\code{fdata} is not observed on a regular grid, additional arguments to
\code{\link{fdataTransform}} can be specified which will be passed on to
\code{\link[fda]{Data2fd}}.}
}
\value{
\code{classiKernel} returns an object of class \code{'classiKernel'}.
This is a list containing at least the
following components:
 \describe{
  \item{\code{classes}}{a factor of length \code{nrow(fdata)} coding the response of
  the training data set.}
  \item{\code{fdata}}{the raw functional data as a matrix with the individual
  observations as rows.}
  \item{\code{proc.fdata}}{the preprocessed data (missing values interpolated,
  derived and evenly spaced). This data is \code{this.fdataTransform(fdata)}.
  See \code{this.fdataTransform} for more details.}
  \item{\code{grid}}{numeric vector containing the grid on which \code{fdata}
  is observed.}
  \item{\code{h}}{numeric value giving the bandwidth to be used in the kernel function.}
  \item{\code{ker}}{character encoding the kernel function to use.}
  \item{\code{metric}}{character string coding the distance metric to be used
  in \code{\link{computeDistMat}}.}
  \item{\code{nderiv}}{integer giving the order of derivation that is applied
  to fdata before computing the distances between the observations.}
  \item{\code{this.fdataTransform}}{preprocessing function taking new data as
  a matrix. It is used to transform \code{fdata} into \code{proc.fdata} and
  is required to preprocess new data in order to predict it. This function
  ensures that preprocessing (derivation, respacing and interpolation of
  missing values) is done in the exact same way for the original
  training data set and future (test) data sets.}
  \item{\code{call}}{the original function call.}
 }
}
\description{
Creates an efficient kernel estimator for functional data
classification. Currently
supported distance measures are all \code{metrics} implemented in \code{\link[proxy]{dist}}
and all semimetrics suggested in Fuchs et al. (2015).
Additionally, all (semi-)metrics can be used on a derivative of arbitrary
order of the functional observations.
For kernel functions all kernels implemented in \code{\link[fda.usc:fda.usc-package]{fda.usc}}
are available as well as custom kernel functions.
}
\examples{
# How to implement your own kernel function
data("ArrowHead")
classes = ArrowHead[,"target"]

set.seed(123)
train_inds = sample(1:nrow(ArrowHead), size = 0.8 * nrow(ArrowHead), replace = FALSE)
test_inds = (1:nrow(ArrowHead))[!(1:nrow(ArrowHead)) \%in\% train_inds]

ArrowHead = ArrowHead[,!colnames(ArrowHead) == "target"]

# custom kernel
myTriangularKernel = function(u) {
  return((1 - abs(u)) * (abs(u) < 1))
}
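
# A minimal sketch of how the bandwidth enters the kernel, assuming the
# scaling K(x) = ker(x / h) described for the argument h. scaledKernel is a
# hypothetical helper used for illustration only; it is not part of the
# package and the internal implementation may differ.
scaledKernel = function(x, h = 1, ker = function(u) dnorm(u)) {
  return(ker(x / h))
}
# a larger h gives distant observations relatively more weight
scaledKernel(c(0, 1, 2), h = 1)
scaledKernel(c(0, 1, 2), h = 2)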

# create the model
mod1 = classiKernel(classes = classes[train_inds], fdata = ArrowHead[train_inds,],
                    ker = "custom.ker", h = 2, custom.ker = myTriangularKernel)

# calculate the model predictions
pred1 = predict(mod1, newdata = ArrowHead[test_inds,], predict.type = "response")

# prediction accuracy
mean(pred1 == classes[test_inds])

# create another model using an existing kernel function
mod2 = classiKernel(classes = classes[train_inds], fdata = ArrowHead[train_inds,],
                    ker = "Ker.tri", h = 2)

# calculate the model predictions
pred2 = predict(mod2, newdata = ArrowHead[test_inds,], predict.type = "response")

# prediction accuracy
mean(pred2 == classes[test_inds])
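
# A rough sketch of first-order differencing with base::diff on an
# equidistant grid, in the spirit of deriv.method = "base.diff". toyData and
# toyDeriv are hypothetical names; plain differences are shown without any
# rescaling by the grid spacing, and the package internals may differ.
toyData = matrix(c(1, 4, 9, 16,
                   2, 4, 6, 8), nrow = 2, byrow = TRUE)
# apply() returns one column per observation, so transpose back to rows
toyDeriv = t(apply(toyData, 1, diff))
toyDeriv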
\dontrun{
# Parallelize across 2 CPUs
library(parallelMap)
parallelStartSocket(2L) # parallelStartMulticore for Linux
predict(mod1, newdata = ArrowHead[test_inds,], predict.type = "prob", parallel = TRUE, batches = 2L)
parallelStop()
}
}
\references{
Fuchs, K., J. Gertheiss, and G. Tutz (2015):
Nearest neighbor ensembles for functional data with interpretable feature selection.
Chemometrics and Intelligent Laboratory Systems 146, 186 - 197.
}
\seealso{
\code{\link{predict.classiKernel}}
}
