\name{gx.2dproj}
\alias{gx.2dproj}
\title{ Function to Compute and Display 2-d Projections for Data Matrices }
\description{
Function computes and displays 2-d projections of data matrices using Sammon Non-linear Mapping (default), classic (metric) Multidimensional Scaling, or Kruskal's non-metric Multidimensional Scaling (see Venables and Ripley (2001) and Cox and Cox (2001)).  The original S-Plus implementation also computed the Minimum Spanning Tree plane projection (Friedman and Rafsky, 1981), as it was available in the Venables and Ripley MASS library for S-Plus.  However, the R implementation of the MASS library does not include Minimum Spanning Trees.  In the R implementation, Projection Pursuit has been added using the fastICA procedure of Hyvarinen and Oja (2000).  Provision is made to optionally trim individuals (rows) from the input data matrix.
}
\usage{
gx.2dproj(xx, proc = "sam", ifilr = FALSE, log = FALSE, rsnd = FALSE, snd = FALSE,
	range = FALSE, main = "", setseed = FALSE, row.omits = NULL, ...)
}
\arguments{
  \item{xx}{ the \code{n} by \code{p} matrix for which the 2-d projection is required. }
  \item{proc}{ the 2-d projection procedure required, the default is \code{proc = "sam"} for Sammon Non-Linear Mapping.  For Classic (metric) Multidimensional Scaling use \code{proc = "mds"}, for Kruskal's non-metric Multidimensional Scaling use \code{"iso"}, and for Projection Pursuit use \code{"ica"}. }
  \item{ifilr}{ optional isometric log-ratio (ilr) transformation, the default is no transformation.  Recommended for closed compositional, geochemical, data.  When \code{ifilr = TRUE} all other transformations are ignored. }
  \item{log}{ optional (natural) log transformation of the data, the default is no log transformation.  For a log transformation set \code{log = TRUE}. } 
  \item{rsnd}{ optional robust normalization of the data with matrix column medians and MADs, the default is no transformation.  For a robust normalization set \code{rsnd = TRUE}. }
  \item{snd}{ optional normalization of the data with matrix column means and standard deviations, the default is no transformation.  For a normalization set \code{snd = TRUE}.  If \code{rsnd = TRUE}, then \code{snd} will be set to \code{FALSE}. }
  \item{range}{ optional range transformation for the matrix columns, the default is no transformation.  For a range transformation set \code{range = TRUE}; each column is then scaled so that its minimum value becomes zero and its maximum value becomes one.  If the data are range transformed, other normalization transformation requests will be ignored. }
  \item{main}{ an alternative plot title, see Details below. }
  \item{row.omits}{ permits rows, individuals, to be trimmed from the input matrix, the default \code{row.omits = NULL} is for no trimming.  To trim individuals enter their row numbers as a vector, e.g. \code{row.omits = c(13,15,16)}.  The vector may be extended by adding further row numbers so as to display the 2-d structure of the remaining core data and to reveal whether further multivariate outliers are present. }
  \item{setseed}{ optionally sets the random number seed for \code{fastICA} so that all runs result in the same projection, the default is not to set the seed.  To set the seed use \code{setseed = TRUE}.  With the seed set, the projection is generally similar to the Sammon projection on the \code{ilr} transformed Howarth - Sinding-Larsen data set. }
  \item{\dots}{ further arguments to be passed to methods concerning the generated plots.  For example, if smaller plotting characters are required, specify \code{cex = 0.8}; or if some colour other than black is required for the plotting characters, specify \code{col = 2} to obtain red (see \code{\link{display.lty}} for the default colour palette).  If it is required to make the plot title smaller, add \code{cex.main = 0.9} to reduce the font size by 10\%. }
}
\details{
If \code{main} is undefined a default plot title is generated by appending the input matrix name to the text string \code{"2-d Projection for: "}.  If no plot title is required set \code{main = " "}, or if a user defined plot title is required it should be defined in \code{main}, e.g., \code{main = "Plot Title Text"}.

If the input data matrix contains closed compositional, geochemical, data it is strongly recommended that an ilr transform be applied, \code{ifilr = TRUE}.  This has the effect of reducing the dimension of the data matrix from \code{p} to \code{(p-1)}.  Otherwise, it is desirable to normalize (centre and scale) the data, or to undertake a range transformation, to ensure the variables have equal \sQuote{weight} in the projections.  If no transformation is requested a warning message is displayed.

The x- and y-axis labels are set appropriately to indicate the type of 2-d projection in the display.

A measure of the \sQuote{stress} in generating the 2-d projection is estimated and displayed.  Low stress indicates that the projection faithfully represents the relative \sQuote{positions} of the data in the original \code{p}-space.
}
\value{
The following are returned as an object to be saved for further use:
  \item{main}{ the plot title. }
  \item{input}{ a text string containing the name of the \code{n} by \code{p} matrix containing the data, and a list of the row numbers of any individuals trimmed; if none are trimmed the entry is \code{NULL}. }
  \item{usage}{ The projection option selected, and the values, \code{TRUE} or \code{FALSE}, for the ilr, log, robust normalization, normalization, and range transformation options. }
  \item{xlab}{ the 2-d projection x-axis label. }
  \item{ylab}{ the 2-d projection y-axis label. }
  \item{matnames}{ the individual, sample, row identifiers and the names of the input variables.  If there are no individual, sample, row identifiers then row numbers are used.  If an ilr transform has been used the variable names will be the \code{(p-1)} synthetic ilr variable names.  If a trim has been executed only the row identifiers for the remaining data are stored. }
  \item{row.numbers}{ the row numbers of the individuals, samples, remaining after a trim.  If a trim has been executed only the row numbers for the remaining data are stored. }
  \item{x}{ the n x-axis values for the 2-d projection. }
  \item{y}{ the n y-axis values for the 2-d projection. }
  \item{stress}{ the estimated stress of fitting the 2-d projection to the \code{p}-space data. }
}
\note{
Any less than detection limit values represented by negative values, or zeros or other numeric codes representing blanks in the data, must be removed prior to executing this function, see \code{\link{ltdl.fix.df}}.

Any rows in the data matrix with \code{NA}s are removed prior to computing the 2-d projection.  In the instance of an ilr transformation \code{NA}s have to be removed prior to undertaking the transformation, see \code{\link{remove.na}}.

Repeated executions of the \sQuote{fastICA} implementation of Projection Pursuit can yield projections that are mirror images of one another unless \code{set.seed} is used to ensure each execution commences with the same seed, see argument \code{setseed}.

This function requires that packages MASS (Venables and Ripley) and fastICA (Marchini, Heaton and Ripley) both be available.
}
\references{
Cox, T.F. and Cox, M.A.A., 2001. Multidimensional Scaling. Chapman and Hall, 308 p.

Friedman, J.H. and Rafsky, L.C., 1981. Graphics for the multivariate two-sample problem.  Journal of the American Statistical Association, 76(374):277-291.

Hyvarinen, A. and Oja, E., 2000. Independent Component Analysis: Algorithms and Applications. Neural Networks, 13(4-5):411-430.

Reimann, C., Filzmoser, P., Garrett, R. and Dutter, R., 2008. Statistical Data Analysis Explained: Applied Environmental Statistics with R. John Wiley & Sons, Ltd., 362 p.

Venables, W.N. and Ripley, B.D., 2001. Modern Applied Statistics with S-Plus, 3rd Edition. Springer, 501 p.
}
\author{ Robert G. Garrett }
\seealso{ \code{\link{ltdl.fix.df}}, \code{\link{remove.na}}, \code{\link{gx.2dproj.plot}}, \code{\link{sammon}}, \code{\link{cmdscale}}, \code{\link{isoMDS}}, \code{\link{fastICA}}, \code{\link{set.seed}} }
\examples{
## Make test data available
data(sind.mat2open)

## Display default, Sammon non-linear map, 2-d projection
sind.2dproj <- gx.2dproj(sind.mat2open, ifilr = TRUE)
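
## A minimal sketch, relying only on the returned components documented in
## the Value section: list the saved components and show the estimated stress
names(sind.2dproj)
sind.2dproj$stress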

## Display saved object identifying input matrix row numbers (cex = 0.7),
## and with an alternate main title (cex.main = 0.8) 
gx.2dproj.plot(sind.2dproj, rowids = TRUE, cex = 0.7, cex.main = 0.8,
	main = "Howarth & Sinding-Larsen\nStream Sediment ilr Transformed Data")

## Display Kruskal's non-metric multidimensional scaling 2-d projection
sind.2dproj <- gx.2dproj(sind.mat2open, proc = "iso", ifilr = TRUE)

## Display saved object with rowids = FALSE (cex = 0.7),
## and with an alternate main title (cex.main = 0.8) 
gx.2dproj.plot(sind.2dproj, rowids = FALSE, cex = 0.7, cex.main = 0.8, 
	main = "Howarth & Sinding-Larsen\nStream Sediment ilr Transformed Data")

## Display default, Sammon non-linear map, 2-d projection, removing the three
## most extreme individuals
sind.2dproj.trim3 <- gx.2dproj(sind.mat2open, ifilr = TRUE, row.omits = c(13,15,16))
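
## A hedged sketch, not part of the original examples: display the Projection
## Pursuit 2-d projection using only the arguments documented above.  Setting
## setseed = TRUE is assumed to make repeated runs give the same projection;
## the object name sind.2dproj.ica is purely illustrative.  Requires fastICA.
sind.2dproj.ica <- gx.2dproj(sind.mat2open, proc = "ica", ifilr = TRUE, setseed = TRUE)
rm(sind.2dproj.ica)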

## Clean-up
rm(sind.2dproj)
rm(sind.2dproj.trim3)
}
\keyword{ multivariate }
\keyword{ hplot }

