% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/lassie.R
\name{lassie}
\alias{lassie}
\title{Local Association Measures}
\usage{
lassie(x, select, continuous, breaks, measure = "z", default_breaks = 4)
}
\arguments{
\item{x}{data.frame or matrix.}

\item{select}{optional vector of column numbers or column names
specifying a subset of data to be used. By default, uses all columns.}

\item{continuous}{optional vector of column numbers or column names specifying
continuous variables that should be discretized.
By default, assumes that every variable is categorical.}

\item{breaks}{numeric vector or list passed on to \code{\link[base]{cut}} to discretize
continuous variables. When a numeric vector is specified, the same break points are
applied to all continuous variables. To specify variable-specific breaks, use a
named list: names are column names (not numbers) and values are the breaks
for that column. If a continuous variable has no specified breaks,
\code{default_breaks} is applied.}

\item{measure}{name of measure to be used:
\itemize{
\item 'z': Ducher's Z.
\item 'pmi': pointwise mutual information.
\item 'npmi': normalized pointwise mutual information.
}}

\item{default_breaks}{default break points used for discretization when
\code{breaks} does not cover a continuous variable.
Same syntax as in \code{\link[base]{cut}}.}
}
\value{
An instance of S3 \link[base]{class} \code{\link[zebu]{lassie}} with
  the following objects:
  \itemize{
  \item data: raw and preprocessed data.frames (see \link[zebu]{preprocess}).
  \item prob: probability arrays (see \link[zebu]{estimate_prob}).
  \item global: global association (see \link[zebu]{local_association}).
  \item local: local association arrays (see \link[zebu]{local_association}).
  \item lassie_params: parameters used in lassie.
  }
}
\description{
Estimates local (and global) association measures: Ducher's Z, pointwise mutual
information, and normalized pointwise mutual information.
}
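\details{
As a reference sketch only, the two information-theoretic measures have the
following standard textbook definitions. The logarithm base and the exact
probability estimators used by \code{lassie} are not stated on this page and
are left open here. For two events \eqn{x} and \eqn{y} with joint probability
\eqn{p(x, y)} and marginal probabilities \eqn{p(x)} and \eqn{p(y)}:

\deqn{pmi(x, y) = \log \frac{p(x, y)}{p(x) p(y)}}{pmi(x, y) = log( p(x, y) / (p(x) p(y)) )}

\deqn{npmi(x, y) = \frac{pmi(x, y)}{-\log p(x, y)}}{npmi(x, y) = pmi(x, y) / ( -log p(x, y) )}

Both are zero when \eqn{x} and \eqn{y} are independent, positive when they
co-occur more often than expected under independence, and negative when they
co-occur less often. Normalization bounds npmi between -1 and 1. Ducher's Z
likewise compares \eqn{p(x, y)} with \eqn{p(x) p(y)}, but rescales the
difference rather than the log-ratio.
}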
\examples{
# In this example, we will use the 'mtcars' dataset

# Selecting a subset of mtcars.
# Takes column names or numbers.
# If nothing was specified, all variables would have been used.
select <- c('mpg', 'cyl') # or select <- c(1, 2)

# Specifying 'mpg' as a continuous variable using its column name
# Takes column names or numbers.
# If nothing was specified, all variables would have been treated as categorical.
continuous <- 'mpg' # or continuous <- 1

# How should breaks be specified?
# Specifying equal-width discretization with 5 bins for all continuous variables ('mpg')
# breaks <- 5

# Specifying user-defined breakpoints for all continuous variables.
# breaks <- c(10, 15, 25, 30)

# Same thing but only for 'mpg'.
# Here both notations are equivalent because 'mpg' is the only continuous variable.
# The list notation is useful if you wish to specify different break points for different variables.
# breaks <- list('mpg' = 5)
# breaks <- list('mpg' = c(10, 15, 25, 30))

# Calling lassie
# Not specifying breaks means that the value in default_breaks (4) will be used.
las <- lassie(mtcars, select = select, continuous = continuous)

# Print local association to console as an array
print(las)

# Print local association and probabilities
# Here only rows having a positive local association are printed
# The data.frame is also sorted by observed probability
print(las, type = 'df', range = c(0, 1), what_sort = 'obs')

# Plot results as heatmap
plot(las)

# Plot observed probabilities using different colours
plot(las, what_x = 'obs', low = 'white', mid = 'grey', high = 'black', text_colour = 'red')

# Write results to text file
write.lassie(las, file = 'test.csv')

# Retrieve results
lassie_df <- read.table('test.csv', sep = ',', header = TRUE)
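
# A minimal base-R sketch (not zebu's implementation) of how pointwise mutual
# information can be computed by hand for the same two variables.
# Assumptions: natural logarithm, plain relative frequencies as probability
# estimates, and 4 equal-width bins for 'mpg'; lassie's own estimates may differ.
mpg_bin <- cut(mtcars$mpg, breaks = 4)
joint <- prop.table(table(mpg = mpg_bin, cyl = mtcars$cyl)) # p(x, y)
expected <- outer(rowSums(joint), colSums(joint))           # p(x) p(y)
pmi_by_hand <- log(joint / expected)      # -Inf where a combination never occurs
npmi_by_hand <- pmi_by_hand / -log(joint) # NaN where a combination never occurs
round(pmi_by_hand, 2)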

}
\seealso{
Results can be visualized using the \code{\link[zebu]{plot.lassie}} and
\code{\link[zebu]{print.lassie}} methods.
\code{\link[zebu]{plot.lassie}} is only available in the bivariate case and
returns a tile plot representing the probability or local association measure matrix.
\code{\link[zebu]{print.lassie}} shows an array or a data.frame.

Results can be saved using \code{\link[zebu]{write.lassie}}.

The \code{\link[zebu]{permtest}} function assesses the significance of local and global
association values using p-values estimated by permutations.

The \code{\link[zebu]{subgroups}} function identifies whether the
association between variables depends on the value of another variable.
}
