% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/SHAP_funcs.R
\name{shap.plot.summary}
\alias{shap.plot.summary}
\title{SHAP summary plot core function using the long-format SHAP values}
\usage{
shap.plot.summary(data_long, x_bound = NULL, dilute = FALSE,
  scientific = FALSE, my_format = NULL)
}
\arguments{
\item{data_long}{long-format data of SHAP values, as returned by \code{\link{shap.prep}}}

\item{x_bound}{use to set the x-axis limit, in case the default range needs to be restricted}

\item{dilute}{a number or logical, defaults to FALSE. If a number, plots \code{nrow(data_long)/dilute} data points;
for example, \code{dilute = 5} plots 1/5 of the data.
If \code{dilute = TRUE} or a number, at most half of the points per feature are plotted, so the plot won't be too slow.
If dilute is set too high, at least 10 points per feature are kept.
If the dataset is even smaller than that, all the data are plotted.}

\item{scientific}{whether to show the mean|SHAP| in scientific format.
Defaults to FALSE, in which case the label format is 0.000.
If TRUE, the label format is 0.0E-0.}

\item{my_format}{supply your own number format for the labels, if needed}
}
\value{
Returns a ggplot2 object, to which further layers can be added.
}
\description{
The summary plot (a sina plot) uses long-format data of SHAP values. The long-format data
can be obtained with \code{\link{shap.prep}}, from either an xgboost model or a SHAP matrix
such as the one returned by \code{\link{shap.values}}.
If you want to start from an xgboost model and data_X, use \code{\link{shap.plot.summary.wrap1}}.
If you want to use a self-derived SHAP matrix, use \code{\link{shap.plot.summary.wrap2}}.
If a global list named \strong{new_labels} is provided (\code{!is.null(new_labels)}),
the plots will use that list to replace the default labels in \code{\link{labels_within_package}}.
}
\examples{
data("iris")
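X1 <- as.matrix(iris[, -5])
# the label must be numeric, so the Species factor is converted to its integer codes
mod1 <- xgboost::xgboost(
  data = X1, label = as.numeric(iris$Species), gamma = 0, eta = 1,
  lambda = 0, nrounds = 1, verbose = FALSE)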


# shap.values(model, X_dataset) returns the SHAP
# data matrix and ranked features by mean|SHAP|
shap_values <- shap.values(xgb_model = mod1, X_train = X1)
shap_values$mean_shap_score
shap_values_iris <- shap_values$shap_score

# shap.prep() returns the long-format SHAP data, either from the model:
shap_long_iris <- shap.prep(xgb_model = mod1, X_train = X1)
# or, equivalently, from a given SHAP matrix (shap_contrib):
shap_long_iris <- shap.prep(shap_contrib = shap_values_iris, X_train = X1)

# **SHAP summary plot**
shap.plot.summary(shap_long_iris, scientific = TRUE)
shap.plot.summary(shap_long_iris, x_bound = 1.5, dilute = 10)
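
# A minimal sketch of adding a further layer to the returned plot:
# since the return value is a ggplot2 object, it can be extended with `+`.
# The name `p_summary` and the title text are illustrative assumptions,
# not part of the package.
p_summary <- shap.plot.summary(shap_long_iris, dilute = 10)
p_summary + ggplot2::ggtitle("SHAP summary for the iris example")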

# Alternative ways to make the same plot:
# option 1: from the xgboost model
shap.plot.summary.wrap1(mod1, X = as.matrix(iris[,-5]), top_n = 3)

# option 2: supply a self-made SHAP values dataset
# (e.g. sometimes as output from cross-validation)
shap.plot.summary.wrap2(shap_values_iris, X1, top_n = 3)
}
