% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ts_feature_correction.R
\name{plr_kmeans_test}
\alias{plr_kmeans_test}
\title{Statistical k-means Test}
\usage{
plr_kmeans_test(
  df,
  var_list,
  mean_ratio = 0.7,
  plot = FALSE,
  file_path,
  file_name,
  set_cutoff = FALSE
)
}
\arguments{
\item{df}{A dataframe containing PV data. It should be cleaned with \code{\link{plr_cleaning}}.}

\item{var_list}{A list of the dataframe's standard variable names, obtained from
the output of \code{\link{plr_variable_check}}.}

\item{mean_ratio}{Factor that scales the mean of the higher cluster before it is
compared with the mean of the lower cluster. Higher values make it more likely that
the lower cluster is identified as soiled; lower values make it less likely.
Values should range from 0 to 1. Defaults to 0.7.}

\item{plot}{optional; Boolean; whether to return the box plot that the method
generates to identify outliers. Defaults to FALSE.}

\item{file_path}{optional; location in which to save the boxplot if \code{plot} is set to TRUE.
It is not required for plotting itself, only if you wish to save the plot.}

\item{file_name}{optional; name of the file in which to save the boxplot if \code{plot} is set to TRUE.}

\item{set_cutoff}{Defaults to FALSE; pass a numeric value to cut off all slopes
less than that value. This bypasses the outlier and clustering calculations
entirely, to remove slope values you believe to be soiled.}
}
\value{
A dataframe containing the values that should be removed. To discard them
from your data, you can use, for example, \code{dplyr::filter()}.
}
\description{
The method builds a linear model for each day, identifies outliers,
and performs 2-means clustering on the slopes. If the mean of the lower
cluster is significantly less than the mean of the higher cluster, and the
lower cluster constitutes less than 25\% of the data, the lower cluster is
identified as soiled and returned. Otherwise, the outlier points are
identified as soiled and returned.
}
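\examples{
# A minimal base-R sketch of the steps in the description, on synthetic data.
# It is an illustration only, not this function's implementation. The column
# names (day, irradiance, power), the use of boxplot.stats() to find outliers,
# and the exact form of the mean comparison are assumptions.
set.seed(1)
slopes_true <- c(rep(1.0, 25), rep(0.5, 5))  # the last 5 days mimic soiling
toy <- do.call(rbind, lapply(seq_along(slopes_true), function(d) {
  irr <- seq(200, 1000, length.out = 20)
  data.frame(day = d, irradiance = irr,
             power = slopes_true[d] * irr + rnorm(20, sd = 5))
}))

# 1. Fit one linear model per day and keep its slope.
slopes <- sapply(split(toy, toy$day), function(x) {
  unname(coef(lm(power ~ irradiance, data = x))[2])
})

# 2. Flag box-plot outliers among the daily slopes.
outlier_days <- names(slopes)[is.element(slopes, boxplot.stats(slopes)$out)]

# 3. Split the slopes into two clusters.
km <- kmeans(slopes, centers = 2, nstart = 10)
low <- which.min(km$centers)
high <- which.max(km$centers)
low_share <- mean(km$cluster == low)

# 4. Return the lower cluster if its mean falls below the scaled higher mean
#    and it holds less than 25 percent of the days; otherwise the outliers.
mean_ratio <- 0.7
if (km$centers[low] < mean_ratio * km$centers[high] && low_share < 0.25) {
  soiled_days <- names(slopes)[km$cluster == low]
} else {
  soiled_days <- outlier_days
}
soiled_days

\dontrun{
# Calling the function itself, assuming df has been cleaned with
# plr_cleaning() and var_list comes from plr_variable_check().
soiled <- plr_kmeans_test(df, var_list, mean_ratio = 0.7)
}
}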
