% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/subsample_data.r
\name{get_subsample_definitions}
\alias{get_subsample_definitions}
\title{Retrieve descriptive data for samples from literature}
\usage{
get_subsample_definitions(country = NULL, loadtype = "t", species = "PCAB")
}
\arguments{
\item{country}{Either the number of desired samples, a named numeric
vector of relative subsample sizes (where the names can be abbreviations of
country names), or a character vector of country abbreviations.}

\item{loadtype}{Can be either \code{"be"} for "bending edgewise" or
\code{"t"} for "tension".}

\item{species}{A species code according to EN 13556:2003. Currently, only
'PCAB' (\emph{Picea abies} = Norway spruce) is supported.}
}
\value{
A data frame with country and subsample names,
relative subsample sizes and some meta-information like project and
literature references, as well as the mean and standard deviation of
strength, static modulus of elasticity, and density.
}
\description{
In the \code{WoodSimulatR} package, means and standard deviations of grade
determining properties (GDPs) for a number of Norway spruce
(\emph{Picea abies}) samples from literature are stored for use in
\code{\link{simulate_dataset}}. They are indexed by a two-letter country code
(and a suffixed number if disambiguation is required).
}
\details{
The descriptive data can also be accessed directly
(\code{\link{gdp_data}}).
The present function is meant to prepare the data
as input to the \code{subsets} argument of \code{\link{simulate_dataset}}.
It allows picking multiple samples from the same country (disambiguating by
creating appropriately named entries in the column \code{subsample}) and
creating random sample data (uniformly distributed within the
range of values given in the full dataset \code{\link{gdp_data}}
for the respective \code{loadtype} and \code{species}) for sample names not
found in this dataset.

The dataset \code{\link{gdp_data}} contains a column \code{share} which gives
the number of pieces in the original sample. Unless relative subsample sizes
are explicitly asked for by providing a named numeric vector for the
argument \code{country}, the present function always resets \code{share} to
1, prompting \code{\link{simulate_dataset}} to create
(approximately) equal-sized subsamples.

The GDPs depend on the type of destructive testing done
(\code{loadtype}) -- therefore, giving the proper \code{loadtype} is
required for realistic values.

If \code{country} is \code{NULL} (or omitted), the full dataset
\code{\link{gdp_data}} for the
respective \code{loadtype} (and \code{species}) is returned.

For sample names not contained in the internal list, a warning is issued
and random sample data is returned (uniformly distributed within the
range of values given in the full table for the respective \code{loadtype}
and \code{species}).

If \code{country} is just a number (and \emph{not} a named vector), random
sample data is also returned; the different "countries" are then named "C1",
"C2" and so on.
}
\section{Notes}{


The GDP values collected in \code{\link{gdp_data}} were selected from
publications which aimed at representative sampling within the respective
countries.
All the same, care must be taken when using these values,
due to the naturally high variability of timber properties.
}

\examples{

# get all subsample data for loadtype tension (the default), or bending
get_subsample_definitions()
get_subsample_definitions(loadtype='be')

# get six random samples, explicitly state loadtype tension
get_subsample_definitions(country=6, loadtype='t')

# get subsample data for the German tension sample in different ways
get_subsample_definitions(country='de', loadtype='t')
get_subsample_definitions(country=c(de=1), loadtype='t')
get_subsample_definitions(country=c(de=6), loadtype='t')

# bending samples from Sweden (both samples), Poland, and France, equally
# weighted (loadtype must be given explicitly, as the default is tension)
get_subsample_definitions(c('se', 'se_1', 'pl', 'fr'), loadtype='be')
get_subsample_definitions(c(se=1, se_1=1, pl=1, fr=1), loadtype='be')
get_subsample_definitions(c(se=5, se_1=5, pl=5, fr=5), loadtype='be')

# four tension samples from Romania, two from Ukraine and one from Slovakia,
# weighted so that each country contributes equally
get_subsample_definitions(c(ro=1, ro=1, ro=1, ro=1, ua=2, ua=2, sk=4), loadtype='t')
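
# The weighting above can be checked in base R, independently of the package:
# summing the relative sizes by name gives the same total (4) for every
# country. The vector `w` is only an illustrative copy of the argument above.
w <- c(ro=1, ro=1, ro=1, ro=1, ua=2, ua=2, sk=4)
tapply(w, names(w), sum)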

# non-existent subsample names get replaced by random values (which are based
# on the range of stored values for the respective loadtype)
get_subsample_definitions(c('xx', 'yy', 'zz'))
get_subsample_definitions(c('xx', 'yy', 'zz'), loadtype='t')

# subsample names are case-sensitive!
get_subsample_definitions(c('at', 'aT', 'At', 'AT'), loadtype='t')
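
# A minimal sketch of the intended use: pass the result on to the `subsets`
# argument of simulate_dataset(). All other arguments of simulate_dataset()
# are left at their defaults; this assumes that its default loadtype is also
# tension, matching the subsample definitions requested here.
ssd <- get_subsample_definitions(c('at', 'de'), loadtype='t')
sim <- simulate_dataset(subsets=ssd)
head(sim)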

}
