% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/user_functions.R
\name{simulate_evolData}
\alias{simulate_evolData}
\title{Simulate Data Evolution along a Tree}
\usage{
simulate_evolData(
  infoStr = NULL,
  rootData = NULL,
  tree = NULL,
  params = NULL,
  dt = 0.01,
  CFTP = FALSE,
  CFTP_step_limit = 327680000,
  n_rep = 1,
  only_tip = TRUE
)
}
\arguments{
\item{infoStr}{A data frame containing the columns 'n' for the number of sites and 'globalState' for the favoured global methylation state.
If customized initial equilibrium frequencies are given, it also contains the columns 'u_eqFreq', 'p_eqFreq', and 'm_eqFreq'
with the equilibrium frequency values for unmethylated, partially methylated, and methylated CpGs.}

\item{rootData}{The \code{$data} element of the output of simulate_initialData(). It represents the initial data at the root of the evolutionary tree.}

\item{tree}{A string in Newick format representing the evolutionary tree.}

\item{params}{Optional data frame with specific parameter values,
structured as the output of get_parameterValues(). If not provided, default values are used.}

\item{dt}{Length of time step for Gillespie's Tau-Leap Approximation (default is 0.01).}

\item{CFTP}{Logical, default FALSE. If TRUE, the CFTP algorithm is called to set the root state according to the model equilibrium (note that the current implementation neglects the IWE process).}

\item{CFTP_step_limit}{When CFTP = TRUE, the maximum number of steps before an approximation method is applied
(default 327680000, corresponding to a CFTP info size of approximately 6.1 GB).}

\item{n_rep}{Number of replicates to simulate (default is 1).}

\item{only_tip}{Logical indicating whether to extract data only for tips (default is TRUE, FALSE to extract the information for all the tree branches).}
}
\value{
A list containing the parameters used (\code{$params}), the length of the time step used for Gillespie's tau-leap approximation (\code{$dt}, default 0.01), the tree used (\code{$tree}),
and the simulated data (\code{$data}). The structure of \code{$data} depends on only_tip:

\itemize{
\item If only_tip is TRUE: In \code{$data}, each list element corresponds to a simulation replicate.
Each replicate includes one list per tree tip, each containing:
\itemize{
\item The name of each tip in the simulated tree (e.g. replicate 2, tip 1: \code{$data[[2]][[1]]$name}).
\item A list with the sequence of methylation states for each tip-specific structure (e.g. replicate 1, tip 2, 3rd structure: \code{$data[[1]][[2]]$seq[[3]]}).
The methylation states are encoded as 0 for unmethylated, 0.5 for partially methylated, and 1 for methylated.
}
\item If only_tip is FALSE, \code{$data} contains 2 lists:
\itemize{
\item \code{$data$branchInTree}: a list in which each element describes the relationship of one branch to the other branches:
\itemize{
\item Index of the parent branch (e.g. branch 2: \code{$data$branchInTree[[2]]$parent_index})
\item Index(es) of the offspring branch(es) (e.g. branch 1, the root: \code{$data$branchInTree[[1]]$offspring_index})
}
\item \code{$data$sim_data}: A list containing simulated data. Each list element corresponds to a simulation replicate.
Each replicate includes one list per tree branch, each containing:
\itemize{
\item The name of each branch in the simulated tree. It's NULL for the tree root and inner nodes, and the name of the tips for the tree tips.
(e.g. replicate 2, branch 1: \code{$data$sim_data[[2]][[1]]$name})
\item Information on the IWE events on that branch. It is NULL for the tree root and FALSE for branches on which no IWE event was sampled.
Otherwise it is a list containing \code{$islands}, with the index(es) of the island structure(s) that went through the IWE event, and \code{$times},
with the branch time point(s) at which the IWE was sampled.
(e.g. replicate 1, branch 3: \code{$data$sim_data[[1]][[3]]$IWE})
\item A list with the sequence of methylation states for each structure (the index of the list corresponds to the index
of the structures). The methylation states are encoded as 0 for unmethylated, 0.5 for partially methylated, and 1 for methylated.
(e.g. replicate 3, branch 2, structure 1: \code{$data$sim_data[[3]][[2]]$seq[[1]]})
\item A list with the methylation equilibrium frequencies for each structure (the index of the list corresponds to the index
of the structures). Each structure has a vector with 3 values, the first one corresponding to the frequency of unmethylated,
the second one to the frequency of partially methylated, and the third one to the frequency of methylated CpGs.
(e.g. replicate 3, branch 2, structure 1: \code{$data$sim_data[[3]][[2]]$eqFreqs[[1]]})
}
}
}
}
\description{
This function simulates methylation data evolution along a tree, either by simulating data at the root of the provided evolutionary tree
(if infoStr is given) or by using pre-existing data at the root (if rootData is given), and then letting the data evolve along the tree.
}
\examples{
# Example data
infoStr <- data.frame(n = c(10, 100, 10), globalState = c("M", "U", "M"))

# Simulate data evolution along a tree with default parameters
simulate_evolData(infoStr = infoStr, tree = "(A:0.1,B:0.1);")
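
# A minimal sketch of simulating several replicates and reading the tip-level
# output. It reuses infoStr from above; the element names follow the Value section.
res <- simulate_evolData(infoStr = infoStr, tree = "(A:0.1,B:0.1);", n_rep = 2)
res$data[[2]][[1]]$name      # name of tip 1 in replicate 2
res$data[[1]][[2]]$seq[[3]]  # states (0, 0.5 or 1) of structure 3, tip 2, replicate 1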

# Simulate data evolution along a tree with custom parameters
custom_params <- get_parameterValues()
custom_params$iota <- 0.5
simulate_evolData(infoStr = infoStr, tree = "(A:0.1,B:0.1);", params = custom_params)
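
# A sketch of keeping the data for all branches (only_tip = FALSE).
# Indexing follows the Value section; for this two-tip tree, branch 1 is
# assumed to be the root and branches 2 and 3 are assumed to be the tips.
res_all <- simulate_evolData(infoStr = infoStr, tree = "(A:0.1,B:0.1);", only_tip = FALSE)
res_all$data$branchInTree[[1]]$offspring_index  # offspring of the root branch
res_all$data$branchInTree[[2]]$parent_index     # parent of branch 2
res_all$data$sim_data[[1]][[3]]$IWE             # IWE information, replicate 1, branch 3
res_all$data$sim_data[[1]][[2]]$eqFreqs[[1]]    # equilibrium frequencies, structure 1

# A sketch of evolving pre-existing root data instead of passing infoStr.
# Assumption: simulate_initialData() accepts infoStr and returns a list with $data.
root <- simulate_initialData(infoStr = infoStr)$data
simulate_evolData(rootData = root, tree = "(A:0.1,B:0.1);")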

}
