% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/curate_swda_data.R
\name{curate_swda_data}
\alias{curate_swda_data}
\title{Curate SWDA data}
\usage{
curate_swda_data(dir_path)
}
\arguments{
\item{dir_path}{Character string. Path to an existing directory that
contains .utt files, either directly or in subdirectories.}
}
\value{
A data frame containing the curated SWDA data with columns:
\itemize{
\item \code{doc_id}: Document identifier
\item \code{damsl_tag}: Dialog act annotation
\item \code{speaker_id}: Unique speaker identifier
\item \code{speaker}: Speaker designation (A or B)
\item \code{turn_num}: Turn number in conversation
\item \code{utterance_num}: Utterance number
\item \code{utterance_text}: Transcribed text of the utterance
}
}
\description{
Processes and curates Switchboard Dialog Act (SWDA) data by reading all
.utt files in a specified directory and combining them into a structured
data frame.
}
\details{
The function expects a directory containing .utt files, or subdirectories
containing .utt files, as found in the raw SWDA data
(Linguistic Data Consortium, LDC97S62: Switchboard Dialog Act Corpus).
}
\examples{
# Example using simulated data bundled with the package
example_data <- system.file("extdata", "simul_swda", package = "qtkit")
swda_data <- curate_swda_data(example_data)

str(swda_data)
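
# A minimal sketch, not part of the function's interface: list the .utt
# files under the example directory (including subdirectories), which are
# assumed to be the files the function reads, then tabulate the dialog act
# tags in the curated result. `utt_files` is a name introduced here for
# illustration only.
utt_files <- list.files(example_data, pattern = "[.]utt$", recursive = TRUE)
length(utt_files)

table(swda_data$damsl_tag)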

}
