OpenRefine (formerly Google Refine) is a popular open source data cleaning tool. rrefine lets users programmatically trigger data transfer between R and OpenRefine. Using the functions in this package, you can import data into, export data from, apply data cleaning operations to, or delete an OpenRefine project directly from R. Client libraries for automating OpenRefine tasks already exist for Python, Node.js and Ruby; rrefine extends this functionality to R users.
The development version of rrefine is available on GitHub and can be installed via devtools:
# install.packages("devtools")
devtools::install_github("vpnagraj/rrefine")
library(rrefine)
rrefine is also available on CRAN:
install.packages("rrefine")
library(rrefine)
The package includes the following functionality to interface with OpenRefine projects:
refine_upload(): Upload data to a project
refine_export(): Export data from a project
refine_delete(): Delete a project
refine_metadata(): Retrieve metadata from all projects
refine_project_summary(): Get project summary data
refine_operations(): Apply arbitrary operations to a project
refine_remove_column(): Remove a column from a project
refine_add_column(): Add a column to a project
refine_rename_column(): Rename an existing column in a project
refine_move_column(): Move a column to a new index
refine_transform(): Apply arbitrary text transformations
refine_to_lower(): Coerce text to lowercase
refine_to_upper(): Coerce text to uppercase
refine_to_title(): Coerce text to title case
refine_to_null(): Set values to NULL
refine_to_empty(): Set text values to empty string ("")
refine_to_text(): Coerce value to string
refine_to_number(): Coerce value to numeric
refine_to_date(): Coerce value to date
refine_trim_whitespace(): Remove leading and trailing whitespaces
refine_collapse_whitespace(): Collapse consecutive whitespaces to single whitespace
refine_unescape_html(): Unescape HTML in string

Descriptions and examples of usage are available in the package manual and vignette.
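As a rough illustration of how the core functions fit together, the sketch below uploads a small CSV, exports it back into R, and then deletes the project. It assumes an OpenRefine instance is already running locally at its default address, and that the argument names used here (file, project.name, open.browser) match the installed version of rrefine; the project name "rrefine_demo" and the sample data are made up for the example. Check the package manual for the exact signatures.

```r
library(rrefine)

# Write a small, made-up data frame to a temporary CSV file
tmp <- tempfile(fileext = ".csv")
demo <- data.frame(name = c(" Alice", "BOB "), stringsAsFactors = FALSE)
write.csv(demo, tmp, row.names = FALSE)

# Upload the file as a new OpenRefine project
# (assumes OpenRefine is running locally; argument names are assumptions)
refine_upload(file = tmp, project.name = "rrefine_demo", open.browser = FALSE)

# Pull the project contents back into R
cleaned <- refine_export(project.name = "rrefine_demo")

# Remove the project from OpenRefine once it is no longer needed
refine_delete(project.name = "rrefine_demo")
```

Any cleaning steps, whether done interactively in OpenRefine or through the transformation helpers listed above, would go between the upload and export calls.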
Feature requests, bug reports or other questions should be directed to the issue queue.