RDataTracker: Collecting Provenance in an Interactive Scripting Environment

Barbara Staudt Lerner
Computer Science Department
Mt. Holyoke College
blerner@mtholyoke.edu

Emery Boose
Harvard Forest
Harvard University
boose@fas.harvard.edu

Abstract

Scientific data provenance is often cited as a valuable tool that scientists can use to document their data collection and analysis processes, improving the understanding and sharing of data and results. However, most software that supports data provenance requires scientists to adopt new technologies rather than adding these capabilities to the technologies that scientists already use. In this paper, we introduce RDataTracker, an R library that collects data provenance from executed R scripts.