This repository contains a collection of R scripts for testing packages that conduct data quality assessments or provide features useful in individual phases of a data quality evaluation.
Each folder in the scripts+reports/ directory corresponds to one R package; the scripts in it use SHIP-based example data and metadata to generate a sample report.
Reviewed R packages (in alphabetical order):
- assertable
- assertive
- assertr
- clickR
- DataExplorer
- dataquieR
- dataReporter
- DescTools
- dlookr
- DQAstats
- ExPanDaR
- explore
- funModeling
- inspectdf
- IPDFileCheck
- MOQA
- mStats
- observer
- pointblank
- sanityTracker
- skimr
- SmartEDA
- StatMeasures
- summarytools
- testdat
- validate
- visdat
- xray
J. Mariño, E. Kasbohm, S. Struckmann, L.A. Kapsner, and C.O. Schmidt, R Packages for Data Quality Assessments and Data Monitoring: A Software Scoping Review with Recommendations for Future Developments, Applied Sciences (2022) 26. doi:10.3390/app12094238.
@article{
title = {R {{Packages}} for {{Data Quality Assessments}} and {{Data Monitoring}}: {{A Software Scoping Review}} with {{Recommendations}} for {{Future Developments}}},
author = {Mari{\~n}o, Joany and Kasbohm, Elisa and Struckmann, Stephan and Kapsner, Lorenz A and Schmidt, Carsten O},
year = {2022},
journal = {Applied Sciences},
pages = {26},
doi = {10.3390/app12094238},
langid = {english}
}
Suggestions for and improvements to this collection are very welcome and can be submitted as issues or pull requests here on GitHub, or by e-mail.