RBerkeley
BerkeleyDB is a NoSQL database that implements an efficient, reliable key-value store with guarantees such as atomic transactions.
The RBerkeley package provided an R interface to BerkeleyDB but it was removed from CRAN in 2017.
It would be useful to have BerkeleyDB available through CRAN in order to support applications/packages/algorithms that require disk-based storage. For example, PeakSegDisk used to depend on BerkeleyDB STL, which provides an easy-to-use API for on-disk STL containers. PeakSegDisk provides an on-disk implementation of an optimal changepoint detection algorithm for genomic data, which scales to huge data sets because it is not limited by memory. However, it was a real pain to get PeakSegDisk to compile on CRAN/win-builder, because they do not provide BerkeleyDB headers/libraries. It would have been great to be able to simply write LinkingTo: RBerkeley in the PeakSegDisk DESCRIPTION file, and be done. Instead, it was easier to just re-write the required functionality in standard C++. The moral of the story is that R needs a package that provides BerkeleyDB.
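To make the idea of an on-disk container concrete, here is a minimal sketch in standard C++ of a vector of doubles that lives in a binary file rather than in memory. It is only an illustration under stated assumptions: the `DiskVector` class, its methods, and `demo_sum` are hypothetical names invented for this example. They are not the code PeakSegDisk actually uses, and they are not part of the BerkeleyDB STL API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (NOT PeakSegDisk's or BerkeleyDB's code): an
// append-only array of doubles stored in a binary file instead of RAM.
class DiskVector {
public:
  explicit DiskVector(const std::string& path) : path_(path), size_(0) {
    // Create (or truncate) the backing file first.
    { std::ofstream create(path_, std::ios::binary | std::ios::trunc); }
    // Reopen it for random-access reading and writing.
    file_.open(path_, std::ios::in | std::ios::out | std::ios::binary);
    if (!file_) throw std::runtime_error("cannot open " + path_);
  }
  ~DiskVector() {
    file_.close();
    std::remove(path_.c_str());  // the backing file is temporary storage
  }
  // Append one value at the end of the file.
  void push_back(double value) {
    file_.seekp(static_cast<std::streamoff>(size_ * sizeof(double)));
    file_.write(reinterpret_cast<const char*>(&value), sizeof(double));
    if (!file_) throw std::runtime_error("write failed");
    ++size_;
  }
  // Read element i back from disk; only one double is held in memory.
  double at(std::size_t i) {
    if (i >= size_) throw std::out_of_range("index out of range");
    double value = 0.0;
    file_.seekg(static_cast<std::streamoff>(i * sizeof(double)));
    file_.read(reinterpret_cast<char*>(&value), sizeof(double));
    if (!file_) throw std::runtime_error("read failed");
    return value;
  }
  std::size_t size() const { return size_; }

private:
  std::string path_;
  std::fstream file_;
  std::size_t size_;
};

// Usage sketch: store the values 1..1000 on disk, then sum them back.
double demo_sum(const std::string& path) {
  DiskVector v(path);
  for (int i = 1; i <= 1000; ++i) v.push_back(static_cast<double>(i));
  assert(v.size() == 1000);
  double total = 0.0;
  for (std::size_t i = 0; i < v.size(); ++i) total += v.at(i);
  return total;
}
```

A hand-rolled container like this covers only fixed-size records and offers none of the transactional guarantees of BerkeleyDB, which is part of why a package exposing the real library would still be valuable.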
- address the issues that made CRAN remove RBerkeley.
- set up a git repo with CI / code coverage.
- either fork https://github.com/hrbrmstr/RBerkeley or continue development there if possible.
- add C code to support the BerkeleyDB Standard Template Library (STL) API (references: blog, API docs).
- write more tests to increase code coverage.
- you can use this old version https://github.com/tdhock/PeakSegDisk/commit/190ce1c5e7774f27c38304e43a74cb0d860686c5 of PeakSegDisk as an example of how RBerkeley should/could be used – make a new repo with this code, add LinkingTo: RBerkeley, and use it for testing.
- add a vignette explaining how to use RBerkeley in R and in C++.
After this GSOC project the RBerkeley package will be back on CRAN, and package developers will be able to build algorithms/functions that take advantage of this powerful library.
Students, please contact mentors below after completing at least one of the tests below.
- Bob Rudis <bob@rud.is> is the maintainer of the last version that was on CRAN, and has agreed to mentor.
- Jeff Ryan <jeff.a.ryan@gmail.com> is the original author and can mentor.
- Toby Hocking <toby.hocking@r-project.org> proposed this project, and would be a user of RBerkeley if it were on CRAN.
Students, please do one or more of the following tests before contacting the mentors above.
- Easy: download the most recent version of RBerkeley and create an Rmd/html page showing R code and results of how to use RBerkeley.
- Medium:
- Hard:
Students, please post a link to your test results here.
Name: Basil Singh
University: Indian Institute of Technology Kanpur (India)
Degree: B.S. Economic Sciences
Solution for easy test: Solution Corresponding HTML page
Name: Abhinav Agarwal
College: Manipal Institute of Technology
Degree: B.Tech in Information Technology
Link: Easy