diff --git a/README.md b/README.md
index 7eb76325..c754addf 100644
--- a/README.md
+++ b/README.md
@@ -211,6 +211,32 @@ Example usage:
 lines so you'll need to be careful to reconstruct this
 - `rails debug:saml['tmp/your_saml_to_debug.txt']`
 
+## Data Loading
+
+There are three types of data that get loaded into this system:
+
+### Database Seeds
+
+These should only be loaded when the application database is initially set up (e.g. for new PR/development deploys or if the staging database needs to be destroyed and recreated). These seeds contain default values for certain tables such as copyrights, licenses, hold sources, and degree types.
+
+The above seed data is loaded automatically during PR builds from Github. During local development it can be loaded during first deployment by running `rails db:seed`.
+
+Additionally, degrees and departments can be manually seeded from a CSV file if desired by running `rails db:seed_degrees ` and `rails db:seed_departments `, respectively. See Jira project documentation for link to a Google doc with the initial list of departments and degrees that were loaded into the production database (not maintained).
+
+Seed data is not maintained to match the production database values, which can be changed by admin users as needed. The production database *shouldn't* ever need to be reseeded.
+
+### QA/Stakeholder Testing Data
+
+We're working on a process to load test data for stakeholder testing/QA in an automated fashion. Note this is different from fixture data used for automated tests. Check back soon for more info!
+
+### Registrar Data
+
+Thesis and author data for each term is loaded from a CSV file downloaded from the Registrar. This process is handled manually in the UI by the thesis processing team, and they have their own documentation on how they obtain the right data to load.
+
+Loading registrar data may also add new degrees and departments, which are then manually updated and maintained by stakeholders.
+
+Note: if registrar data needs to be loaded in a local, PR, or staging deployment it should be anonymized first to ensure no protected user data is added to a non-secure database. The test fixtures (test/fixtures/files) include both full and small sample files containing anonymized registrar data that can be used for this purpose.
+
 ## Publishing workflow
 
 - stakeholders process theses until they are valid and accurate