emilio-berti/gateway-database

Pipeline

To run the whole pipeline at once:

bash pipeline.sh

This runs (in order):

  1. Clean v.1.0 for encoding/parsing errors: scripts/clean-gateway.
  2. Process new data to add to the database: scripts/mulder.R and scripts/tagus.R.
  3. Extract species names: scripts/extract-species-names.R.
  4. Query the species names against GBIF: harmonize-taxonomy.py.
  5. Combine v.1.0 with new data: scripts/combine.R.
  6. Harmonize taxonomy of the database: harmonize-taxonomy.R.
  7. Save the new database as gateway-v.2.0.csv.
  8. Create summary tables to be displayed on the website: scripts/summarize.R.
  9. Display some summary statistics on the terminal.

Some of these steps can take a while. To avoid re-running completed steps, a hidden (empty) file is added to the steps folder once a step completes successfully. Steps that have such a file are not re-run. To re-run the whole pipeline from scratch, specify the option --clean:

bash pipeline.sh --clean

To see available options and usage: bash pipeline.sh --help.
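
The marker-file mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of pipeline.sh: the function names run_step and clean_steps, the STEPS_DIR variable, and the marker naming scheme are all assumptions made for the example, and the step command is a placeholder echo rather than a real script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the checkpoint logic; names are illustrative
# and are not taken from pipeline.sh.
set -euo pipefail

STEPS_DIR="${STEPS_DIR:-steps}"   # folder holding the hidden marker files (assumed name)

# run_step NAME COMMAND [ARGS...]
# Runs COMMAND unless a marker for NAME exists; writes the marker only on success.
run_step() {
  local name="$1"; shift
  local marker="${STEPS_DIR}/.${name}"
  if [[ -e "$marker" ]]; then
    echo "skip: ${name}"
    return 0
  fi
  "$@"              # with `set -e`, a failing command aborts before the marker is written
  touch "$marker"
}

# clean_steps removes every hidden marker so that all steps run again.
clean_steps() {
  mkdir -p "$STEPS_DIR"
  find "$STEPS_DIR" -maxdepth 1 -type f -name '.*' -delete
}

mkdir -p "$STEPS_DIR"
if [[ "${1:-}" == "--clean" ]]; then
  clean_steps
fi

# Placeholder command; a real step would call e.g. Rscript on one of the scripts.
run_step combine echo "running combine"
run_step combine echo "running combine"   # skipped: the marker now exists
```

Writing the marker only after the command succeeds means an interrupted or failed step is retried on the next run, while completed steps are skipped.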
