This repository has been archived by the owner on Feb 1, 2019. It is now read-only.

Long term project milestones

  1. Identify soil carbon data sets of interest to community
  2. Harmonization scripts targeting the identified data sets
  3. Educational material for soil science community regarding data management and best practices

Versions (each with a DOI) will be released at major content development milestones: either a significant number of newly harmonized datasets (10+) or a substantial refinement of other content.

We always need:

  1. Copy editing for code and documentation
  2. Code review

Current TODO

  1. QA/QC scripts that identify likely duplicate values and flag likely outliers for different variables
  2. Guides for
    • data contribution
    • repository use for meta-analysis
    • how to hold a hackathon
  3. Scripts for input from and output to the MPI/Powell Center radiocarbon and fractionation data efforts
  4. Scripts outputting to ISRIC
  5. Transfer identified data sets from 'Issues' to a markdown table
  6. Function to fetch files
  7. Add filter to ingest scripts to only load certain variables
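As a rough illustration of the QA/QC item above, the sketch below flags likely duplicate records and likely outliers. It is a minimal sketch, not the project's implementation: the function names, the record layout (a list of dicts), and the choice of the 1.5 × IQR rule for outliers are all assumptions made for this example.

```python
from collections import Counter
from statistics import quantiles


def flag_duplicates(records, key_fields):
    """Return one boolean per record: True if its key appears more than once.

    `records` is a list of dicts; `key_fields` names the columns that
    together should identify a measurement (hypothetical layout).
    """
    keys = [tuple(r.get(f) for f in key_fields) for r in records]
    counts = Counter(keys)
    return [counts[k] > 1 for k in keys]


def flag_outliers(values, k=1.5):
    """Return one boolean per value: True if it lies outside the IQR fence.

    The fence is [Q1 - k*IQR, Q3 + k*IQR]. Missing values (None) are
    never flagged, and fewer than four values yields no flags.
    """
    present = [v for v in values if v is not None]
    if len(present) < 4:
        return [False] * len(values)
    q1, _, q3 = quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v is not None and (v < low or v > high) for v in values]
```

Flags are returned rather than applied so that a reviewer can inspect the marked rows before anything is dropped; a different outlier rule may suit some variables better.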

One year goal (2018)

  • Establish working protocols for project contributions that follow open source community best practices
  • 10 new datasets identified and harmonized by members of the community
  • 2 output format scripts developed that tie into other projects
  • Develop hackathon/workshop material
  • First DOI for project repository
    • needs: script to remove duplicates, script to flag bad value ranges, output script, input script
  • Identify interested funders
  • Evaluate possible migration to SQL or other database
  • Transfer ownership of repository to ISCN organizational account
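The "script to flag bad value ranges" listed among the needs for the first DOI could look something like the following. This is only a sketch under stated assumptions: the variable names and the limits in `PLAUSIBLE_RANGES` are hypothetical placeholders, and real limits would have to be agreed on by the community.

```python
# Hypothetical variable names and limits, for illustration only.
PLAUSIBLE_RANGES = {
    "soc_percent": (0.0, 100.0),         # a percentage cannot leave 0-100
    "bulk_density_g_cm3": (0.0, 2.65),   # illustrative upper bound
}


def flag_bad_ranges(record, ranges=PLAUSIBLE_RANGES):
    """Return the names of variables in `record` outside their range.

    Variables that are missing or None are skipped, not flagged.
    """
    bad = []
    for name, (low, high) in ranges.items():
        value = record.get(name)
        if value is not None and not (low <= value <= high):
            bad.append(name)
    return bad
```

Keeping the limits in a single table means they can be reviewed and versioned alongside the harmonization scripts.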

Five year goal (2018-2023)

  • Continue harmonizing 12 new datasets per year
  • Transfer variables of interest to ISRIC
  • Online lesson on data management, contributing to the project, and using the data product for meta-analysis
  • Formal ontology based on data ingestion keys to automate common data ingestions
  • Use of harmonization package in 1 scientific product a year by general community
  • Best practices for data rescue from non-machine-readable sources
  • Funding for project coordinators and community events

Ten year goal (2018-2028)

  • Expand backend to deal with high-dimensional data (spectral, 'omics, and high-resolution temporal)
  • Funding for project coordinators, community events, and key contributors