June 2023 hackathon priorities

Tim Miller edited this page Jun 2, 2023 · 11 revisions

We will hold a mostly in-person hackathon in Boston on June 7 from 12 to 4 PM. The goal is to get close to a 0.6 release that can be completed within a week or so of the hackathon (if not the same day). This page documents each person's priorities for the hackathon. I have made some suggestions below, but feel free to move items around.

High priority

  • TM - Update Docker models and implement new general models
  • TM - Get arbitrary multiple task/multiple dataset invocations working (#137, #138)
  • AD - DAPT implementation
  • EG - Add a flag so that --do_eval can print gold labels in a separate column of the predictions file for easier error analysis (no issue yet)
  • WZ - Implement --do_predict to write outputs for all task types (no issue yet)
  • SW - Create a script for transforming Label Studio export outputs into cnlpt input format for MDRs

Medium priority

  • TM - Clean up train_system (move arguments into separate file)
  • TM - Add cnlpt version to model config
  • TM - Check version number when reading models and do something intelligent
  • AD - Warned dataclasses (#46)
  • DX - Finalize concept normalization (#3)
  • EG - Use pipelines? (#60)
  • WZ - Model selection score argument (#96)

Low priority

  • TM, EG - Add generation as a task mode to allow encoder-decoder models