June 2023 hackathon priorities

We will hold a mostly in-person hackathon in Boston on June 7 from 12 to 4 PM. The goal is to get close to a 0.6 release that can be completed within a week or so of the hackathon (if not the same day). This page documents each person's priorities for the hackathon. I will make some suggestions here, but feel free to move items around.

High priority

TM - Update Docker models and implement new general models
~~TM - Get arbitrary multiple task/multiple dataset invocations working (#137, #138)~~
AD - DAPT implementation
EG - Add a flag so that --do_eval can print gold labels in a separate column of the predictions file for easier error analysis (no issue yet)
~~WZ - Implement --do_predict to write outputs for all task types (no issue yet)~~
SW - Create a script for transforming Label Studio export outputs into cnlpt input format for MDRs

Medium priority

~~TM - Clean up train_system (move arguments into separate file)~~
~~TM - Add cnlpt version to model config~~
TM - Check version number when reading models and do something intelligent
AD - Warned dataclasses (#46)
DX - Finalize concept normalization (#3)
EG - Use pipelines? (#60)
WZ - Model selection score argument (#96)

Low priority

TM, EG - Add generation as a task mode to allow encoder-decoder models
