- chore: updating semantic release config (b30ea37)
- feat: remove various transformer warnings and fix training documentation (#25) (6f2e1d1, co-authored by Curtis Ruck <ruckc@DESKTOP-ME5SH6R>)
- chore: adding torch 1.13 to dev deps to help CI run tests (8abff7b)
- chore: updating docs to discourage multiple sentences per string (e77c5a2)
- chore: adding an integration test for strings with unk chars (1c21345)
- feat: removing logger config for release (7947f8b)
- Add ability to set which gpu to use (#23) (c62aa07)
  - updated handling of the 'use_gpu' option to allow specifying which GPU index to use (see the sketch below)
  - added error handling with logging messages to the FrameSemanticTransformer constructor
  - fixed the previous update so the format string is set only on the local logger, not the root logger
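A minimal sketch of the new option, assuming the public constructor accepts `use_gpu` as described in #23; the int-index form, model size, and result attributes are my assumptions from the release notes, not verified against the source:

```python
from frame_semantic_transformer import FrameSemanticTransformer

# per #23, use_gpu can now select a specific device index instead of
# just toggling GPU use on or off (assumed int overload)
transformer = FrameSemanticTransformer("base", use_gpu=1)  # run on cuda:1

result = transformer.detect_frames("The hallway smelt of boiled cabbage.")
print(result.frames)  # attribute name assumed from the library's README
```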
- chore: updating bibtex (0e4af39)
- chore: pin poetry v1.4.0 so furo will install in CI (24dc8c7)
- chore: pin poetry v1.4.0 so furo will install in CI (f8ca051)
- chore: add CITATION.cff (223cfce)
- chore: adding citation bibtex info to README (b5435fa)
- fix: Align trigger-marked sentence to original sentence (#19) (6683a22, co-authored by Jacob Striebel <striebel@users.noreply.github.com>)
  - trigger identification task alignment fix
  - black formatting
  - updated test cases
- fix: auto-download omw-1.4 for inference (343906c)
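omw-1.4 (Open Multilingual Wordnet) is required by NLTK's WordNet machinery at inference time; the fix makes the download happen automatically. The equivalent manual step:

```python
import nltk

# the library now runs this automatically before inference;
# downloads are cached, so repeated calls are cheap no-ops
nltk.download("omw-1.4")
```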
- feat: new models trained on FrameNet exemplars (#18) (3f937fb)
  - include exemplars in framenet training, skipping invalid trigger exemplars; exemplars are skipped by default during training
  - improve data augmentations: add more augmentations from nlpaug, fix the keyboard augmentation and add more checks to it, add a safety check to the uppercase augmentation, and repeatedly tune augmentation rates and probabilities (see the sketch below)
  - add more info when augmentations fail validation
  - ensure wordnet download for inference
  - add option to delete non-optimal models as training progresses
  - fix tests, linting, and a type import; update snapshots; remove debugging output
  - update models and the README with new model stats
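The extra augmentations referenced above come from the nlpaug package; a minimal sketch of a character-level keyboard augmentation of the kind tuned here (the rates are illustrative, not the values used in training):

```python
import nlpaug.augmenter.char as nac

# simulate realistic keyboard typos at a low rate so that trigger
# words stay recoverable after augmentation
aug = nac.KeyboardAug(aug_char_p=0.1, aug_word_p=0.1)
print(aug.augment("The hallway smelt of boiled cabbage"))
```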
- chore: serialize eval logs before writing json (40983ff)
- chore: fixing missing links for readthedocs (fbca04e)
- chore: explicitly installing furo in readthedocs (a581825)
- chore: try using python 3.9 for readthedocs build (55c712f)
- chore: manually install torch in readthedocs (1ad58a9)
- chore: setting up docs page with sphinx and readthedocs; ignore docs for flake8 linting (#17) (8d3c5fd)
- chore: create outputs dir when logging if it doesn't exist (7629560)
- chore: adding option to log eval failures (f68fb61)
- feat: Propbank (#16) (4c53887)
  - set up propbank loaders; skip light verbs for now (first via a workaround, then by simply ignoring non-existent frames)
  - use lu normalization similar to framenet
  - switch to propbank 3.4 instead of 3.1 and fix the propbank nltk paths
  - add option to resume from checkpoint and optional LR decay (see the sketch below)
  - remove debugging prints; fix a test and a typo
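Training here runs on PyTorch Lightning (the lightning_logs directory appears later in this changelog), so resuming and LR decay plausibly map onto Lightning's built-ins. A minimal sketch under that assumption; the module below is a hypothetical stand-in, not the project's trainer:

```python
import pytorch_lightning as pl
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


class TinyModule(pl.LightningModule):
    """Hypothetical stand-in for the project's T5 training module."""

    def __init__(self) -> None:
        super().__init__()
        self.layer = nn.Linear(4, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(self.parameters(), lr=1e-3)
        # optional LR decay: multiply the LR by gamma after every epoch
        scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
        return {"optimizer": optimizer, "lr_scheduler": scheduler}


loader = DataLoader(TensorDataset(torch.randn(32, 4), torch.randn(32, 1)))
trainer = pl.Trainer(max_epochs=2)
trainer.fit(TinyModule(), loader)
# resuming restores weights, optimizer state, and the LR scheduler:
# trainer.fit(TinyModule(), loader, ckpt_path="lightning_logs/.../last.ckpt")
```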
- add readthedocs link to README (bbeeed8)
- perf: minor performance improvements for arg extraction and frame detection (#15) (7e03969)
  - save the best model checkpoint based on val loss
  - add loader setup to training, plus more optional config for loggers/callbacks
  - add explicit logging of test/train/val loss at the end of epochs; revert to default PL logging behavior if no loggers are provided
  - add helpers for model evaluations
  - standardize arg extraction output and punctuation in extracted args
  - use the fast tokenizer for sentence cleanup, and run clean_up_tokenization just once before arg extraction rather than for each arg (see the sketch below)
  - fix a val_metrics error
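clean_up_tokenization is the stock Hugging Face post-processing pass that reattaches punctuation and contractions after decoding; the perf win is running it once over the whole sentence instead of once per extracted arg. A minimal sketch (the input string is illustrative):

```python
from transformers import T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-base")

decoded = "The boy 's dog does n't run ."
# one cleanup pass over the full decoded sentence, instead of
# calling it again for every extracted argument span
print(tokenizer.clean_up_tokenization(decoded))
# -> "The boy's dog doesn't run."
```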
- chore: remove poetry.lock from demo docker build (2f8ffb9)
- fix: fixing errors when no frames are found (#14) (ef2424c)
- feat: adding support for running inference on multiple sentences in batches (#11) (e6423e5) (see the sketch below)
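A minimal sketch of the batched inference added in #11; the method name detect_frames_bulk and the result attributes are my assumptions from the release notes, so check the README for the exact API:

```python
from frame_semantic_transformer import FrameSemanticTransformer

transformer = FrameSemanticTransformer("base")

sentences = [
    "The hallway smelt of boiled cabbage.",
    "I tried to buy a house last year.",
]
# batching amortizes tokenization and generation across sentences
# (method and attribute names assumed, not verified against the source)
for result in transformer.detect_frames_bulk(sentences):
    print(result.sentence, [frame.name for frame in result.frames])
```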
- chore: fixing README badge after shields.io breaking change (bd90fef)
- feat: Multilingual training refactor (#10) (7bf7ae5)
  - WIP refactoring to make it easier to train on different framenets
  - making evaluate runnable directly to evaluate pretrained models
  - refactoring training / eval scripts; tweaking tests
  - add validation that loaders match the model
  - updating README; cleaning up typing
  - use 3.8 for CI; updating semantic release
- fix: updating README stats (76e4e75)
- feat: Frame classification hints (#3) (201ed51)
  - adding in lexical unit data for smarter frame classification, with stemming for lu handling
  - allow skipping validation in initial epochs for faster training (using self.current_epoch instead of batch_idx)
  - using bigrams to reduce the number of frame suggestions; refactoring the bigrams logic, fixing a bug with trigger bigrams, and adding more tests (see the sketch below)
  - updating README and the model revision
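A minimal sketch of the bigram idea described above: index lexical-unit data by stemmed trigger bigrams so classification only considers frames evoked by words seen near the trigger. Every name here is hypothetical, not the library's internals:

```python
from collections import defaultdict

# hypothetical index: stemmed (prev_word, trigger) bigram -> candidate frames
bigram_to_frames: dict[tuple[str, str], set[str]] = defaultdict(set)


def register_lexical_unit(bigram: tuple[str, str], frame: str) -> None:
    bigram_to_frames[bigram].add(frame)


def suggest_frames(prev_stem: str, trigger_stem: str) -> set[str]:
    """Only suggest frames whose lexical units match the trigger bigram."""
    return bigram_to_frames.get((prev_stem, trigger_stem), set())


register_lexical_unit(("boil", "cabbag"), "Food")  # stems from FrameNet LUs
print(suggest_frames("boil", "cabbag"))  # {'Food'}
print(suggest_frames("run", "cabbag"))   # set() -> fall back to all frames
```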
- fixing typo in demo server (a15ef6d)
- improving demo UI (#4): adding secret 'model' param to client (8bf3275)
- UI improvements for demo (69e85af)
- fix: make trimmed batch contiguous (#2) (21aee70)
- fix: add torch.no_grad() to batch trimming step (8b8a401) (see the sketch below)
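Both fixes touch the padding-trimming helper introduced in #1 (further down this changelog): trimming runs under torch.no_grad() so no autograd graph is built, and the sliced tensor is made contiguous before being handed to generation. A minimal sketch of the idea, not the library's exact helper:

```python
import torch


def trim_batch(input_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    """Drop trailing all-padding columns from a right-padded batch."""
    with torch.no_grad():  # trimming is bookkeeping, not part of the graph
        # keep only columns where at least one sequence has a real token
        keep = int((input_ids != pad_token_id).any(dim=0).sum())
        # slicing returns a view; .contiguous() avoids downstream errors
        return input_ids[:, :keep].contiguous()


batch = torch.tensor([[5, 6, 0, 0], [7, 0, 0, 0]])  # 0 = pad
print(trim_batch(batch, pad_token_id=0))  # tensor([[5, 6], [7, 0]])
```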
- fix: adding LICENSE into pypi description (c6e0a42)
- fix: adding README into pypi description (1b99551)
- chore: adding badges to README (ac00793)
- feat: adding a helper to trim unnecessary padding chars for faster training / generation (#1) (58e58a8)
- fix: reverting to older lock file for mypy (08f0c63)
- fix: relaxing transformers version req (57464a1)
- feat: autopublish (a0900ff)
- fix: pinning old version of semantic-release plugin (85f3a62)
- fix: adding fetch-depth 0 for release (6dab4e6)
- fix: autopublish to pypi (3591c27)
- restrict model revisions (9a36fc4)
- adding explanation about lightning_logs dir (7898bba)
- updating README and improving train script (94d7fac)
- Create LICENSE (f73fc0e)
- augment training samples dynamically during training (3b40f07)
- adding tests for chain_augmentations (57f24dc)
- adding write permissions to publish job (d4b3e22)
- try checkout v2 (7883323)
- try adding token to checkout action (dce1651)
- adding link to demo in README (f0b632b)
- fix node version in gh action (7db37cb)
- add an action to publish the website (182504f)
- augment data for train but not val or test (660919e)
- adding data augmentation (13de3f4)
- adding small size model and lazy-loading for the nltk + models (bda0ca8)
- adding a demo client using create-react-app (2a673d1)
- try restricting batch size to 2 to avoid excessive memory use (4f2e58c)
- try reducing to 1 thread to save memory (cfc4d21)
- adding cors support to flask (4d463cb) (see the sketch below)
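The demo server is a Flask app, and CORS support in Flask typically comes from the flask-cors extension; a minimal sketch assuming that is the mechanism used here (the route is illustrative):

```python
from flask import Flask, jsonify
from flask_cors import CORS

app = Flask(__name__)
# allow the create-react-app demo client (served from another origin)
# to call this API; restrict origins in production
CORS(app)


@app.route("/frames")
def frames():
    return jsonify({"status": "ok"})
```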
- increase gunicorn timeout (acd42a1)
- try adding poetry.lock to speed up docker build (86c9bcb)
- bump to trigger cloud run build (c56e5b4)
- bump to trigger cloud run build (a6d3094)
- remove poetry.lock from docker build (805adf4)
- adding a dockerized flask server for demo purposes (8f192ac)
- fixing typo (b7b9e34)
- adding a base FrameSemanticTransformer class to make it easy to parse sentences into frames (c0b78cf)
- refactoring TaskSample classes into Tasks and TaskSamples (667f85e)
- fixing tests (eab2f96)
- more efficient loading of frame elements from framenet (4655768)
- add a check for invalid output in eval (0e9eb88)
- add a check for invalid output in eval (81a829a)
- eval arg id similar to how sesame does it (73b2db9)
- try adding in all possible frame elements into task intro for argument extraction (94e3c89)
- updating frame id samples to be closer to how sesame does it (39376e5)
- fixing evaluate function to work with batched predictions (7ca93d1)
- force transformers v4.18.0 to keep mypy happy (ac2c6a6)
- using multiple predictions when evaluating frame id task (d28961c)
- fixing typo (197b93f)
- trying to add eval into training (d22080b)
- limiting task rebalancing ratio (451eb6c)
- adding in task mix balancing (ecde0b2)
- moving T5Tokenizer.cleanup into standardize_punct method (12ccf38)
- Trying out built-in clean up tokenization method (5f4a723)
- allow tweaking predict params in eval (a549041)
- tweaking trigger processing to hopefully be more amenable to how the tokenizer works (2103f4d)
- more readable eval print (726126d)
- adding option to print eval failures (d23bf6f)
- adding a punct standardization step (d08877e)
- fixing linting (9250636)
- tweaking frame id eval to match sesame logic (162f50f)
- removing sample from dataloader, as it appears to break things (0e1dee0)
- fixing trigger samples and adding tests (6f19673)
- adding logging statements inside training function (a28a706)
- refactoring based on simple-t5 (d89e5db)
- fixing evaluate typing (3cd3c33)
- fixing future annotations (16a227c)
- fixing bug in py 3.7 (ea953fc)
- refactoring and adding a target id task (37302bb)
- adding total to tqdm iteration (058a12c)
- fixing device issues (49a01f3)
- fixing typing (6012b09)
- more efficient eval processing (4bbee5c)
- add tqdm for eval progress (69d98ac)
- tweaking evaluate (0f3db83)
- adding evaluate / predict helpers (be08ec1)
- adding fulltext filenames from sesame for eval (081533b)
- removing validation loop end as well (1c5dfb2)
- removing return from training loop end (ace9621)
- adding in closure... (4b8e184)
- updating optimizer_step (ea79146)
- fixing typo (7aa0a49)
- moving dataset generation out of the tuner (ac9f90f)
- adding future annotations stuff (611a1a5)
- adding future annotations stuff (b6c8a53)
- setting up a model for training (abfc3d8)
- skipping confusing frames for now (1888b8c)
- adding helper for parsing examples from docs (8856b2e)
- fixing mypy (14e4832)
- fixing black formatting (a24193a)
- initial commit (4df6628)