update DataSet and DataSetLoader #97

Merged
merged 3 commits into from
Oct 22, 2018


choosewhatulike
Member

We can now use different DataSetLoader classes to load data from files and convert it into a DataSet.
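As a rough illustration of the load-then-convert idea, here is a minimal sketch. The names `SimpleDataSet` and `TokenTagLoader`, the `load`/`convert` methods, and the "token tag" line format are all hypothetical stand-ins chosen for this example; they are not the classes or signatures added in this PR.

```python
from typing import Iterable, List, Tuple

Sentence = Tuple[List[str], List[str]]


class SimpleDataSet:
    """Hypothetical stand-in for a DataSet: a list of (tokens, tags) instances."""

    def __init__(self) -> None:
        self.instances: List[Sentence] = []

    def append(self, tokens: List[str], tags: List[str]) -> None:
        self.instances.append((tokens, tags))

    def __len__(self) -> int:
        return len(self.instances)


class TokenTagLoader:
    """Hypothetical loader: one 'token tag' pair per line, blank line between sentences."""

    def load(self, lines: Iterable[str]) -> List[Sentence]:
        sentences: List[Sentence] = []
        tokens: List[str] = []
        tags: List[str] = []
        for line in lines:
            line = line.strip()
            if not line:
                # A blank line closes the current sentence, if there is one.
                if tokens:
                    sentences.append((tokens, tags))
                    tokens, tags = [], []
                continue
            token, tag = line.split()
            tokens.append(token)
            tags.append(tag)
        if tokens:
            # Flush the last sentence when the input does not end with a blank line.
            sentences.append((tokens, tags))
        return sentences

    def convert(self, sentences: List[Sentence]) -> SimpleDataSet:
        data_set = SimpleDataSet()
        for tokens, tags in sentences:
            data_set.append(tokens, tags)
        return data_set
```

A different file format would then only need a different loader with the same two methods, while the code consuming the data set stays unchanged.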

Contributor

FengZiYjun commented Oct 20, 2018

For sequence labeling, call set_target(truth=False) in training and set_target(truth=True) in testing.
This is because a sequence labeling model with a CRF computes the loss inside its forward() when truth is given, and the loss needs an intermediate result.
But the naming is quite weird.
Can we develop a better mechanism to handle this?

Currently,

If only x is given, output prediction.
x ----(forward)-----(viterbi)---> prediction

If x and ground truth are given, output loss.
x -----(forward)----(crf)-->  loss
truth
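
To make the current behavior concrete, here is a toy sketch of a forward() that switches between prediction and loss depending on whether truth is passed. `DualModeTagger` is a hypothetical name, and both the Viterbi and the CRF steps are replaced by trivial placeholders (per-position argmax and a negative sum of gold scores), so this only shows the dispatch, not a real CRF.

```python
from typing import List, Optional, Union


class DualModeTagger:
    """Toy model whose forward() returns a prediction or a loss depending on `truth`."""

    def _emissions(self, x: List[List[float]]) -> List[List[float]]:
        # Placeholder for the network body: the input already holds per-tag scores.
        return x

    def forward(
        self, x: List[List[float]], truth: Optional[List[int]] = None
    ) -> Union[List[int], float]:
        scores = self._emissions(x)
        if truth is None:
            # Placeholder for Viterbi decoding: independent argmax at each position.
            return [max(range(len(s)), key=s.__getitem__) for s in scores]
        # Placeholder for the CRF loss: negative sum of the gold-tag scores.
        return -sum(s[t] for s, t in zip(scores, truth))
```

The caller has to know which of the two return types it will get, which is why the data set has to decide whether truth is fed into forward().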

Before,

                                      truth
x ----(forward)----> y  ---(crf)----> y_t ------> loss
                    |---(viterbi)---> prediction

Maybe a tricky way,

x ---- (forward) ----->  y (record this value) -----(viterbi, still in forward)----> prediction
y + truth ----(crf, in loss method)---->  loss
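
A minimal pure-Python sketch of this idea, under stated assumptions: `RecordingCrfTagger`, `last_y`, and the separate `loss(truth)` method are hypothetical names, the network body is a placeholder that passes emission scores through, and the CRF is a plain linear-chain CRF without start/end transitions. It is not the model in this repository.

```python
import math
from typing import List, Optional


def log_sum_exp(values: List[float]) -> float:
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


class RecordingCrfTagger:
    def __init__(self, transitions: List[List[float]]) -> None:
        # transitions[i][j] is the score of moving from tag i to tag j.
        self.transitions = transitions
        self.last_y: Optional[List[List[float]]] = None

    def forward(self, x: List[List[float]]) -> List[int]:
        y = x  # placeholder network body: x already holds emission scores
        self.last_y = y  # record the intermediate result for loss()
        return self.viterbi(y)

    def viterbi(self, y: List[List[float]]) -> List[int]:
        n_tags = len(self.transitions)
        best = list(y[0])
        back: List[List[int]] = []
        for emit in y[1:]:
            prev, best, pointers = best, [], []
            for j in range(n_tags):
                # Best previous tag for ending in tag j at this position.
                i = max(range(n_tags), key=lambda k: prev[k] + self.transitions[k][j])
                pointers.append(i)
                best.append(prev[i] + self.transitions[i][j] + emit[j])
            back.append(pointers)
        tag = max(range(n_tags), key=best.__getitem__)
        path = [tag]
        for pointers in reversed(back):
            tag = pointers[tag]
            path.append(tag)
        return path[::-1]

    def loss(self, truth: List[int]) -> float:
        """CRF negative log-likelihood, computed from the y recorded by forward()."""
        if self.last_y is None:
            raise RuntimeError("call forward() before loss()")
        y = self.last_y
        n_tags = len(self.transitions)
        # Score of the gold tag sequence.
        gold = y[0][truth[0]]
        for t in range(1, len(y)):
            gold += self.transitions[truth[t - 1]][truth[t]] + y[t][truth[t]]
        # Log partition function via the forward algorithm.
        alpha = list(y[0])
        for emit in y[1:]:
            alpha = [
                log_sum_exp([alpha[i] + self.transitions[i][j] for i in range(n_tags)])
                + emit[j]
                for j in range(n_tags)
            ]
        return log_sum_exp(alpha) - gold
```

With this split, forward() always returns a prediction, and the trainer calls loss(truth) afterwards. The cost is hidden state: loss() is only valid for the most recent forward() call.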

Contributor

FengZiYjun left a comment

agree to merge


3 participants