Implementation of Weighted CRF Tagger (handling unbalanced datasets) #5676
Conversation
self.label_weights is now created as a parameter so that it will be moved to GPU whenever the model moves.
Hi @eraldoluis, thanks for this! I may not have time for a thorough review this week, but this will be a priority next week.
Thank you, @epwalsh!
Hey @eraldoluis, this looks really great! This is my first pass through (I still need to go through some of the code in more detail to get my head around it), but I just have some minor comments. In addition to the comments below, I will also say I think we should organize these modules into a common parent module. That is, create a new folder allennlp/modules/conditional_random_field/
and move the three implementations there.
if label_weights is None:
    raise ConfigurationError("label_weights must be given")

self.label_weights = torch.nn.Parameter(torch.Tensor(label_weights), requires_grad=False)
I think it might be better to use self.register_buffer()
here instead of defining the weights as a parameter. That way we can be sure the label weights aren't passed to the optimizer.
Done!
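To make the suggestion above concrete, here is a minimal sketch (the class and attribute names are illustrative, not AllenNLP's) contrasting a frozen Parameter with register_buffer(). Both move with the module on .to(device), but only the buffer is excluded from parameters() and therefore can never reach the optimizer:

```python
import torch

class WeightedModule(torch.nn.Module):
    """Illustrative sketch: frozen Parameter vs. registered buffer."""

    def __init__(self, label_weights):
        super().__init__()
        # Frozen parameter: still listed by named_parameters(), so it
        # would be handed to the optimizer along with everything else.
        self.weights_as_param = torch.nn.Parameter(
            torch.tensor(label_weights), requires_grad=False
        )
        # Buffer: excluded from named_parameters(), but still part of
        # state_dict() and moved by .to(device) / .cuda().
        self.register_buffer("label_weights", torch.tensor(label_weights))

m = WeightedModule([1.0, 2.0, 0.5])
param_names = dict(m.named_parameters())
buffer_names = dict(m.named_buffers())
```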
This module uses the "forward-backward" algorithm to compute
the log-likelihood of its inputs assuming a conditional random field model.

See, e.g. http://www.cs.columbia.edu/~mcollins/fb.pdf
Could you add something about the weighting strategy here?
Done!
This module uses the "forward-backward" algorithm to compute
the log-likelihood of its inputs assuming a conditional random field model.

See, e.g. http://www.cs.columbia.edu/~mcollins/fb.pdf
Should also have a note about the weighting strategy here.
Done!
This module uses the "forward-backward" algorithm to compute
the log-likelihood of its inputs assuming a conditional random field model.

See, e.g. http://www.cs.columbia.edu/~mcollins/fb.pdf
Also would be good to have a note about the weight strategy here.
Done!
VITERBI_DECODING = Tuple[List[int], float]  # a list of tags, and a viterbi score


def allowed_transitions(constraint_type: str, labels: Dict[int, str]) -> List[Tuple[int, int]]:
Is this any different from the same function in conditional_random_field.py?
I simply forgot to adapt this class. I think it is much better now.
return allowed


def is_transition_allowed(
Same question here: is this any different from the original?
Idem.
for i, j in constraints:
    constraint_mask[i, j] = 1.0

self._constraint_mask = torch.nn.Parameter(constraint_mask, requires_grad=False)
This should probably be a buffer as well, but I just realized this is how it's done in the original CRF module, so I guess I'm okay with this.
Ok, I didn't touch it, although I agree with you that this should be a buffer as well.
Simplified Lannoy implementation.
Thanks a lot, @epwalsh, for the effort you put into this. I tried to address your first concerns. Let me know what you think about my changes. I am looking forward to your feedback on the whole thing. Let me know if you have any questions; I will be happy to discuss this further if necessary.
Thanks for the quick responses/fixes! Changes look good. I should clarify what I meant by:
Looks like you left |
I liked your blog post a lot, by the way!
Renamed module allennlp.modules.conditional_random_field_weight to ...conditional_random_files
Yes, I was unsure at first, but now I renamed the module. I also updated it accordingly. Let me know what you think.
LGTM! I'll follow up with the PR in allennlp-models next.
Closes #4619.
Dependency of allennlp-models PR #341
Changes proposed in this pull request:
- Implemented the weighting strategies as modules allennlp/modules/conditional_random_field_<strategy>.py, where <strategy> can be: wemission, wtrans, or lannoy.
- Refactored the methods _input_likelihood(...) and _joint_likelihood(...) of the ConditionalRandomField class so that they now receive an argument with the transition weights. In that way, I could implement the two basic sample weighting strategies (wemission and wtrans) by just subclassing this class and weighting the corresponding values (logits and transitions) in the forward(...) method before calling _input_likelihood(...) and _joint_likelihood(...). No modification was necessary to the basic algorithms in these two methods.

Before submitting

- Followed the steps in the relevant section of the CONTRIBUTING docs.
- Updated docstrings following the Writing docstrings section of the CONTRIBUTING docs.

After submitting

- codecov/patch reports high test coverage (at least 90%). You can find this under the "Actions" tab of the pull request once the other checks have finished.
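For illustration, the emission-weighting idea described in the changes above (wemission: scale the logits by the gold label's weight before the unchanged likelihood computations) could be sketched as follows. The function name and tensor shapes are assumptions for this example, not the actual AllenNLP API:

```python
import torch

def weight_emissions(logits: torch.Tensor,
                     tags: torch.Tensor,
                     label_weights: torch.Tensor) -> torch.Tensor:
    """Scale each token's emission scores by the weight of its gold label.

    Hypothetical sketch of the wemission strategy: this weighting would
    happen in forward() before calling _input_likelihood(...) and
    _joint_likelihood(...), which stay unchanged.

    logits:        (batch, seq_len, num_tags) emission scores
    tags:          (batch, seq_len) gold label indices
    label_weights: (num_tags,) per-label weights
    """
    # label_weights[tags] has shape (batch, seq_len); broadcast over tags
    return logits * label_weights[tags].unsqueeze(-1)

# toy example: one sequence of two tokens, three tags, tag 2 weighted 5x
logits = torch.ones(1, 2, 3)
tags = torch.tensor([[0, 2]])
weights = torch.tensor([1.0, 1.0, 5.0])
weighted = weight_emissions(logits, tags, weights)
```

Up-weighting rare labels this way makes mistakes on those labels cost more in the loss, which is the point of the unbalanced-dataset handling this PR adds.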