This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Separate start/end token check for source and target tokenizer #308

JohnGiorgi (Contributor) commented Nov 1, 2021

Closes allenai/allennlp#5451 by separating the check for the start/end tokens in Seq2SeqDatasetReader into two independent checks, one for the source tokenizer, and one for the target tokenizer.
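
To illustrate the idea, here is a minimal sketch of two independent checks. It is not the code from this PR: tokenizers are modeled as plain callables that return lists of strings rather than AllenNLP Tokenizer objects, and the names check_start_end_tokens, validate_tokenizers, whitespace_tokenize, character_tokenize, and the two boolean flags are hypothetical, chosen only for this example.

```python
from typing import Callable, List

# Hypothetical stand-in for a tokenizer: any callable from text to string tokens.
Tokenizer = Callable[[str], List[str]]


def check_start_end_tokens(start_symbol: str, end_symbol: str, tokenize: Tokenizer) -> None:
    """Raise ValueError if `tokenize` splits or alters the start/end symbols."""
    tokens = tokenize(f"{start_symbol} {end_symbol}")
    if tokens != [start_symbol, end_symbol]:
        raise ValueError(
            f"Tokenizer does not keep '{start_symbol}' and '{end_symbol}' intact; "
            f"got {tokens!r}"
        )


def validate_tokenizers(
    source_tokenize: Tokenizer,
    target_tokenize: Tokenizer,
    start_symbol: str = "@start@",
    end_symbol: str = "@end@",
    source_adds_symbols: bool = True,
    target_adds_symbols: bool = True,
) -> None:
    """Check each tokenizer on its own, and only if that side adds the symbols."""
    if source_adds_symbols:
        check_start_end_tokens(start_symbol, end_symbol, source_tokenize)
    if target_adds_symbols:
        check_start_end_tokens(start_symbol, end_symbol, target_tokenize)


def whitespace_tokenize(text: str) -> List[str]:
    # Toy tokenizer that keeps the symbols intact.
    return text.split()


def character_tokenize(text: str) -> List[str]:
    # Toy tokenizer that breaks the symbols into single characters.
    return [char for char in text if not char.isspace()]
```

With these toy tokenizers, validate_tokenizers(whitespace_tokenize, character_tokenize) raises a ValueError because of the target side, whereas a check that looked only at the source tokenizer would have let the mismatch through.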

JohnGiorgi (Contributor, Author) commented

@epwalsh

epwalsh (Member) left a comment

LGTM!

epwalsh merged commit 5ff0f79 into allenai:main Nov 1, 2021
JohnGiorgi deleted the break-start-end-token-check-into-source-and-target branch November 1, 2021 23:49

Successfully merging this pull request may close these issues.

Check for bad start or end symbol in Seq2SeqDatasetReader considers only source tokenizer
2 participants