MarkdownWriter shouldn't dump pre-trained entities #4877
Comments
@akelad these entities aren't used for training of the CRF: #898. I think dumping them should be okay then, no? We explicitly removed this from Rasa X at some point ages ago: https://github.com/RasaHQ/nlu-trainer-backend/pull/42
@ricwo how does that work? We just exported the training data from the instance to use it in the repo, and it was exported with duckling entities annotated as CRF ones:
[screenshot]
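For context, a minimal sketch of how an export can end up looking like that, assuming the writer walks over every entity attached to a message without checking which component produced it. The `annotate_entities` helper and the example message below are illustrative, not the actual MarkdownWriter code:

```python
# Hypothetical sketch of an exporter that ignores the "extractor" field.
# The entity dicts follow the shape Rasa attaches to messages
# (start, end, value, entity, extractor), but this is not the real code.

def annotate_entities(text, entities):
    """Wrap every entity span in markdown-style [text](entity) annotation."""
    out, cursor = [], 0
    for e in sorted(entities, key=lambda ent: ent["start"]):
        out.append(text[cursor:e["start"]])
        out.append(f"[{text[e['start']:e['end']]}]({e['entity']})")
        cursor = e["end"]
    out.append(text[cursor:])
    return "".join(out)

message = "book a table for tomorrow"
entities = [
    # produced by DucklingHTTPExtractor, not by the CRF
    {"start": 17, "end": 25, "value": "2019-12-01T00:00:00.000Z",
     "entity": "time", "extractor": "DucklingHTTPExtractor"},
]

# Without filtering on "extractor", the duckling entity is written out
# as if it were a CRF annotation:
print(annotate_entities(message, entities))
# book a table for [tomorrow](time)
```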
Synonyms from duckling get dumped as well:
[screenshot]
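The synonym dumps would follow from duckling normalizing entity values: the stored value (e.g. an ISO timestamp) differs from the literal text span, so a writer that emits a synonym whenever the two differ will write one. A hedged sketch under that assumption; `maybe_synonym` is a hypothetical helper, and the old `## synonym:` markdown section format is assumed:

```python
# Hypothetical illustration: duckling stores a normalized value, so a
# "value differs from surface text" check produces a synonym entry for it.

def maybe_synonym(text, entity):
    surface = text[entity["start"]:entity["end"]]
    if entity["value"] != surface:
        # old-style markdown synonym section
        return f"## synonym:{entity['value']}\n- {surface}"
    return None

message = "book a table for tomorrow"
duckling_entity = {"start": 17, "end": 25, "entity": "time",
                   "value": "2019-12-01T00:00:00.000Z",
                   "extractor": "DucklingHTTPExtractor"}

print(maybe_synonym(message, duckling_entity))
# ## synonym:2019-12-01T00:00:00.000Z
# - tomorrow
```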
This does not look to be duckling-specific:
[screenshot]
Not sure if this is related, but it seems that the default case insensitivity leads to synonym dumps for entities that contain capital letters:
[screenshot]
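If that is what is happening, it would be the same value-versus-text check as above: a component that lowercases entity values leaves the surface text capitalized, so the two no longer match and a synonym gets written. A hypothetical illustration, not taken from the actual code:

```python
# Hypothetical: a lowercased value paired with a capitalized surface span
# trips the same "value differs from text" synonym logic.

message = "see you on Monday"
entity = {"start": 11, "end": 17, "entity": "day",
          "value": "monday"}  # value was lowercased somewhere in the pipeline

surface = message[entity["start"]:entity["end"]]   # "Monday"
if entity["value"] != surface:
    print(f"## synonym:{entity['value']}\n- {surface}")
# ## synonym:monday
# - Monday
```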
Original issue description:
I noticed this on Rasa X: if duckling or ner_spacy entities are recognised and you save the message, those entities get saved to the markdown file. These entities should be filtered out, since we don't want to train the CRF on them.
I'm not sure how we persist this training data to the database on Rasa X, but there might be something we need to change there too.
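A minimal sketch of the kind of filtering being asked for here, assuming each entity carries the name of the component that produced it in an `extractor` field. The `filter_trainable_entities` helper and the hard-coded component list are illustrative, not the fix that was actually merged:

```python
# Hypothetical filter: drop entities produced by pretrained extractors
# before the training data is written back to markdown.

PRETRAINED_EXTRACTORS = {"DucklingHTTPExtractor", "SpacyEntityExtractor"}

def filter_trainable_entities(entities):
    """Keep only entities the CRF should be trained on.

    Entities with no "extractor" field (plain human annotations) are kept;
    entities tagged with a pretrained extractor are dropped.
    """
    return [
        e for e in entities
        if e.get("extractor") not in PRETRAINED_EXTRACTORS
    ]

entities = [
    {"start": 0, "end": 6, "entity": "city", "value": "Berlin",
     "extractor": "CRFEntityExtractor"},
    {"start": 17, "end": 25, "entity": "time",
     "value": "2019-12-01T00:00:00.000Z",
     "extractor": "DucklingHTTPExtractor"},
]

print(filter_trainable_entities(entities))
# only the CRFEntityExtractor entity remains
```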