Potential bug in rasa/data.py #5106
Labels
area:rasa-oss 🎡
Anything related to the open source Rasa framework
stale
type:bug 🐛
Inconsistencies or issues which will cause an issue or problem for users or implementors.
Comments
tmbo
added
area:rasa-oss 🎡
Anything related to the open source Rasa framework
type:bug 🐛
Inconsistencies or issues which will cause an issue or problem for users or implementors.
labels
Feb 4, 2020
tmbo
added a commit
that referenced
this issue
Feb 4, 2020
Thanks a lot for the suggestion, I've made the file loading more robust 👍
tmbo
added a commit
that referenced
this issue
Feb 10, 2020
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed due to inactivity. Please create a new issue if you need more help.
Hello Rasa team,
I have the following problem running Rasa, and consequently Rasa-X, on my setup. I was able to fix it in the Rasa code, but in Rasa-X it is not so straightforward, as it involves some fiddling with Docker files.
My setup:
I use a Linux workstation (Ubuntu 18.04.3 LTS (GNU/Linux 5.0.0-37-generic x86_64)) with Python 3.6 as my Rasa server as well as the Docker host for Rasa-X. I usually connect to this workstation remotely from my MacBook Pro (macOS 10.15.2), and I do all the editing (story files, domain files, etc.) on this Mac.
The problem:
Rasa has problems importing my story files. (Here is an excerpt of the logs from Rasa-X, as I'm currently trying to get Rasa-X working as shown in Ep #9 - Rasa Masterclass.)
rasa-x_1 | Job "GitService.run_background_synchronization (trigger: cron[minute='*'], next run at: 2020-01-22 09:48:00 UTC)" raised an exception
rasa-x_1 | Traceback (most recent call last):
rasa-x_1 | File "/usr/local/lib/python3.6/site-packages/apscheduler/executors/base.py", line 125, in run_job
rasa-x_1 | retval = job.func(*job.args, **job.kwargs)
rasa-x_1 | File "/usr/local/lib/python3.6/site-packages/rasax/community/services/git_service.py", line 742, in run_background_synchronization
rasa-x_1 | git_service.synchronize_project(force_data_injection)
rasa-x_1 | File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
rasa-x_1 | File "/usr/local/lib/python3.6/site-packages/rasax/community/services/git_service.py", line 599, in synchronize_project
rasa-x_1 | await self._inject_data()
rasa-x_1 | File "/usr/local/lib/python3.6/site-packages/rasax/community/services/git_service.py", line 629, in _inject_data
rasa-x_1 | str(self.repository_path()), str(data_path), self.session, SYSTEM_USER
rasa-x_1 | File "/usr/local/lib/python3.6/site-packages/rasax/community/initialise.py", line 299, in inject_files_from_disk
rasa-x_1 | story_files, nlu_files = rasa.data.get_core_nlu_files([data_path])
rasa-x_1 | File "/usr/local/lib/python3.6/site-packages/rasa/data.py", line 92, in get_core_nlu_files
rasa-x_1 | path
rasa-x_1 | File "/usr/local/lib/python3.6/site-packages/rasa/data.py", line 115, in _find_core_nlu_files_in_directory
rasa-x_1 | elif is_story_file(full_path):
rasa-x_1 | File "/usr/local/lib/python3.6/site-packages/rasa/data.py", line 153, in is_story_file
rasa-x_1 | _is_story_file = any(_contains_story_pattern(l) for l in f)
rasa-x_1 | File "/usr/local/lib/python3.6/site-packages/rasa/data.py", line 153, in <genexpr>
rasa-x_1 | _is_story_file = any(_contains_story_pattern(l) for l in f)
rasa-x_1 | File "/usr/local/lib/python3.6/codecs.py", line 321, in decode
rasa-x_1 | (result, consumed) = self._buffer_decode(data, self.errors, final)
rasa-x_1 | UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 37: invalid start byte
I assume that the file encoding from the Mac trips up the file loader on Linux.
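To illustrate the error in the traceback, here is a minimal reproduction sketch. It assumes, purely for illustration, that the story file contains a Latin-1 encoded degree sign (byte 0xb0); the actual content of the offending file is not known, and the file content and helper name below are hypothetical.

```python
import os
import tempfile

# Hypothetical story content: "°" encoded as Latin-1 becomes the single byte 0xb0,
# which is not a valid start byte in UTF-8.
raw = "## weather story\n* ask_temperature: it is 20°C\n".encode("latin-1")


def try_read_utf8(path):
    """Read a file strictly as UTF-8; return the text or the decoding error."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except UnicodeDecodeError as e:
        return e


with tempfile.NamedTemporaryFile(delete=False, suffix=".md") as tmp:
    tmp.write(raw)

result = try_read_utf8(tmp.name)
os.remove(tmp.name)

# Prints the exception type and a message of the same kind as in the log above.
print(type(result).__name__, result)
```

Any file containing a byte sequence that is not valid UTF-8 triggers the same exception, regardless of which editor or operating system produced it.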
My Fix for Rasa:
I was able to fix this for Rasa by adding the parameter `errors="surrogateescape"` to the `with open(...` line (line 152) in the function `is_story_file(file_path: Text) -> bool` in the file python3.6/site-packages/rasa/data.py:
`def is_story_file(file_path: Text) -> bool:
"""Checks if a file is a Rasa story file.
`
More about these error handlers in Python 3's text processing here: http://python-notes.curiousefficiency.org/en/latest/python3/text_file_processing.html
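As a rough sketch of what this fix does (not the actual Rasa source), the snippet below opens a file with `errors="surrogateescape"` so that undecodable bytes are mapped to lone surrogate code points instead of raising. The functions `looks_like_story_header` and `is_story_file_tolerant` are hypothetical stand-ins, and the file content is made up for the example.

```python
import os
import tempfile


def looks_like_story_header(line: str) -> bool:
    # Hypothetical stand-in for Rasa's internal story-pattern check.
    return line.lstrip().startswith("##")


def is_story_file_tolerant(file_path: str) -> bool:
    # errors="surrogateescape" maps each undecodable byte to a lone surrogate
    # (0xb0 becomes U+DCB0) instead of raising UnicodeDecodeError.
    with open(file_path, encoding="utf-8", errors="surrogateescape") as f:
        return any(looks_like_story_header(line) for line in f)


# Made-up file content with one byte (0xb0) that is invalid in UTF-8.
raw = b"temperature: 20\xb0C\n## happy path\n* greet\n  - utter_greet\n"

with tempfile.NamedTemporaryFile(delete=False, suffix=".md") as tmp:
    tmp.write(raw)
try:
    tolerant_result = is_story_file_tolerant(tmp.name)
finally:
    os.remove(tmp.name)

# The escaped bytes survive a round trip back to the original byte string.
decoded = raw.decode("utf-8", errors="surrogateescape")
round_trip = decoded.encode("utf-8", errors="surrogateescape")
print(tolerant_result, round_trip == raw)
```

One trade-off of this handler is that the decoded text contains lone surrogates, which cannot be encoded with a strict UTF-8 encoder later on, so it suits detection steps like this one better than general text processing.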
Conclusion:
From my point of view this looks like a bug in Rasa: the files saved on the MacBook look fine, even when opened in an editor on Linux. And even if a file were somehow invalid, Rasa should not crash with an exception but point out the error instead.
I would suggest that you add this additional parameter to the open function in rasa's data.py file to make rasa more robust. It would be great if you could also fix that in the docker files of Rasa-X.
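The alternative mentioned in the conclusion, pointing out the error instead of crashing, could look roughly like the sketch below. This is only an illustration under assumptions: `check_story_file` is a hypothetical helper, the `##` check is a simplified stand-in for Rasa's story detection, and none of this is taken from the Rasa code base.

```python
import logging
import os
import tempfile
from typing import Optional, Tuple

logger = logging.getLogger(__name__)


def check_story_file(file_path: str) -> Tuple[bool, Optional[str]]:
    """Return (is_story_file, problem); problem is None if the file decoded cleanly."""
    try:
        with open(file_path, encoding="utf-8") as f:
            # Simplified, hypothetical story detection: any line starting with "##".
            return any(line.lstrip().startswith("##") for line in f), None
    except UnicodeDecodeError as e:
        # Report the problem and skip the file instead of aborting the whole import.
        message = f"Skipping '{file_path}': file is not valid UTF-8 ({e})."
        logger.warning(message)
        return False, message


def _write_temp(data: bytes) -> str:
    # Helper for the demonstration only: writes bytes to a temporary file.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".md") as tmp:
        tmp.write(data)
    return tmp.name


good_path = _write_temp("## happy path\n* greet\n".encode("utf-8"))
bad_path = _write_temp(b"20\xb0C\n## happy path\n")

good = check_story_file(good_path)
bad = check_story_file(bad_path)

os.remove(good_path)
os.remove(bad_path)
print(good, bad[0])
```

Compared with `errors="surrogateescape"`, this variant refuses the file but tells the user which one is affected, so the encoding can be fixed at the source.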
Best regards,
Jochen