Crashes on certain words #146

VIEWVIEWVIEW · 2021-10-06T15:43:02Z

Example export:

$ python run_duden.py --export Meme > tests/test_data/Meme.yaml

Results in the following crash:

~/duden$ python run_duden.py --export Meme > tests/test_data/Meme.yaml
Traceback (most recent call last):
  File "run_duden.py", line 4, in <module>
    main()
  File "/home/w/duden/duden/cli.py", line 173, in main
    display_word(word, args)
  File "/home/w/duden/duden/cli.py", line 61, in display_word
    yaml_string = yaml.dump(word.export(),
  File "/home/w/duden/duden/word.py", line 298, in export
    worddict[attribute] = getattr(self, attribute, None)
  File "/home/w/duden/duden/word.py", line 61, in name
    name, _ = self.title.split(', ')
ValueError: too many values to unpack (expected 2)

The word "Meme" crashes because its title splits into more than two parts, which breaks the two-value unpacking in the word.name and word.article properties. This can be fixed by setting maxsplit in the split call:

name, _ = self.title.split(', ', 1)

https://docs.python.org/3/library/stdtypes.html#str.split
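
A minimal reproduction of the unpacking failure, as a sketch. It assumes the title string for this entry is "Meme, auch Mem, das"; that value is inferred from the traceback and not copied from the live page.

```python
# Assumed title string for the entry (inferred, not taken from the live page).
title = "Meme, auch Mem, das"

# Without maxsplit the string yields three parts,
# so unpacking into two names raises ValueError.
error = None
try:
    name, _ = title.split(', ')
except ValueError as exc:
    error = str(exc)

# With maxsplit=1 only the first separator is used,
# so exactly two parts come back and the unpacking succeeds.
name, rest = title.split(', ', 1)

print(error)
print(name)
print(rest)
```

Note that `rest` keeps everything after the first separator, so nothing is dropped, but it is no longer a single field.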

VIEWVIEWVIEW · 2021-10-06T15:43:06Z

There are two lines where this crash can occur:

duden/duden/word.py

Line 61 in edc0e36

name, _ = self.title.split(', ')

duden/duden/word.py

Line 96 in edc0e36

_, article = self.title.split(', ')

radomirbosak · 2021-10-06T15:48:06Z

Thank you for filing this issue. The word Meme is interesting because its title splits into three parts, whereas other words split into two.

VIEWVIEWVIEW · 2021-10-06T15:50:22Z

I guess the simple fix

_, article = self.title.split(', ', 1)

is not appropriate, since it would cut information off. What do you think?

radomirbosak · 2021-10-06T15:50:54Z

Right, setting maxsplit to 1 would solve the issue when getting name, but it wouldn't work for the article, which would be incorrectly determined as "auch Mem, das".
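
To make the problem concrete, here is a small sketch using an assumed title string of "Meme, auch Mem, das". Taking the first and last field of a full split is shown only as one possible string-based workaround; it is not something proposed in this thread.

```python
# Assumed title string for the entry (inferred, not taken from the live page).
title = "Meme, auch Mem, das"

# maxsplit=1 avoids the crash, but the "article" swallows the alternative spelling.
_, article_maxsplit = title.split(', ', 1)

# Hypothetical workaround: split fully, then take the first and last fields.
parts = title.split(', ')
name_from_parts = parts[0]
article_from_parts = parts[-1]

print(article_maxsplit)
print(name_from_parts, article_from_parts)
```

This still relies on the field order in the title and would misread a title that has no article at all.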

radomirbosak · 2021-10-06T15:51:24Z

Maybe a more robust way to get the name and article would be to locate the lemma__* spans.

radomirbosak · 2021-10-06T15:53:26Z

This could also be an opportunity to introduce a property like alternative_spellings or similar, which would return the list of all lemma__alt-spelling contents. (Although I'm not sure if there are words with 2 or more alternative spellings.)
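
A rough sketch of the span-based idea, using only the standard library. The sample markup, the class names lemma__main and lemma__determiner, the placement of "auch" outside the span, and the WordSketch class are all assumptions for illustration; only lemma__alt-spelling is named in this thread, and the real page and the library's own parsing code may look different.

```python
from html.parser import HTMLParser

# Hypothetical markup; the real page structure may differ.
SAMPLE_HTML = (
    '<h1><span class="lemma__main">Meme</span>, auch '
    '<span class="lemma__alt-spelling">Mem</span>, '
    '<span class="lemma__determiner">das</span></h1>'
)


class LemmaSpanParser(HTMLParser):
    """Collect the text of every <span> whose class starts with 'lemma__'."""

    def __init__(self):
        super().__init__()
        self.parts = {}        # class name -> list of text contents
        self._current = None   # lemma class of the span being read, if any

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        for cls in (dict(attrs).get("class") or "").split():
            if cls.startswith("lemma__"):
                self._current = cls
                self.parts.setdefault(cls, []).append("")

    def handle_endtag(self, tag):
        # Nested spans are not handled; this is enough for flat markup.
        if tag == "span":
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.parts[self._current][-1] += data


class WordSketch:
    """Hypothetical stand-in for the word class, built from title markup."""

    def __init__(self, html):
        parser = LemmaSpanParser()
        parser.feed(html)
        parser.close()
        self._parts = parser.parts

    def _first(self, cls):
        values = self._parts.get(cls, [])
        return values[0].strip() if values else None

    @property
    def name(self):
        return self._first("lemma__main")

    @property
    def article(self):
        return self._first("lemma__determiner")

    @property
    def alternative_spellings(self):
        # Returns a list so that words with several spellings also fit.
        return [v.strip() for v in self._parts.get("lemma__alt-spelling", [])]


word = WordSketch(SAMPLE_HTML)
print(word.name, word.article, word.alternative_spellings)
```

Because each property reads its own span, the number of comma-separated parts in the title no longer matters, and a word without any alternative spelling simply yields an empty list.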

radomirbosak added the bug and good first issue labels Oct 6, 2021

VIEWVIEWVIEW mentioned this issue Oct 7, 2021

Add "alternative_spellings" property + minor bug fix for words with alternative spellings in title #147

Merged

radomirbosak closed this as completed in a85055e Oct 11, 2021
