Parse non-tabular grammar information #21

radomirbosak · 2016-11-20T19:58:58Z

>>> python3 -c "import duden; print(duden.get('Kragen').grammar_raw)"
[]

duden.de, however, has this text in the grammar section:

der Kragen; Genitiv: des Kragens, Plural: die Kragen, süddeutsch, österreichisch, schweizerisch: Krägen

The script shouldn't omit this. However, it is not clear in which form it should present the information in this rare case of a non-table-like format.
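One possible shape for the result, sketched only as an illustration: split the text at the semicolon into a head part and a list of (label, value) pairs, treating comma-separated chunks without a colon as qualifiers of the next labelled chunk. The function name `parse_grammar_text` and the returned dictionary keys are hypothetical and not part of the duden package.

```python
def parse_grammar_text(text):
    """Split a duden.de-style grammar line into a head and labelled forms.

    Hypothetical helper, not part of the duden package.
    """
    head, _, rest = text.partition(";")
    forms = []
    pending = []  # chunks without a colon, waiting for a labelled chunk
    for chunk in rest.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        if ":" in chunk:
            label, _, value = chunk.partition(":")
            pending.append(label.strip())
            # Qualifiers such as "süddeutsch, österreichisch" are joined
            # into the label of the chunk that finally carries a value.
            forms.append((", ".join(pending), value.strip()))
            pending = []
        else:
            pending.append(chunk)
    # Anything left over had no "label: value" form and is kept as-is.
    return {"head": head.strip(), "forms": forms, "trailing": pending}


kragen = parse_grammar_text(
    "der Kragen; Genitiv: des Kragens, Plural: die Kragen, "
    "süddeutsch, österreichisch, schweizerisch: Krägen"
)
print(kragen["head"])
print(kragen["forms"])
```

This keeps the regional plural as its own entry rather than attaching it to "Plural", which is a simplification; text without any "label: value" pairs ends up entirely in `trailing`.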


radomirbosak · 2018-04-29T19:10:55Z

The word Mönch uses a similar format for grammar.

der Mönch; Genitiv: des Mönch[e]s, Plural: die Mönche

radomirbosak · 2018-04-29T19:15:37Z

Verbs like schwindeln also use a non-tabular grammar format.

schwaches Verb; Perfektbildung mit »hat«

We should first decide what the function should return for words like these.

radomirbosak · 2018-08-02T18:43:51Z

This bug is still valid in 0.10.0.

radomirbosak · 2018-08-03T09:25:52Z

The word Bereich has the same problem: the section doesn't contain any tables, just the text der, selten: das Bereich; Genitiv: des Bereich[e]s, Plural: die Bereiche.

Before this commit, grammar_raw returned None if no '.grammatik' element was found. This led to a TypeError in 'Word.grammar' and seemed unintuitive, since in other cases an empty list is returned (see GH-21). Now grammar_raw returns an empty list if no grammar section is found.
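A minimal sketch of why the empty list is the friendlier return value. The function `grammar_rows` and its `section` argument are hypothetical stand-ins for the lookup of the '.grammatik' element, not the actual code from the commit.

```python
def grammar_rows(section):
    """Return the grammar rows, or an empty list when no section exists.

    `section` is a hypothetical stand-in for the parsed '.grammatik'
    element: None when it was not found, otherwise an iterable of rows.
    """
    if section is None:
        return []
    return list(section)


def count_rows(rows):
    # Any caller that iterates or calls len() breaks on None.
    return len(rows)


# With None as the "nothing found" value, callers raise TypeError.
try:
    count_rows(None)
    raised = False
except TypeError:
    raised = True

# With an empty list, the same caller works without a special case.
empty_count = count_rows(grammar_rows(None))
```

Callers can then treat "no grammar section" and "section without tables" the same way, without checking for None first.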

radomirbosak · 2021-10-06T16:05:57Z

The word Meme is also a bit special since it has multiple spellings. This will complicate grammar parsing.

radomirbosak · 2022-09-21T00:17:04Z

A related attribute, grammar_overview, was added in #168. However, it does not parse the text data there.

radomirbosak · 2022-09-21T00:19:32Z

The above new attribute should be enough and I won't be implementing more detailed parsing of a highly variable text attribute. I will close this as out of scope.

radomirbosak added the bug label Nov 20, 2016

radomirbosak added the good first issue label Aug 3, 2018

radomirbosak removed the good first issue label Aug 3, 2018

radomirbosak added enhancement and removed bug labels Jun 14, 2020

radomirbosak changed the title ~~Word 'Kragen' doesn't display any grammar information in v0.7.0~~ Parse non-tabular grammar information Jun 14, 2020

radomirbosak removed the bug label Jun 14, 2020

pajowu mentioned this issue Jan 9, 2021

Handle words without grammar information #120

Merged

radomirbosak added the help wanted label Oct 3, 2021

radomirbosak closed this as completed Sep 21, 2022
