pronouns: fetch list from the pronoun.is GitHub repo at start #2130

dgw · 2021-06-30T21:44:05Z

Description

We don't have to wait until the 12th of Never for pronoun.is to have an API. Parsing tab-separated values is easy, and according to their readme the file is already sorted by approximate frequency.
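As a rough illustration of the parsing step (not the code from this PR), here is a minimal sketch. It assumes each line of the file holds five tab-separated forms (subject, object, possessive determiner, possessive pronoun, reflexive); `SAMPLE_TSV` and `parse_pronoun_sets` are hypothetical names, and the sample rows are made up for the example rather than copied from the real file.

```python
# Hypothetical sample in the assumed layout: five tab-separated forms per
# line, most common set first. Not copied from the real pronoun.is file.
SAMPLE_TSV = (
    "she\ther\ther\thers\therself\n"
    "he\thim\this\this\thimself\n"
    "they\tthem\ttheir\ttheirs\tthemselves\n"
)


def parse_pronoun_sets(text):
    """Return a list of 5-tuples, preserving the order of the input lines."""
    sets = []
    for line in text.splitlines():
        forms = tuple(part.strip() for part in line.split("\t"))
        # Skip blank or malformed lines instead of failing at startup.
        if len(forms) != 5 or not all(forms):
            continue
        sets.append(forms)
    return sets


if __name__ == "__main__":
    for forms in parse_pronoun_sets(SAMPLE_TSV):
        print("/".join(forms))
```

Because the list keeps file order, any frequency ordering in the source file carries over to the parsed result without extra work.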

Includes a kill-switch config setting in case the repo ever gets deleted or reorganized, or the file format changes.
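One way such a kill switch could behave, sketched without any Sopel APIs: a boolean setting decides whether the fetch is attempted at all, and any fetch failure falls back to the built-in defaults. The names `load_pronoun_sets`, `fetch_enabled`, and `fetch` are assumptions for this example, not the plugin's actual setting or function names.

```python
def load_pronoun_sets(fetch_enabled, fetch, defaults):
    """Return the pronoun sets to use for this run.

    fetch_enabled: the (hypothetical) kill-switch value from the config.
    fetch: a zero-argument callable that downloads and parses the list.
    defaults: the built-in mapping used when fetching is off or fails.
    """
    if not fetch_enabled:
        # Kill switch is off: never touch the network.
        return dict(defaults)
    try:
        fetched = fetch()
    except (OSError, ValueError):
        # Network trouble, a missing file, or an unparseable format.
        return dict(defaults)
    # An empty result is treated like a failure; otherwise the fetched
    # list replaces the defaults outright.
    return fetched or dict(defaults)


if __name__ == "__main__":
    defaults = {"they": "they/them/their/theirs/themselves"}
    print(load_pronoun_sets(False, lambda: {}, defaults))
```

Passing the fetcher in as a callable keeps the fallback logic testable without network access.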

Checklist

I have read CONTRIBUTING.md
I can and do license this contribution under the EFLv2
No issues are reported by make qa (runs make quality and make test)
I have tested the functionality of the things this change touches

Notes

Speaking of something that's over-complicated for what it does (as Exi said on IRC yesterday)… this also finds the shortest unique set of pronouns for each option to use in links. The only outlier cases now are they/them/their/theirs/themsel(ves|f), which won't be abbreviated as they/.../themsel(ves|f). I might or might not find the motivation to fix that.
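To make the "shortest unique set" idea concrete, here is a minimal sketch of a shortest-unique-prefix search over a trie of pronoun forms. It is an illustration under assumptions, not the PR's implementation: `shortest_unique_prefixes` is a hypothetical name, and the sample sets are chosen for the example.

```python
def shortest_unique_prefixes(sets):
    """Map each pronoun set to the shortest run of leading forms unique to it."""
    # Build a trie keyed by form; each node counts the sets passing through it.
    root = {}
    for forms in sets:
        node = root
        for form in forms:
            entry = node.setdefault(form, {"count": 0, "children": {}})
            entry["count"] += 1
            node = entry["children"]

    result = {}
    for forms in sets:
        node = root
        prefix = []
        for form in forms:
            entry = node[form]
            prefix.append(form)
            if entry["count"] == 1:
                # Only this set reaches the node, so the prefix is unique.
                break
            node = entry["children"]
        result[forms] = "/".join(prefix)
    return result


if __name__ == "__main__":
    sample = [
        ("she", "her", "her", "hers", "herself"),
        ("he", "him", "his", "his", "himself"),
        ("they", "them", "their", "theirs", "themselves"),
        ("they", "them", "their", "theirs", "themself"),
        ("ze", "hir", "hir", "hirs", "hirself"),
        ("ze", "zir", "zir", "zirs", "zirself"),
    ]
    for forms, prefix in shortest_unique_prefixes(sample).items():
        print("/".join(forms), "->", prefix)
```

With this sample, `she` and `he` shorten to a single form, the two `ze` sets need two forms each, and the two `they` sets stay at full length because they only differ in the last form. That is the outlier case described above.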

Exirel

I don't really get the tree aspect of the pronoun parsing. I feel like I'm missing the problem that required a solution.

I also tried to run the code on the current pronoun file, and the result is quite different from the default set of pronouns. For example, instead of 'they/.../themselves', the parser returns 'they/them/their/theirs/themselves'. So, is it compatible? Why use ... in the default value, knowing that the actual file doesn't contain it? Especially given that the file will replace the default value, so if there is a feature here, it'll be made inaccessible.

A lot of confusion!

sopel/modules/pronouns.py

dgw · 2022-02-24T03:01:54Z

> I don't really get the tree aspect of the pronoun parsing. I feel like I'm missing the problem that required a solution.

Using a trie was to make it easier to match on prefixes. I think. It's been a few months. 😅

I've tried (😏) to address most of the style comments in fixup commits. Haven't gone through another round of testing just yet.

Exirel

I still think this is a lot of work that I don't know how to justify. On the other hand, I didn't have to do anything, and it all looks good. Maybe I'm just annoyed there isn't any test for that yet, but there wasn't before anyway.

Let's move on.

dgw · 2022-05-10T17:51:12Z

I have cleaned the history (squashing fixups) and fixed one last bug with the disambiguation output (in case the user specifies e.g. .setpronouns they, which matches multiple sets).
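For readers unfamiliar with the ambiguity, a small sketch of how a short input such as `they` can match more than one set. This is an assumption-laden illustration, not the plugin's code: `match_sets` is a hypothetical helper, and what the bot actually says when several sets match is not shown here.

```python
def match_sets(query, sets):
    """Return every set whose leading forms equal the slash-separated query."""
    wanted = tuple(part for part in query.strip().lower().split("/") if part)
    if not wanted:
        return []
    return [forms for forms in sets if forms[:len(wanted)] == wanted]


if __name__ == "__main__":
    known = [
        ("she", "her", "her", "hers", "herself"),
        ("they", "them", "their", "theirs", "themselves"),
        ("they", "them", "their", "theirs", "themself"),
    ]
    matches = match_sets("they", known)
    if len(matches) == 1:
        print("Using", "/".join(matches[0]))
    elif matches:
        # More than one candidate: list them so the user can pick one.
        print("Ambiguous; options:", ", ".join("/".join(m) for m in matches))
    else:
        print("No matching set")
```

A single match can be stored directly, while several matches call for a disambiguation message listing the candidates.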

Testing also determined that it's not necessary to merge the fetched pronouns with the default set; they/.../themself (for example) works just fine without that entry existing in the dictionary. I dropped that change.

dgw added Feature Low Priority labels Jun 30, 2021

dgw added this to the 8.0.0 milestone Jun 30, 2021

dgw requested a review from a team June 30, 2021 21:44

dgw force-pushed the pronouns-fetch branch from 691ec30 to 2faba88 Compare June 30, 2021 21:46

Exirel requested changes Oct 30, 2021

dgw requested a review from Exirel February 24, 2022 03:02

Exirel approved these changes Apr 27, 2022

dgw added 3 commits May 10, 2022 12:47

pronouns: fetch list from the pronoun.is GitHub repo at start

7a4ec45

We don't have to wait until the 12th of Never for them to launch an API. Parsing tab-separated values is easy, and the file is already sorted according to approximate frequency for us according to their readme.

pronouns: add killswitch for startup fetch

41b7a13

pronouns: generate shortest possible prefixes for linking

14a7d4d

Inspiration for the actual prefix-finding: https://www.techiedelight.com/shortest-unique-prefix/

dgw force-pushed the pronouns-fetch branch from b6d708b to 14a7d4d Compare May 10, 2022 17:48

dgw merged commit beaa544 into master May 10, 2022

dgw deleted the pronouns-fetch branch May 10, 2022 18:09
