
Allow 4-character abbreviations of mnemonic words ... #168

Closed

Conversation


@yorickdowne commented Nov 20, 2020

... when using the existing-mnemonic workflow.

This addresses #167

Choices made:

Because of the way the dict lookups are written, I decided to normalize to 'NFKC' and slice to at most 4 characters. The normalization is there so that combining Unicode characters are composed first and the abbreviation doesn't end up with only 3 visible characters.

I have no particular attachment to this way of handling it. If you'd rather not normalize and do a partial match on dict keys instead, I am game. If there's a third method you favor, ditto.
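That normalize-then-slice step could be sketched roughly like this (hypothetical helper name, not the PR's actual code):

```python
import unicodedata

def abbreviate(word: str) -> str:
    # NFKC-normalize first so combining characters are composed into
    # single code points, then take at most the first 4 characters.
    return unicodedata.normalize('NFKC', word)[:4]

# Without normalization, a decomposed 'é' (e + combining acute accent)
# would occupy two of the four slots, leaving only three visible
# characters after the slice.
```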

The new-mnemonic workflow has not been touched and still does a straight comparison. One could discuss whether that workflow should allow abbreviations as well, but that might be a different PR.

This code passes the test suite.


@lukas131313 left a comment


R

@yorickdowne

Please give this another look for inclusion.

@yorickdowne commented May 14, 2021

This is now failing one specific test mnemonic. Checksum calculation fails. Let me see why that is.

test_mnemonic = 'abaco abaco abaco abaco abaco abaco abaco abaco abaco abaco abaco abaco abaco abaco abaco abaco abaco agitare'

@yorickdowne

I know why this fails: the logic that builds the word-to-language map is wrong. I'm glad this got caught before it made it in! Let me think about how to improve it.

@yorickdowne force-pushed the master branch 3 times, most recently from ba629ee to e50822d on May 14, 2021 at 11:46
@yorickdowne commented May 14, 2021

This now passes all tests. The logic has been changed. Previously I normalized and sliced to 4 characters while constructing word_language_map. Because Python dicts keep only one value per key, this turned out to be a very bad idea: abbreviations overlap across languages, so the map ended up with only one word-to-language mapping instead of several. For example, "abac" matches words in Italian, Spanish, and Portuguese, but the way this was written previously, word_language_map would only contain the Portuguese entry.
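A toy reproduction of that overwrite (hypothetical data, not the actual wordlists):

```python
words_by_language = {
    'italian': ['abaco'],
    'portuguese': ['abacate'],
}

# Keying the map by the 4-character abbreviation silently drops
# duplicates: a later assignment overwrites an earlier one.
word_language_map = {
    word[:4]: lang
    for lang, words in words_by_language.items()
    for word in words
}
# Both 'abaco' and 'abacate' abbreviate to 'abac', so only the last
# language written ('portuguese' here) survives in the map.
```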

Instead, I now use a nested for list comprehension that compares the full word-language map against the abbreviations, then flattens the resulting 2D list. If there's a more elegant way of doing this, please share!
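A miniature version of that nested-comprehension-and-flatten step (names and data hypothetical):

```python
# Full (word, language) pairs, e.g. drawn from all wordlists.
full_map = [
    ('abaco', 'italian'),
    ('abacate', 'portuguese'),
    ('agitare', 'italian'),
]
abbrevs = ['abac', 'agit']

# Nested comprehension: one inner list of matches per abbreviation...
matches_2d = [
    [(word, lang) for (word, lang) in full_map if word.startswith(a)]
    for a in abbrevs
]
# ...then flatten the resulting 2D list into a single list of pairs,
# preserving every language that matched.
flat = [pair for row in matches_2d for pair in row]
```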

Because abbreviations can match several languages and then merely fail the checksum, I now continue the search through the remaining languages when a checksum fails, rather than returning None on the first failure.
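Sketched in miniature (the function, wordlists, and checksum here are stand-ins, not the real CLI code):

```python
def find_language(abbrev_mnemonic, wordlists, checksum_ok):
    """Return (language, mnemonic) for the first language whose expansion
    passes the checksum. A failed checksum may just mean the abbreviations
    matched the wrong language, so keep searching instead of bailing out."""
    for lang, words in wordlists.items():
        expanded = []
        for abbr in abbrev_mnemonic.split():
            matches = [w for w in words if w.startswith(abbr)]
            if len(matches) != 1:
                break  # no unique match in this language; try the next one
            expanded.append(matches[0])
        else:
            candidate = ' '.join(expanded)
            if checksum_ok(candidate):
                return lang, candidate
            # Checksum failed: continue with the next language
            # rather than returning None on the first failure.
    return None

wordlists = {'italian': ['abaco'], 'portuguese': ['abacate']}
# Pretend only the Portuguese expansion has a valid checksum.
result = find_language('abac', wordlists, lambda m: m == 'abacate')
```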

@yorickdowne

Please give this another look for inclusion.

Also, please give feedback on whether changing the two pass statements to continue would be welcome.

@yorickdowne

Please give this another look for inclusion. If there are concerns, please raise them.

@yorickdowne

Ping to get this considered for inclusion, please.
