StarDict: improve memory usage #409
Comments
This will not always sort them correctly. There might be several entries with the same lowercase headword, in which case it might produce broken output. It also won't always fix the memory issue. This is for sorting. Meanwhile, please try using a swap file (or increasing it) to extend your memory.
I added a new option to use SQLite to reduce memory usage. Please check out / download the branch named stardict-sqlite, and try again by adding flag
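To illustrate the general idea of moving the sort out of RAM and into SQLite, here is a minimal sketch. It is not the code from the stardict-sqlite branch: the function name `sorted_via_sqlite`, the table `syn`, and its columns are hypothetical, and the two-column ordering assumes the sort key is the lowercase headword followed by the exact bytes.

```python
import os
import sqlite3
import tempfile


def sorted_via_sqlite(entries, db_path):
    """Yield (b_word, entry_index) pairs in sorted order.

    The rows live in an SQLite file at db_path instead of a Python list,
    so the sort does not need one big in-memory list of key tuples.
    """
    con = sqlite3.connect(db_path)
    try:
        # Hypothetical schema: lowercase key, exact bytes, entry index.
        con.execute(
            "CREATE TABLE syn (lower_word BLOB, word BLOB, entry_index INTEGER)"
        )
        con.executemany(
            "INSERT INTO syn VALUES (?, ?, ?)",
            ((w.lower(), w, i) for w, i in entries),
        )
        con.commit()
        # SQLite compares BLOB values byte by byte, like Python bytes.
        cur = con.execute(
            "SELECT word, entry_index FROM syn ORDER BY lower_word, word"
        )
        for word, idx in cur:
            yield word, idx
    finally:
        con.close()


with tempfile.TemporaryDirectory() as tmp:
    result = list(
        sorted_via_sqlite(
            [(b"pali", 1), (b"Pali", 3), (b"dhamma", 0)],
            os.path.join(tmp, "syn.db"),
        )
    )
```

Because `entries` can be any iterable, the synonyms could be fed in one at a time as they are read, rather than being collected in a list first.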
I can confirm that your patch works flawlessly. (I tried it on a tablet with 3 GB of RAM.)
Great.
Dear Saeed Rasooli,
Thank you very much for writing this converter. It works very reliably and has helped me a lot. I would like to help improve it even more.
When processing StarDict files with a large number of synonyms (more than 500,000), pyglossary runs out of memory on my system.
To reproduce:
https://github.com/digitalpalidictionary/digitalpalidictionary/releases/download/2023-01-06/dpd-goldendict.zip
Download this file (about 6.8 million synonyms) and run
$ pyglossary dpd-goldendict.ifo dpd-new.ifo
On my system it crashes after about 4 million synonyms (the conversion of the .dict file runs without problems).
Possible solution:
As far as I understand, the synonyms are sorted in memory by this call (line 738):
altIndexList.sort( key=lambda x: self.byteSortKey(x[0]) )
I don't know much about sorting (especially when it's on bytes and not str), so I can't say for sure, but I think the function sorts the list twice (or by two criteria): first by b_word.lower() and then by b_word (which might explain the out-of-memory error).
When I remove that line (the proposed patch from above), it compiles without issues.
What is your opinion about this?
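On the question of whether the list is sorted twice, here is a minimal sketch. It assumes the key function returns the tuple (b_word.lower(), b_word), as described above; `byte_sort_key` is a hypothetical stand-in, not pyglossary's actual byteSortKey. With a tuple key, Python sorts the list once and uses the second element only to break ties between headwords that share the same lowercase form.

```python
def byte_sort_key(b_word):
    # Hypothetical stand-in for the key described in the issue:
    # compare case-insensitively first, then by the exact bytes.
    return (b_word.lower(), b_word)


alt_index_list = [
    (b"Pali", 3),
    (b"pali", 1),
    (b"PALI", 2),
    (b"dhamma", 0),
]

# A single sort() call; the tuple only decides how ties are ordered.
alt_index_list.sort(key=lambda x: byte_sort_key(x[0]))

sorted_words = [word for word, _ in alt_index_list]
# sorted_words is now [b"dhamma", b"PALI", b"Pali", b"pali"]
```

Although there is only one sort, `list.sort` computes and holds one key object per element for the duration of the sort. With a key like this, that means an extra tuple and a lowercase copy of each headword, which may contribute to the memory pressure with millions of synonyms.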