Replies: 1 comment
As a side note, the penalty seen in luatex font memory allocation (particularly severe on Windows) has been reported to the luatex maintainers. As far as I understand, they see it too and may devote some time to addressing it, which is great. (I do understand luatex is not designed as a "number cruncher", but it is still nice that computing primes brings out the pros and cons of various languages and binaries.) Thus a side effect of this project may be that some upstream luatex improvement comes out of my trying to use it here... But new luatex binaries are updated by TeXLive only once a year (MikTeX has, I think, a more rolling model). Anyway, since I now see on PrimeView that the hardware used is not fast enough for the pdftex approach to make 295 or more passes, I feel that I should make a PR to let the Dockerfile use pdftex for the benchmark.
This is a bit "texnical", but here is my problem: I configured the Dockerfile to use luatex because luatex has dynamic memory management. TeX has no notion of array; the closest thing is the set of "font dimensional parameters" associated with each font declared in the document (via a `\font` declaration), which in practice is indeed an array of 32-bit words. But pdftex has a fixed-size memory, and without recompiling the binary, the maximal memory configurable for such "font arrays" allows at most 294 passes for sieving up to 1,000,000 (each pass allocates 500,000 words of 32 bits each).
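To make the 294-pass figure concrete, here is a small back-of-the-envelope sketch in Python. The ceiling of 147,483,647 words is an assumption on my part (I believe it is the largest font memory size pdftex accepts without recompiling, but please check it against your own setup), and the per-font overhead is ignored. The names `ASSUMED_FONT_MEM_CEILING` and `max_passes` are made up for illustration.

```python
# Rough budget for the "font array" approach described above.
WORDS_PER_PASS = 500_000   # one word per odd number below 1,000,000
BYTES_PER_WORD = 4         # 32-bit words

# ASSUMPTION: largest configurable font memory, in words (not taken from the post).
ASSUMED_FONT_MEM_CEILING = 147_483_647


def max_passes(ceiling_words: int, words_per_pass: int = WORDS_PER_PASS) -> int:
    """Number of whole passes that fit if every pass claims a fresh array."""
    return ceiling_words // words_per_pass


passes = max_passes(ASSUMED_FONT_MEM_CEILING)
total_bytes = passes * WORDS_PER_PASS * BYTES_PER_WORD
print(f"{passes} passes, {total_bytes:,} bytes of font memory")
```

Under that assumed ceiling, a 295th pass would need 147,500,000 words, which no longer fits.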
I thus chose luatex, but it brings distortion:
`2*3*5*7` wheel compared to base sieve, but the exact same code executed with luatex brings only a 2.3X improvement.
From what I see currently at PrimeView, there is still some margin on the hardware used before my fastest algorithm (wheel48of210) makes 295 passes in 5 seconds and thus exhausts pdftex memory.
Thus, my question: wouldn't it be better for the benchmark to use pdftex then? (People with high-performance hardware may not be able to run the benchmark locally, but the hardware currently used at PrimeView is still not fast enough to trigger the problem...) Doing this would distort the timings less.
Also, this problem would not show up if, instead of declaring a new font on each pass and claiming 500,000 fresh words of (zeroed) memory (for sieving up to 1,000,000), the code cleared and re-used one and the same chunk of 500,000 words. Thus, my secondary question is whether it would be allowable to do that (as the solution is declared unfaithful already anyhow).
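The difference between the two strategies can be sketched in Python (this is only a model, not the TeX code of the solution). The `allocated` counter stands in for font memory that is never given back; `sieve_odds`, `run_fresh` and `run_reused` are hypothetical names, and a plain odds-only sieve is used instead of the wheel.

```python
def sieve_odds(flags: bytearray, limit: int) -> int:
    """Odds-only sieve on a zeroed buffer; flags[i] stands for the number 2*i + 1."""
    half = len(flags)
    i = 1  # index 1 is the number 3
    while (2 * i + 1) ** 2 <= limit:
        if not flags[i]:
            p = 2 * i + 1
            # Start at p*p; a step of p in index space is a step of 2*p in value.
            for j in range((p * p - 1) // 2, half, p):
                flags[j] = 1
        i += 1
    # Index 0 (the number 1) is never marked, so it stands in for the prime 2.
    return flags.count(0)


def run_fresh(passes: int, limit: int) -> tuple[int, int]:
    """Claim a new zeroed array on every pass (models one new font per pass)."""
    count, allocated = 0, 0
    for _ in range(passes):
        flags = bytearray(limit // 2)
        allocated += len(flags)  # modelled as never released
        count = sieve_odds(flags, limit)
    return count, allocated


def run_reused(passes: int, limit: int) -> tuple[int, int]:
    """Claim one array once, then clear and re-use it on every pass."""
    flags = bytearray(limit // 2)
    allocated = len(flags)
    count = 0
    for _ in range(passes):
        flags[:] = bytes(len(flags))  # reset every slot to zero
        count = sieve_odds(flags, limit)
    return count, allocated


print(run_fresh(3, 1_000_000))   # allocation grows with the number of passes
print(run_reused(3, 1_000_000))  # allocation stays at one array
```

With fresh arrays the modelled footprint grows by 500,000 words per pass, while the re-used array stays at 500,000 words however many passes are made, at the price of an explicit clearing step inside the timed loop.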
Please advise at your convenience :)...