Split model training from randomness #71
Training the model with `learn()` is relatively slow (much slower than generating new words), so ideally you could train it once and then use the same model to generate several sequences with different seeds.

However, this is not possible with the current API, because the seed is supplied when you first construct the `MarkovChain`, and cannot be changed after that point.

AFAICT, the `learn()` function is completely deterministic (it does not use the RNG), so this is a suboptimal design.

I would suggest removing the `rng` field from `MarkovChain` entirely, and instead passing the RNG when you construct the `Words` iterator. This way you can train a model once and then generate several sequences of text with different seeds. This should also make the `lipsum_words_from_seed` function much faster, even when not using a custom model.
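To illustrate the proposed split, here is a minimal, self-contained sketch; it is not the crate's actual code. A toy bigram `MarkovChain` whose `learn` builds the transition map deterministically, and a `Words` iterator that receives the RNG only at construction time. The `words()` constructor name, the internal map representation, and the use of rand 0.8's `Rng`/`SeedableRng` traits are assumptions made for the sketch:

```rust
use std::collections::HashMap;

use rand::rngs::StdRng;
use rand::{Rng, SeedableRng};

/// A toy bigram Markov chain with the proposed split: the chain owns
/// no RNG, so `learn` is purely deterministic state-building.
struct MarkovChain {
    map: HashMap<String, Vec<String>>,
}

impl MarkovChain {
    fn new() -> Self {
        MarkovChain { map: HashMap::new() }
    }

    /// Deterministic training: record which word can follow which.
    fn learn(&mut self, text: &str) {
        let words: Vec<&str> = text.split_whitespace().collect();
        for pair in words.windows(2) {
            self.map
                .entry(pair[0].to_string())
                .or_default()
                .push(pair[1].to_string());
        }
    }

    /// Randomness enters only here: the caller hands an RNG to the
    /// iterator, so one trained model can serve many seeds.
    fn words<R: Rng>(&self, rng: R) -> Words<'_, R> {
        Words { chain: self, rng, current: None }
    }
}

struct Words<'a, R: Rng> {
    chain: &'a MarkovChain,
    rng: R,
    current: Option<String>,
}

impl<'a, R: Rng> Iterator for Words<'a, R> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        let map = &self.chain.map;
        if map.is_empty() {
            return None;
        }
        let next = match self.current.take().and_then(|w| map.get(&w)) {
            // Continue from a known follower of the previous word.
            Some(followers) => {
                followers[self.rng.gen_range(0..followers.len())].clone()
            }
            // First word, or a dead end: restart from a random key.
            None => {
                let keys: Vec<&String> = map.keys().collect();
                keys[self.rng.gen_range(0..keys.len())].clone()
            }
        };
        self.current = Some(next.clone());
        Some(next)
    }
}

fn main() {
    // Train once...
    let mut chain = MarkovChain::new();
    chain.learn("the quick brown fox jumps over the lazy dog and the fox");

    // ...then generate several sequences, each with its own seed.
    for seed in [1u64, 2, 3] {
        let rng = StdRng::seed_from_u64(seed);
        let words: Vec<String> = chain.words(rng).take(8).collect();
        println!("seed {}: {}", seed, words.join(" "));
    }
}
```

The design point is that the RNG becomes a parameter of generation rather than of the model: `learn` can run up front (or the trained chain be cached), and each call to `words()` with a freshly seeded RNG yields an independent, reproducible sequence.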
Comments

Hi @Diggsey, thanks for reporting this! Your analysis sounds spot on. Would you be up for restructuring things as you suggest?

Yeah, it will be a breaking change though.

That's okay. The few dependencies of the crate seem to use the high-level functions, so they should have no problem upgrading.