-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify unit tests #27
Comments
Hi @wleoncio - I've added MLbench to the suggests area. I like having this particular test in the tests because it's a real example of finding some cellular characteristic combinations that lead to cancer. I'm not sure which version of CVtreeMLE you tested on but recently I added a fit_marginals parameter which is default to FALSE. This then only looks for interactions and doesn't fit trees on each individual exposure. This substantially reduces computational time in all the tests. For example, on that breast cancer dataset with 700 obs, 2 folds and 2 cpus, it takes 1.5 minutes. If I use more than 2 cores in the tests then I get a check flag failure, so I think that's the max I can use. Let me know if 1.5 minutes is still too long. I added a couple additional tests to make sure all the outputs connect correctly. I've removed a test on the marginal tree function which I didn't think was useful and took a long time and replaced it with a test on small sample size of simulated data and compared estimates to ground-truth. I think this is more informative. Let me know if you'd like anything else. |
Hi @wleoncio - attached is my run: |
So it's about 12 minutes - is that reasonable? I can exchange some things if that feels better. |
Thanks for the extra info, @blind-contours! It would be nice those those mixture tests closer to the minute mark, but in the end it's up to you how much time you want to spend optimizing this part of the code that doesn't directly affect users. |
Running
devtools::test()
required the installation ofmlbench
package. Also, I had to manually tweak the tests to use more CPUs to bring computing time to an acceptable state.This is something that would only bother developers, possibly scaring away contributors, so not a big deal, but it would be nice for unit tests to be faster (perhaps by using smaller datasets, fewer iterations, more CPUs) and more complex (most of them just check output class or triggered errors).
The text was updated successfully, but these errors were encountered: