Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify unit tests #27

Open
wleoncio opened this issue Jan 16, 2023 · 5 comments
Open

Simplify unit tests #27

wleoncio opened this issue Jan 16, 2023 · 5 comments

Comments

@wleoncio
Copy link

Running devtools::test() required the installation of mlbench package. Also, I had to manually tweak the tests to use more CPUs to bring computing time to an acceptable state.

This is something that would only bother developers, possibly scaring away contributors, so not a big deal, but it would be nice for unit tests to be faster (perhaps by using smaller datasets, fewer iterations, more CPUs) and more complex (most of them just check output class or triggered errors).

@blind-contours
Copy link
Owner

Hi @wleoncio - I've added MLbench to the suggests area. I like having this particular test in the tests because it's a real example of finding some cellular characteristic combinations that lead to cancer. I'm not sure which version of CVtreeMLE you tested on but recently I added a fit_marginals parameter which is default to FALSE. This then only looks for interactions and doesn't fit trees on each individual exposure. This substantially reduces computational time in all the tests. For example, on that breast cancer dataset with 700 obs, 2 folds and 2 cpus, it takes 1.5 minutes. If I use more than 2 cores in the tests then I get a check flag failure, so I think that's the max I can use. Let me know if 1.5 minutes is still too long. I added a couple additional tests to make sure all the outputs connect correctly. I've removed a test on the marginal tree function which I didn't think was useful and took a long time and replaced it with a test on small sample size of simulated data and compared estimates to ground-truth. I think this is more informative. Let me know if you'd like anything else.

@wleoncio
Copy link
Author

Tests are still taking an inordinate amount of time on my machine (8th gen i5 CPU), see screenshot below:

20min

How much time does the whole devtools::test() procedure take on your side? Since the breast cancer test takes half as long as I got, I'm guessing the whole thing takes around 10 minutes.

@blind-contours
Copy link
Owner

Hi @wleoncio - attached is my run:

Screen Shot 2023-01-30 at 9 44 04 AM

@blind-contours
Copy link
Owner

So it's about 12 minutes - is that reasonable? I can exchange some things if that feels better.

@wleoncio
Copy link
Author

Thanks for the extra info, @blind-contours! It would be nice those those mixture tests closer to the minute mark, but in the end it's up to you how much time you want to spend optimizing this part of the code that doesn't directly affect users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants