Simplify unit tests #27

wleoncio · 2023-01-16T11:06:12Z

Running devtools::test() required the installation of mlbench package. Also, I had to manually tweak the tests to use more CPUs to bring computing time to an acceptable state.

This is something that would only bother developers, possibly scaring away contributors, so not a big deal, but it would be nice for unit tests to be faster (perhaps by using smaller datasets, fewer iterations, more CPUs) and more complex (most of them just check output class or triggered errors).

The text was updated successfully, but these errors were encountered:

blind-contours · 2023-01-25T20:42:21Z

Hi @wleoncio - I've added MLbench to the suggests area. I like having this particular test in the tests because it's a real example of finding some cellular characteristic combinations that lead to cancer. I'm not sure which version of CVtreeMLE you tested on but recently I added a fit_marginals parameter which is default to FALSE. This then only looks for interactions and doesn't fit trees on each individual exposure. This substantially reduces computational time in all the tests. For example, on that breast cancer dataset with 700 obs, 2 folds and 2 cpus, it takes 1.5 minutes. If I use more than 2 cores in the tests then I get a check flag failure, so I think that's the max I can use. Let me know if 1.5 minutes is still too long. I added a couple additional tests to make sure all the outputs connect correctly. I've removed a test on the marginal tree function which I didn't think was useful and took a long time and replaced it with a test on small sample size of simulated data and compared estimates to ground-truth. I think this is more informative. Let me know if you'd like anything else.

wleoncio · 2023-01-30T13:08:54Z

Tests are still taking an inordinate amount of time on my machine (8th gen i5 CPU), see screenshot below:

How much time does the whole devtools::test() procedure take on your side? Since the breast cancer test takes half as long as I got, I'm guessing the whole thing takes around 10 minutes.

blind-contours · 2023-01-30T17:44:49Z

Hi @wleoncio - attached is my run:

blind-contours · 2023-01-30T17:45:59Z

So it's about 12 minutes - is that reasonable? I can exchange some things if that feels better.

wleoncio · 2023-01-31T05:12:08Z

Thanks for the extra info, @blind-contours! It would be nice those those mixture tests closer to the minute mark, but in the end it's up to you how much time you want to spend optimizing this part of the code that doesn't directly affect users.

wleoncio mentioned this issue Jan 16, 2023

[REVIEW]: CVtreeMLE: Efficient Estimation of Mixed Exposures using Data Adaptive Decision Trees and Cross-Validated Targeted Maximum Likelihood Estimation in R openjournals/joss-reviews#4181

Closed

40 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify unit tests #27

Simplify unit tests #27

wleoncio commented Jan 16, 2023

blind-contours commented Jan 25, 2023

wleoncio commented Jan 30, 2023

blind-contours commented Jan 30, 2023

blind-contours commented Jan 30, 2023

wleoncio commented Jan 31, 2023

Simplify unit tests #27

Simplify unit tests #27

Comments

wleoncio commented Jan 16, 2023

blind-contours commented Jan 25, 2023

wleoncio commented Jan 30, 2023

blind-contours commented Jan 30, 2023

blind-contours commented Jan 30, 2023

wleoncio commented Jan 31, 2023