@utf The current benchmarking implementation has you pass a sklearn `KFold` (or `StratifiedKFold`) object to the benchmarking function. The `KFold` object accepts a `random_state` parameter, so this issue is essentially closed.
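For reference, a minimal sketch of how a seeded `KFold` gives repeatable splits. The toy array and the `fold_indices` helper are made up for illustration and are not part of the benchmarking code. Note that in scikit-learn `random_state` only has an effect when `shuffle=True`.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy data standing in for the benchmark dataframe (illustrative only).
X = np.arange(20).reshape(10, 2)


def fold_indices(seed):
    """Return the test indices of each fold for a seeded, shuffled KFold."""
    # random_state is only used when shuffle=True.
    kfold = KFold(n_splits=5, shuffle=True, random_state=seed)
    return [test_idx.tolist() for _, test_idx in kfold.split(X)]


# Two KFold objects built with the same seed produce identical folds,
# so two models benchmarked this way see the same test/train splits.
splits_a = fold_indices(42)
splits_b = fold_indices(42)
same_splits = splits_a == splits_b
```

Passing an equivalently seeded `KFold` to each benchmark run should therefore be enough to compare two models on identical splits.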
Would be nice to add support for the `random_state` parameter of `pandas.DataFrame.sample()` for pipeline benchmarking. E.g., implemented for this line:

This would allow for random but deterministic sampling of the dataframe when choosing the test/train split. That way you can benchmark two models on the same dataframe and automatically get the same test/train split. We could just overload the benchmark `test_spec` variable again (e.g., if `test_spec` is an int or a `numpy.random.RandomState` object, pass it to `sample()` as `random_state`).
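A rough sketch of what the proposal could look like, assuming the split is done with `DataFrame.sample()`. The `split_frame` helper, its `test_frac` argument, and the toy dataframe are hypothetical and only illustrate the idea; they are not the project's actual benchmark code.

```python
import numpy as np
import pandas as pd


def split_frame(df, test_spec, test_frac=0.25):
    """Hypothetical helper: seeded test/train split via DataFrame.sample()."""
    if not isinstance(test_spec, (int, np.random.RandomState)):
        raise TypeError("test_spec must be an int or numpy.random.RandomState")
    # DataFrame.sample() accepts an int seed or a RandomState as random_state.
    test = df.sample(frac=test_frac, random_state=test_spec)
    # Everything not sampled becomes the training set (assumes a unique index).
    train = df.drop(test.index)
    return train, test


# Toy dataframe for illustration.
df = pd.DataFrame({"x": range(8), "y": [0, 1] * 4})

# The same int seed gives the same split on every call.
train_a, test_a = split_frame(df, 7)
train_b, test_b = split_frame(df, 7)

# A RandomState object also works, but it is stateful: reusing the same
# object for a second call would give a different sample.
train_c, test_c = split_frame(df, np.random.RandomState(7))
```

An int seed is probably the simpler choice for benchmarking, since a shared `RandomState` object advances each time it is used.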