
Make sampling more consistent with different splitting schemes #167

Closed
antoinecarme opened this issue Jun 25, 2021 · 1 comment


antoinecarme commented Jun 25, 2021

In some cases, sampling is not compatible with dataset splitting.

For the moment, sampling is only used on very large datasets (> 8192 rows) to speed up the training of AR-like models near the end of the training process.

Sampling is enabled by default and can be disabled (Options.mActivateSampling = False).

This can be problematic when some advanced features are used: cross-validation, time hierarchies (#163), etc.

Ensure that the dataset is sampled before the training process starts and that only the last 8192 rows are used.
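A minimal sketch of the fix described above (not PyAF's actual implementation): truncate the training data to its most recent rows once, up front, so that every downstream step (train/validation split, cross-validation folds, hierarchy levels) sees the same sampled dataset. The function name `sample_training_data` and the constant name `SAMPLING_THRESHOLD` are illustrative assumptions; only the 8192-row limit comes from the issue.

```python
import pandas as pd

# Row limit mentioned in the issue; datasets larger than this get truncated.
SAMPLING_THRESHOLD = 8192

def sample_training_data(df: pd.DataFrame,
                         threshold: int = SAMPLING_THRESHOLD) -> pd.DataFrame:
    """Keep only the last `threshold` rows of a time-ordered dataset.

    Applied once before any splitting, so sampling stays consistent
    across all splitting schemes.
    """
    if len(df) <= threshold:
        return df
    return df.iloc[-threshold:].reset_index(drop=True)

# Usage: sample first, then split/train on the result.
full = pd.DataFrame({"Signal": range(10000)})
sampled = sample_training_data(full)
print(len(sampled))               # 8192
print(sampled["Signal"].iloc[0])  # 1808 (the oldest retained row)
```

Sampling before splitting (rather than inside the training loop) is what keeps cross-validation folds and hierarchical models consistent: they all operate on the same truncated frame.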

@antoinecarme antoinecarme self-assigned this Jun 25, 2021
antoinecarme pushed a commit that referenced this issue Jun 25, 2021
Apply the sampling to the whole training dataset before training/forecasting
antoinecarme (Owner, Author) commented:

Closing
