
Make sampling more consistent with different splitting schemes #167

Closed
antoinecarme opened this issue Jun 25, 2021 · 1 comment


antoinecarme commented Jun 25, 2021

In some cases, sampling is not compatible with dataset splitting.

For the moment, sampling is only used on very large datasets (> 8192 rows) to speed up the training of AR-like models near the end of the training process.

Sampling is enabled by default and can be disabled (Options.mActivateSampling = False).

This can be problematic when some advanced features are used: cross-validation, time hierarchies (#163), etc.

Ensure that the dataset is sampled before the training process starts and that only the last 8192 rows are used.
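A minimal sketch of the fix described above (not PyAF's actual implementation): truncate the training data to its most recent rows once, up front, so that every downstream step (train/validation split, cross-validation folds, hierarchy levels) sees the same sampled dataset. The function name `sample_training_data` and the constant name `SAMPLING_THRESHOLD` are illustrative assumptions; only the 8192-row limit comes from the issue.

```python
import pandas as pd

# Row limit mentioned in the issue; datasets larger than this get truncated.
SAMPLING_THRESHOLD = 8192

def sample_training_data(df: pd.DataFrame,
                         threshold: int = SAMPLING_THRESHOLD) -> pd.DataFrame:
    """Keep only the last `threshold` rows of a time-ordered dataset.

    Applied once before any splitting, so sampling stays consistent
    across all splitting schemes.
    """
    if len(df) <= threshold:
        return df
    return df.iloc[-threshold:].reset_index(drop=True)

# Usage: sample first, then split/train on the result.
full = pd.DataFrame({"Signal": range(10000)})
sampled = sample_training_data(full)
print(len(sampled))               # 8192
print(sampled["Signal"].iloc[0])  # 1808 (the oldest retained row)
```

Sampling before splitting (rather than inside the training loop) is what keeps cross-validation folds and hierarchical models consistent: they all operate on the same truncated frame.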

@antoinecarme antoinecarme self-assigned this Jun 25, 2021
antoinecarme pushed a commit that referenced this issue Jun 25, 2021
Apply the sampling to the whole training dataset before training/forecasting
antoinecarme (Owner, Author) commented:

Closing
