Delay random state validation #117

Merged
merged 2 commits into from
May 14, 2020

Collaborator

@timokau timokau commented May 12, 2020

Description

Part of #94 and #116.

Motivation and Context

The scikit-learn estimator API requires us to store all `__init__` parameters unmodified. Validation should be delayed until the parameters are actually used, because `set_params` can override them at any point after construction. See https://scikit-learn.org/stable/developers/develop.html#random-numbers.

I'm using the pattern of setting `self.random_state_` at the beginning of every fit function. In theory that is not always needed, since the random state is often only used in one function. I think it's less error-prone to always do it the same way, though.
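
For illustration, a minimal sketch of that pattern. `RandomBaseline` and its `scores_` attribute are hypothetical and not taken from this repository; only `BaseEstimator` and `check_random_state` are real scikit-learn APIs. `__init__` stores the parameter untouched, and `fit` derives the validated `random_state_` from it.

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.utils import check_random_state


class RandomBaseline(BaseEstimator):
    """Hypothetical estimator, used only to illustrate the pattern."""

    def __init__(self, random_state=None):
        # Store the parameter exactly as given: no validation, no conversion.
        self.random_state = random_state

    def fit(self, X, y=None):
        # Validate on use. The trailing underscore marks a fitted attribute.
        self.random_state_ = check_random_state(self.random_state)
        # Draw one random score per sample from the validated generator.
        n_samples = np.asarray(X).shape[0]
        self.scores_ = self.random_state_.uniform(size=n_samples)
        return self


X = np.zeros((5, 2))
est = RandomBaseline(random_state=42)
first = est.fit(X).scores_.copy()
second = est.fit(X).scores_.copy()
```

Because the integer seed is kept as-is and a fresh generator is created on every `fit`, repeated fits with the same seed should give the same draws, and `est.random_state` stays the plain integer that was passed in.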

How Has This Been Tested?

Ran the pre-commit hooks and the test-suite.

Does this close/impact existing issues?

#94 and #116

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.

This is redundant, since the model is always created on fit. Further,
`set_tunable_parameters` may be called before `fit` but proper
initialization is only guaranteed after the first call to `fit`.
The scikit-learn estimator API stipulates that `__init__` should only
store its parameters and do no validation (since parameters can later be
overridden in `set_params`). Parameters need to be validated on use.

https://scikit-learn.org/stable/developers/develop.html#random-numbers
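
A small sketch of why validation in `__init__` would not be enough, assuming a hypothetical `LateValidated` estimator (not part of this repository). `set_params` assigns the attribute directly and never re-runs `__init__`, so only a check at use time sees the overridden value.

```python
from sklearn.base import BaseEstimator
from sklearn.utils import check_random_state


class LateValidated(BaseEstimator):
    """Hypothetical estimator that validates `random_state` only in `fit`."""

    def __init__(self, random_state=None):
        self.random_state = random_state  # stored unmodified

    def fit(self, X=None, y=None):
        # Raises ValueError if the current value cannot seed a RandomState.
        self.random_state_ = check_random_state(self.random_state)
        return self


est = LateValidated(random_state=0)
# set_params assigns the attribute directly; __init__ does not run again.
est.set_params(random_state="not a seed")

try:
    est.fit()
    caught = False
except ValueError:
    # The invalid value is only detected here, at the point of use.
    caught = True
```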

codecov bot commented May 12, 2020

Codecov Report

Merging #117 into master will increase coverage by 4.37%.
The diff coverage is 76.19%.

@@            Coverage Diff             @@
##           master     #117      +/-   ##
==========================================
+ Coverage   55.81%   60.18%   +4.37%     
==========================================
  Files         116      116              
  Lines        7643     7656      +13     
==========================================
+ Hits         4266     4608     +342     
+ Misses       3377     3048     -329     

| Impacted Files | Coverage Δ |
|---|---|
| csrank/discretechoice/baseline.py | 54.54% <0.00%> (ø) |
| csrank/objectranking/baseline.py | 54.54% <0.00%> (ø) |
| csrank/objectranking/cmp_net.py | 87.17% <0.00%> (+41.02%) ⬆️ |
| csrank/core/feta_network.py | 63.92% <66.66%> (+23.55%) ⬆️ |
| csrank/discretechoice/generalized_nested_logit.py | 88.51% <66.66%> (+0.07%) ⬆️ |
| csrank/discretechoice/nested_logit_model.py | 91.48% <75.00%> (+0.09%) ⬆️ |
| csrank/choicefunction/generalized_linear_model.py | 87.50% <100.00%> (+0.13%) ⬆️ |
| csrank/core/cmpnet_core.py | 89.38% <100.00%> (+0.09%) ⬆️ |
| csrank/core/fate_linear.py | 90.62% <100.00%> (ø) |
| csrank/core/fate_network.py | 69.19% <100.00%> (+0.13%) ⬆️ |

... and 22 more

Owner

@kiudee kiudee left a comment

This change looks straightforward and good to me.

@timokau timokau merged commit b8f8288 into kiudee:master May 14, 2020
@timokau timokau deleted the delay-random-state-validation branch May 14, 2020 12:46
@timokau timokau mentioned this pull request Jun 4, 2020