Fixed all the bugs of save_resume #1917
Conversation
bremen79 commented Jun 6, 2019
- The state of FTRL models is now saved (added a parameter to save_load_online_state)
- Fixed a bug in save_load_online_state: the state is now saved even for features that have w=0. This affected --l1 and FTRL models that accumulate gradients for equal weights
- Moved total_weight from gd to the vw struct, so FTRL can use and save it
- Added save_resume lines for ftrl, pistol, coin
…n by average length of the feature vectors
…than default one in oaa and cbify
This looks good other than the additional use of global state. Can we avoid it?
```diff
@@ -559,14 +559,14 @@ float get_pred_per_update(gd& g, example& ec)
   if (!stateless)
   {
     g.all->normalized_sum_norm_x += ((double)ec.weight) * nd.norm_x;
-    g.total_weight += ec.weight;
+    g.all->total_weight += ec.weight;
```
Why prefer a global?
The general rule of thumb is to use variables that are as local as possible, to minimize context.
It is not clear to me how to solve this; I am open to suggestions.
On the one hand, the vw struct already contains similar quantities: power_t, invariant_updates, normalized_sum_norm_x, and others. These are specific to gd, yet they live in a global place.
On the other hand, the problem comes from reusing GD::save_load_online_state in ftrl: we don't have access to the ftrl data that way. We could duplicate and customize the entire save-state function in ftrl, but that seems painful. Or we could extend GD::save_load_online_state with even more optional inputs, which also seems like a bad idea.
Of the three options, an extra argument seems preferable to either a global or code duplication.
Code duplication seems particularly bad: it's a recipe for non-maintainability.
The global variable moves in the wrong direction; we are working towards atomizing the reductions so they can be composed with other learning algorithms.
The extra-arguments approach seems the best. In the long term, we'd probably want to adjust the arguments so they are semantic rather than algorithm-specific. Basically, instead of having ftrl, we'd have "the number of floats per weight to store", etc. But this is a minor refactoring consistent with the extra-arguments approach.
Closing in favor of #1919, which tweaks this one.