Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

options_i command line parsing refactor #1706

Merged
merged 69 commits into from
Jan 16, 2019

Conversation

jackgerrits
Copy link
Member

@jackgerrits jackgerrits commented Jan 8, 2019

  • Add options_i interface and options_boost_po implementation for options parsing/storing/querying
  • Add options_serializer_i interface and options_serializer_boost_po implementation for serializing an options_i collection
  • Migrate 45 reductions and setup tasks to new options workflow
  • Default options are no longer serialized, updated several tests to accommodate this
  • Update loss function to advertise type rather than reading the command line argument

initialize
  parse_args DONE
    parse_diagnostics DONE
  parse_regressor_args
  parse_modules
    save_load_header (hard)
    check_interaction_settings_collision
    parse_feature_tweaks
    parse_example_tweaks
    parse_output_model
    parse_output_preds
    parse_reductions (large surface)
  parse_sources
    load_input_model
      parse_mask_regressor_args
        TODO clear file options
    parse_source (positional param) DONE
    enable_sources DONE
      parse_cache DONE
all reductions
Migration progress:
initialize
  parse_args DONE
    parse_diagnostics DONE
  parse_regressor_args
  parse_modules
    save_load_header (hard) DONE
    check_interaction_settings_collision DONE
    parse_feature_tweaks
    parse_example_tweaks
    parse_output_model
    parse_output_preds
    parse_reductions (large surface)
  parse_sources
    load_input_model
      parse_mask_regressor_args
        TODO clear file options
    parse_source (positional param) DONE
    enable_sources DONE
      parse_cache DONE
all reductions
@JohnLangford
Copy link
Member

This pull request fixes 2 alerts when merging 4dec969 into 2b64c3c - view on LGTM.com

fixed alerts:

  • 2 for Inconsistent definition of copy constructor and assignment ('Rule of Two')

Comment posted by LGTM.com

@JohnLangford
Copy link
Member

This pull request fixes 2 alerts when merging 9b5215c into 2b64c3c - view on LGTM.com

fixed alerts:

  • 2 for Inconsistent definition of copy constructor and assignment ('Rule of Two')

Comment posted by LGTM.com


// Used in parse_source
struct input_options {
bool daemon;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why put this here? It's not otherwise used in this header.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an attempt to encapsulate related state, because it is an option parsed as part of this block I am going to keep it in the struct

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I was ambiguous. Why have "input_options" in the header? It's defining an unused type.

{ hash_function = m_vw->opts_n_args.vm["hash"].as<string>();
if (m_vw->options->was_supplied("hash"))
{
hash_function = m_vw->options->get_typed_option<string>("hash").value();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

->get_typed_option

test/pred-sets/ref/inv_hash.cmp Show resolved Hide resolved
vowpalwabbit/OjaNewton.cc Outdated Show resolved Hide resolved

LEARNER::learner<autolink,example>& ret =
init_learner(data, as_singleline(setup_base(arg)), predict_or_learn<true>, predict_or_learn<false>);
init_learner(data, as_singleline(setup_base(*all.options, all)), predict_or_learn<true>, predict_or_learn<false>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we pass 'options' instead of *all.options? That at least looks somewhat less silly.

std::vector<std::string> dictionary_path; // where to look for dictionaries
std::vector<feature_dict*> namespace_dictionaries[256]; // each namespace has a list of dictionaries attached to it
std::vector<dictionary_info> loaded_dictionaries; // which dictionaries have we loaded from a file to memory?

void(*delete_prediction)(void*); bool audit; //should I print lots of debugging information?
bool quiet;//Should I suppress progress-printing of updates?
bool help_requested;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the intent saved within the parsed options itself?

@@ -552,6 +559,7 @@ struct vw
uint32_t holdout_period;
uint32_t holdout_after;
size_t check_holdout_every_n_passes; // default: 1, but search might want to set it higher if you spend multiple passes learning a single policy
size_t early_terminate_passes;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just stick it explicitly into each reduction needing it?

vowpalwabbit/options.cc Outdated Show resolved Hide resolved
@@ -1,5 +1,7 @@
#pragma once

#include <functional>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Required for std::function on line 5

{ hash_function = m_vw->opts_n_args.vm["hash"].as<string>();
if (m_vw->options->was_supplied("hash"))
{
hash_function = m_vw->options->get_typed_option<string>("hash").value();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping on this.

@JohnLangford
Copy link
Member

This pull request fixes 2 alerts when merging dce63d8 into 2b64c3c - view on LGTM.com

fixed alerts:

  • 2 for Inconsistent definition of copy constructor and assignment ('Rule of Two')

Comment posted by LGTM.com

@JohnLangford
Copy link
Member

This pull request fixes 2 alerts when merging fa7f6d3 into 2b64c3c - view on LGTM.com

fixed alerts:

  • 2 for Inconsistent definition of copy constructor and assignment ('Rule of Two')

Comment posted by LGTM.com

@JohnLangford
Copy link
Member

This pull request fixes 2 alerts when merging 9337bed into 6dbb7e5 - view on LGTM.com

fixed alerts:

  • 2 for Inconsistent definition of copy constructor and assignment ('Rule of Two')

Comment posted by LGTM.com

@JohnLangford
Copy link
Member

I think the pointer vs. member question is the only thing left to worry about here.

@jackgerrits
Copy link
Member Author

I replied to that, I believe we should leave it as is, with an options_i* being in the all/vw object. Most usages of the options_i object are done through a reference (in every setup function), so this covers most exposure. The problem is that there is a situation where the all object must own the options_i struct rather than just using it. This case is when the user passes in a command line string rather than an already constructed options_i object. This is the primary legacy use case. Since it must own the object in this case, we need to delete it when it's done. Storing it as a reference really hides this use case, and given it is the most likely way this will be used I think we want to optimize understanding for that.

@JohnLangford JohnLangford merged commit b8ab396 into VowpalWabbit:master Jan 16, 2019
@JohnLangford
Copy link
Member

Merged, thanks :-) That's quite a monster.

@arielf
Copy link
Collaborator

arielf commented Jan 17, 2019

Hi there. Indeed a big change. Thanks for all the effort!

I've been testing this a bit and I'm seeing another backward incompatibility break with --quiet:

$ ./vowpalwabbit/vw --quiet f20-315.tt.gz

finished run
number of examples = 0
weighted example sum = 0.000000
weighted label sum = 0.000000
average loss = n.a.
total feature number = 0
the argument ('f20-315.tt.gz') for option '--quiet' is invalid. Valid choices are 'on|off', 'yes|no', '1|0' and 'true|false'

Would it be possible to make --quiet backward-compatible in the sense that it passed naked, it defaults to =yes and doesn't cause an abort/error ?

Thanks!

@jackgerrits
Copy link
Member Author

This isn’t restricted to quiet, but any Boolean option as we are expecting a following value. We can force Boolean options to be a switch (rather than an option that takes a value), but then it is not possible to set an option to be false. This is one of the tricky cases using a positional data option introduces. Currently Boolean options take an implicit value of true if the following token is an option, but a following value token is interpreted as a value for the Boolean option. The positional data value trips this up.

@JohnLangford
Copy link
Member

Boost manages to get this to work. Why can't we?

jackdoe pushed a commit to lucjb/vowpal_wabbit that referenced this pull request Mar 7, 2019
* Initial framework commit of options refactor

* Rename options -> arguments, option -> parameter

* equality unit test

* Working on tests

* Add kept implementation

* revert line ending change

* Update cmake and fix build

* Update arg to hold value, expose arg list and merging

* Rename arguments to options

* rename files arguments -> options

* Add config namespace

* Change back to non-duplicates and use references

* Migration progress:
initialize
  parse_args DONE
    parse_diagnostics DONE
  parse_regressor_args
  parse_modules
    save_load_header (hard)
    check_interaction_settings_collision
    parse_feature_tweaks
    parse_example_tweaks
    parse_output_model
    parse_output_preds
    parse_reductions (large surface)
  parse_sources
    load_input_model
      parse_mask_regressor_args
        TODO clear file options
    parse_source (positional param) DONE
    enable_sources DONE
      parse_cache DONE
all reductions

* Migrate model header load

Migration progress:
initialize
  parse_args DONE
    parse_diagnostics DONE
  parse_regressor_args
  parse_modules
    save_load_header (hard) DONE
    check_interaction_settings_collision DONE
    parse_feature_tweaks
    parse_example_tweaks
    parse_output_model
    parse_output_preds
    parse_reductions (large surface)
  parse_sources
    load_input_model
      parse_mask_regressor_args
        TODO clear file options
    parse_source (positional param) DONE
    enable_sources DONE
      parse_cache DONE
all reductions

* Migrate parse_feature_tweaks

* migrate parse_example_tweaks

* migrate parse_output_model and parse_output_preds

* Change to using reference instead of pointer for options object

* Migrate marginal reduction

* Change all signatures to include options_i and vw

* migrate gd

* Migrate kernel svm and fix keep arg in marginal

* migrate ftrl reduction

* migrate svrg reduction

* migrate sender

* migrate gd_mf

* work in progress

* Update loss_function to be able to dynamically identify type

* work in progress

* Finish migrating reductions, and search tasks

* fix compile issues in vw_core

* working through test fixes

* fix bugs

* Fix bug, update test model file

* Fix tests, update cppcli wrapper, update c# unit tests, fix c# versions, remove parser_helper

* Fix help printout, unregistered options check

* Fix options leaked object

* remove width of print for older boost version

* update usage of options in python and remove program_options reference in Java

* add move constructor, fix onethread unregistered issue and fix false negative unrecognized args

* try not using move constructors

* move to a unique ptr to deal with gcc4.9

* fix save resume

* trigger ci

* implement positional parameter for data and revert test changes

* properly handle invalid args for Java

* fix writing if incorrect cb_type to model

* fix --opt=val style parsing from model file args

* change handling of defaults for cb_type, change insert behavior for empty value string

* Fix merge conflict, kvm loss function check is fixed in this PR

* revert unwanted changes

* revert Resource.rc removal

* revert adding include

* fix kernel_svn -> kernel_svm

* Use const& for strings and larger objects in options

* Fix lgtm alert about copy assignment operator

* add back <algorithm> for min/max

* using namespace VW::config;

* fix java dependencies, update cpprestsdk build

* Revert "fix java dependencies, update cpprestsdk build"

This reverts commit 89b5959.

* return reference from, rename make function

* remove options.cc

* use given reference for setup_base

* remove fields from all
@jackgerrits jackgerrits deleted the jagerrit/migrate_options branch April 5, 2019 16:21
lokitoth added a commit to lokitoth/vowpal_wabbit that referenced this pull request Dec 20, 2019
* Update vowpal_wabbit submodule

* Pulls in these fixes
- fac3c06 ("Fix crash when empty multi_ex is supplied for --cb_explore_adf (VowpalWabbit#1679)")
- f085178 ("bugfix:  cb_adf is not including some examples in stats calculation (VowpalWabbit#1686)")
- c9d9af7 ("Fix memory leak in cb_explore_adf (VowpalWabbit#1698)")
- 1e8742b ("Fix some VW initialization memory leaks (VowpalWabbit#1697)")
- 75423e3 ("Fix best constant and best constant's loss calculation when using ksvm (VowpalWabbit#1704)")
- 2b64c3c ("Use std::exp instead of exp free function (VowpalWabbit#1709)")
- 6b7b160 ("Fix unused params warnings plus incomplete struct init (done to default values). (VowpalWabbit#1710)")
- 2e1602d ("Compile fixes (VowpalWabbit#1713)")
- b50fb59 ("Make cbify reduction respect is_learn parameter (VowpalWabbit#1722)")

This also includes the option parsing changes
- b8ab396 ("`options_i` command line parsing refactor  (VowpalWabbit#1706)"
- 5b020c4 ("Ataymano/mac options types fix (VowpalWabbit#1718)")

There are other small or non-rl-consumable changes

* Enable injection of msbuild path in Windows command-line build

* Update vowpal_wabbit submodule

* This pulls in the additional fix made to address the bug discovered in
  the previous attempt to update this.

  Fixed in VowpalWabbit#1742

* Updates safe_vw.cc to get rid of a no-longer-necessary cleanup step
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants