
prototype: pytorch base learner #4660

Closed

Conversation

@rajan-chari (Member) commented Nov 8, 2023

Current todo

  • --passes
  • fix memory leak during densification
  • VW reporting functions
    • set_output_example_prediction(output_example_prediction)
    • set_print_update(print_update)
    • set_cleanup_example(cleanup)
  • Save prediction in example
    • ec.partial_prediction, contraction
    • ec.pred.scalar
  • batch optimizer step after gradient accumulation (see the first sketch after this list)
  • Grow the LinearLayer as the input size grows; use an initial size with a double-and-copy strategy to reduce compute (see the second sketch after this list)
  • n base_learners
  • Smoke test - unit test
  • save/load
    • model
    • feature_dict
  • Test - learning (cb)
  • Full coverage - unit test
  • Fixed random seed for reproducible results
  • Experimental
  • AdamW

  • Use example.weight in training
  • Check for test_only in learn()?
  • apply l1?
  • num_features calc
  • apply gd finalize_prediction? nan/max/min
  • interactions disabled
  • Triage:
    • set_multipredict(nullptr)
    • set_update(nullptr)
    • set_save_load(nullptr)
    • set_end_pass(nullptr)
    • set_merge_with_all(nullptr)
    • set_add_with_all(nullptr)
    • set_subtract_with_all(nullptr)
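
For the batched-optimizer-step item above, a minimal libtorch sketch of the intended pattern: accumulate gradients across a mini-batch, then take a single optimizer step. All names (model, optimizer, inputs, labels) are illustrative assumptions, not this PR's actual code.

  #include <torch/torch.h>
  #include <vector>

  // Accumulate gradients over a mini-batch of examples, then update once.
  void learn_minibatch(torch::nn::Linear& model, torch::optim::SGD& optimizer,
      const std::vector<torch::Tensor>& inputs, const std::vector<torch::Tensor>& labels)
  {
    optimizer.zero_grad();
    for (size_t i = 0; i < inputs.size(); i++)
    {
      auto loss = torch::mse_loss(model->forward(inputs[i]), labels[i]);
      loss.backward();  // backward() adds into .grad, so gradients accumulate
    }
    optimizer.step();  // one batched update for the whole mini-batch
  }

And a sketch of the double-and-copy growth strategy for the LinearLayer item: when the feature space outgrows the layer, double its input width and copy the old weights in, so the number of resizes is logarithmic in the final width rather than one per new feature. Again, names are illustrative.

  #include <torch/torch.h>

  // Return a layer wide enough for needed_in inputs, preserving trained weights.
  torch::nn::Linear grow_linear(torch::nn::Linear old_layer, int64_t needed_in)
  {
    int64_t old_in = old_layer->weight.size(1);
    if (needed_in <= old_in) { return old_layer; }
    int64_t new_in = old_in;
    while (new_in < needed_in) { new_in *= 2; }  // double rather than grow per feature
    torch::nn::Linear new_layer(new_in, old_layer->weight.size(0));
    torch::NoGradGuard no_grad;
    new_layer->weight.narrow(1, 0, old_in).copy_(old_layer->weight);  // copy old weights
    new_layer->bias.copy_(old_layer->bias);  // new columns keep their default init
    return new_layer;
  }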

Current bugs

  • Parsing -0.00000
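    (Context, as an assumption about this bug rather than a confirmed diagnosis: "-0.00000" parses to IEEE negative zero, which compares equal to 0.0f but has its sign bit set, so sign-sensitive code paths can behave oddly.)

      #include <cmath>
      #include <cstdio>
      #include <string>

      int main()
      {
        float v = std::stof("-0.00000");
        std::printf("%d %d\n", v == 0.0f, std::signbit(v));  // prints "1 1": equal to zero, yet negative
      }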

Post-MVP wish list

  • accumulate error during the mini-batch instead of gradients
  • Support arbitrary models created by TorchScript/other mechanisms
  • VW binary input format
  • statically link the libtorch DLLs
  • GPU support
  • Different optimizers (e.g., AdamW) for the simple N-layer network
  • Profile allocations and ensure no allocations in steady state
  • Reduce binary size

Debugging Notes:
When building the debug configuration on Windows with system dependencies, use the debug version of the libtorch libraries. A release build of your app is ABI-incompatible with a debug build of libtorch (and vice versa), leading to very odd errors. This is why std C++ types should not be used in library interfaces!
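
A minimal sketch of that last point (illustrative, not code from this PR): keep std types behind the exported surface and pass only C-compatible types across the DLL boundary, since MSVC's debug and release STL layouts differ.

  // Bad: std::string's layout depends on the STL build flavor, so a release
  // caller linked against a debug DLL (or vice versa) corrupts memory.
  // __declspec(dllexport) std::string get_model_name();

  // Better: plain C types have a stable ABI across debug/release builds.
  extern "C" __declspec(dllexport) const char* get_model_name();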

@rajan-chari marked this pull request as draft November 8, 2023 22:27
@lalo (Collaborator) commented Nov 14, 2023

int num_layers = 3;
int hidden_layer_size = 20;
int mini_batch_size = 10;
new_options.add(make_option("dnn", use_dnn).keep().necessary().help("Fully connected deep neural network base learner."))
Member commented:

should we mark them as experimental()?

@rajan-chari (Member Author) replied:

Yep.
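
(A sketch of the agreed change, using the experimental() option flag the comment refers to:)

new_options.add(make_option("dnn", use_dnn).keep().necessary().experimental().help("Fully connected deep neural network base learner."))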

@rajan-chari (Member Author) commented, replying to a review note:

> update https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/ThirdPartyNotices.txt if you are planning on merging

Got it.

@rajan-chari reopened this Nov 25, 2023
@rajan-chari (Member Author) commented, replying to the same review note:

> update https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/ThirdPartyNotices.txt if you are planning on merging

Makes sense.

@olgavrou (Collaborator) commented:

To be re-opened in the future

@olgavrou closed this Mar 14, 2024