Efficient Second Order Online Learning
Ariel Faigon edited this page Aug 31, 2018 · 6 revisions
OjaNewton is a sketched variant of Online Newton Step (ONS), a second-order online learning algorithm. It avoids the running time of ONS, which is quadratic in the number of features, by using Oja's updates to maintain a small sketch of the covariance matrix of the gradients, with the updates performed in a sparse manner.
vw --OjaNewton --sketch_size=10 --alpha_inverse=1.0 -d train_file -f model_file
Here, sketch_size is the number of directions kept in the sketch of the covariance matrix (default: 10), and alpha_inverse can be viewed as a learning rate (default: 1.0).
Then, to make predictions on a new data set with the trained model:
vw -i model_file -d data_file -p predict_file
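To give some intuition for what "keeping a small sketch of the covariance matrix of gradients" means, here is a minimal Python sketch of a textbook Oja-style update. It is an illustration only, not VW's OjaNewton implementation: it is dense rather than sparse, and the names oja_update, lr, scales, and the synthetic gradients are all hypothetical choices made for this example. In particular, lr is a step size for the sketch update and is not meant to correspond to alpha_inverse.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def oja_update(sketch, g, lr):
    """One Oja-style step on a list of sketch rows (hypothetical helper)."""
    # Move each row toward the gradient g, weighted by the row's projection on g.
    moved = []
    for row in sketch:
        proj = dot(row, g)
        moved.append([r + lr * proj * x for r, x in zip(row, g)])
    # Gram-Schmidt keeps the rows orthonormal so they track distinct directions.
    ortho = []
    for row in moved:
        for prev in ortho:
            c = dot(row, prev)
            row = [a - c * b for a, b in zip(row, prev)]
        norm = math.sqrt(dot(row, row))
        ortho.append([a / norm for a in row])
    return ortho


rng = random.Random(0)
dim, sketch_size = 5, 2
# Synthetic "gradients": most of the variance lies along the first coordinate.
scales = [3.0, 0.3, 0.3, 0.3, 0.3]

sketch = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(sketch_size)]
sketch = oja_update(sketch, [0.0] * dim, lr=0.0)  # orthonormalize the random start
for _ in range(2000):
    g = [s * rng.gauss(0, 1) for s in scales]
    sketch = oja_update(sketch, g, lr=0.01)

# The first sketch row should now point mostly along the first coordinate.
print([round(abs(x), 2) for x in sketch[0]])
```

The sketch stores only sketch_size vectors of length dim instead of a full dim x dim matrix, which is the source of the savings over a full second-order method.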