Online Deep Learning: Learning Deep Neural Networks on the Fly / Non-linear Contextual Bandit Algorithm (ONN_THS)

alison-carrera/onn

Online Neural Network (ONN)

This is a PyTorch implementation of the paper Online Deep Learning: Learning Deep Neural Networks on the Fly. The paper introduces a new backpropagation approach called Hedge Backpropagation, which is designed for online learning. You define an over-complete network architecture (deeper than you expect to need), and the algorithm automatically turns hidden layers on or off. It starts by relying on the first hidden layer to train and predict; if that layer performs poorly, it automatically starts using the other layers. For more information, read the paper in the 'References' section.
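The layer-weighting idea can be sketched in a few lines of plain Python. This is an illustrative sketch of the hedge update as described in the paper, not this package's internal code: each hidden layer has its own classifier, the final prediction is a weighted sum of their outputs, and each weight is discounted according to that layer's loss. The names `hedge_predict`, `hedge_update`, `beta`, and `smoothing` are assumptions chosen for this example, and the numeric values are arbitrary.

```python
import math


def hedge_predict(layer_probs, alphas):
    """Combine per-layer class probabilities using the hedge weights."""
    n_classes = len(layer_probs[0])
    return [
        sum(a * probs[c] for a, probs in zip(alphas, layer_probs))
        for c in range(n_classes)
    ]


def hedge_update(alphas, layer_losses, beta=0.99, smoothing=0.2):
    """Discount each weight by beta ** loss, apply a floor, renormalize."""
    floor = smoothing / len(alphas)  # keeps every layer minimally active
    updated = [max(a * beta ** loss, floor)
               for a, loss in zip(alphas, layer_losses)]
    total = sum(updated)
    return [a / total for a in updated]


# Three hypothetical layers predicting two classes for one sample.
layer_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
true_class = 0
alphas = [1 / 3, 1 / 3, 1 / 3]  # start with uniform weights

combined = hedge_predict(layer_probs, alphas)

# Cross-entropy loss of each layer on the true class.
losses = [-math.log(probs[true_class]) for probs in layer_probs]

# beta=0.5 is exaggerated here so the shift in weights is easy to see.
new_alphas = hedge_update(alphas, losses, beta=0.5)
```

Layers with lower loss keep more weight after the update, which is how the network can shift from shallow to deeper layers as the data stream evolves.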

Installing

pip install onn

How to use

# Import the library
import numpy as np
from onn.OnlineNeuralNetwork import ONN

# Create a network with 2 input features, up to 5 hidden layers,
# 10 neurons per hidden layer, and 2 classes.
onn_network = ONN(features_size=2, max_num_hidden_layers=5, qtd_neuron_per_hidden_layer=10, n_classes=2)

# Train incrementally, one sample at a time
onn_network.partial_fit(np.asarray([[0.1, 0.2]]), np.asarray([0]))
onn_network.partial_fit(np.asarray([[0.8, 0.5]]), np.asarray([1]))

#Predict classes
predictions = onn_network.predict(np.asarray([[0.1, 0.2], [0.8, 0.5]]))

Predictions -- array([1, 0])
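For online learning, a common way to measure quality is a test-then-train loop: predict each incoming sample first, score the prediction, and only then train on it. The sketch below assumes only that the model exposes `predict(X)` and `partial_fit(X, y)` as in the usage above. `prequential_accuracy` and `MajorityClassModel` are hypothetical names made up for this example; the stand-in model exists only so the snippet runs without the package installed. With a real ONN you would wrap the inputs in `np.asarray(...)` as shown above.

```python
def prequential_accuracy(model, stream):
    """Predict each sample before training on it; return running accuracy."""
    correct = 0
    total = 0
    for x, y in stream:
        prediction = model.predict([x])[0]  # test first
        correct += int(prediction == y)
        model.partial_fit([x], [y])         # then train
        total += 1
    return correct / total if total else 0.0


class MajorityClassModel:
    """Hypothetical stand-in: always predicts the most frequent label seen."""

    def __init__(self):
        self.counts = {}

    def predict(self, X):
        label = max(self.counts, key=self.counts.get) if self.counts else 0
        return [label for _ in X]

    def partial_fit(self, X, y):
        for label in y:
            self.counts[label] = self.counts.get(label, 0) + 1


stream = [([0.1, 0.2], 0), ([0.8, 0.5], 1), ([0.2, 0.1], 0), ([0.15, 0.25], 0)]
accuracy = prequential_accuracy(MajorityClassModel(), stream)
```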

New features

  • The algorithm now works with batches. (This is not recommended for normal use because ONN is an online approach, but it is useful for experimentation.)
  • The algorithm can use CUDA if available. (This is not recommended for very small networks, where the CPU will be faster.)

Non-linear Contextual Bandit Algorithm (ONN_THS)

ONN_THS acts as a non-linear contextual bandit (a reinforcement learning algorithm). It combines a non-linear exploitation factor (the ONN) with an exploration factor provided by the Thompson Sampling algorithm. ONN_THS works with 'select' and 'reward' actions. For more detailed examples, see the Jupyter notebook in this repository. I created this algorithm to solve a problem at the company where I work.

The main advantage of this algorithm is that it can be used online and its exploitation is non-linear, so it can learn from different kinds of data in a reinforcement learning setting.
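To illustrate the exploration side, here is a minimal sketch of generic Beta-Bernoulli Thompson Sampling in plain Python. It is not the ONN_THS implementation, and how ONN_THS combines the sampled exploration factor with the network's output is not shown here. The class name `BetaThompsonSampler` and its methods are assumptions made up for this example; the reward probabilities in the simulation are arbitrary.

```python
import random


class BetaThompsonSampler:
    """Hypothetical helper: one Beta posterior per arm, binary rewards."""

    def __init__(self, n_arms, seed=None):
        self.successes = [1] * n_arms  # Beta(1, 1) uniform prior
        self.failures = [1] * n_arms
        self.rng = random.Random(seed)

    def select(self):
        # Draw one sample per arm and pick the arm with the largest draw.
        samples = [self.rng.betavariate(s, f)
                   for s, f in zip(self.successes, self.failures)]
        return max(range(len(samples)), key=samples.__getitem__)

    def reward(self, arm, got_reward):
        # Update the chosen arm's posterior with the observed outcome.
        if got_reward:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


# Simulated environment: arm 1 pays off far more often than arm 0.
true_rates = [0.2, 0.8]
env_rng = random.Random(0)
sampler = BetaThompsonSampler(n_arms=2, seed=1)
pulls = [0, 0]

for _ in range(2000):
    arm = sampler.select()
    pulls[arm] += 1
    sampler.reward(arm, env_rng.random() < true_rates[arm])
```

Arms with uncertain posteriors still get sampled occasionally, which is the exploration; as evidence accumulates, the better arm is selected most of the time.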

How to use

# Import the library
import numpy as np
from onn.OnlineNeuralNetwork import ONN_THS

# Create a network with 2 input features, up to 5 hidden layers,
# 10 neurons per hidden layer, and 2 classes.
onn_network = ONN_THS(features_size=2, max_num_hidden_layers=5, qtd_neuron_per_hidden_layer=10, n_classes=2)

#Select an action
arm_selected, exploration_factor = onn_network.predict(np.asarray([[0.1, 0.2]]))

#Reward an action
onn_network.partial_fit(np.asarray([[0.1, 0.2]]), np.asarray([arm_selected]), exploration_factor)

Contributors

References