Reinforcement Learning for Tactical Asset Allocation

This project covers the training and testing of multiple policy gradient reinforcement learning agents in a portfolio management environment (tactical asset allocation).

The interaction between environment and agent is given by:
Environment <> Runner <> Agent <> Model
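
As a rough illustration, one pass through this chain corresponds to the loop below (a minimal sketch; the method names reset, act, execute and observe follow Tensorforce 0.3.x conventions and are assumptions, not the exact interfaces of this project):

    def run_episode(environment, agent, horizon):
        # Start a new episode and obtain the initial market state.
        state = environment.reset()
        for _ in range(horizon):
            # The agent maps the current state to an action
            # (e.g. portfolio weight adjustments).
            action = agent.act(state)
            # The environment applies the action and returns the
            # next state, a terminal flag and the step reward.
            state, terminal, reward = environment.execute(action)
            # The agent stores the feedback for its policy gradient update.
            agent.observe(reward=reward, terminal=terminal)
            if terminal:
                break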

Important files and folders

Dependencies

The implementation is written mainly in Python, so Python version >= 3.6 is required. In addition, the following Python packages are needed:

  • h5py==2.7.1
  • Keras==2.1.3
  • matplotlib==2.1.0
  • numpy==1.14.1
  • pandas==0.20.3
  • pandas-datareader==0.5.0
  • scikit-learn==0.19.1
  • scipy==1.0.0
  • seaborn==0.8.1
  • tensorflow==1.4.0
  • tensorflow-tensorboard==0.4.0rc3
  • tensorforce==0.3.5.1
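
The pinned versions above can be installed in one step, for example via pip:

pip install h5py==2.7.1 Keras==2.1.3 matplotlib==2.1.0 numpy==1.14.1 pandas==0.20.3 pandas-datareader==0.5.0 scikit-learn==0.19.1 scipy==1.0.0 seaborn==0.8.1 tensorflow==1.4.0 tensorflow-tensorboard==0.4.0rc3 tensorforce==0.3.5.1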

Running the train file

To train an agent, run the file train.py from the console, for example:

python ~/path/to/file/run/train.py -at "clipping" -v 1

Changes to the environment and run parameters can be made through the train file, the config file, or the flags listed below.

Modifications to the agents and models must be specified through the corresponding config file (JSON format).
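
For orientation only, such a config file might look like the sketch below (the keys follow typical Tensorforce 0.3.x agent configurations; the exact fields expected by this project are assumptions):

    {
        "type": "vpg_agent",
        "batch_size": 64,
        "learning_rate": 0.0001,
        "discount": 0.99
    }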

Flags:

Flag 1  Flag 2                Meaning
-d      --data                path to the environment .csv file
-sp     --split               train/test split
-th     --threaded            (bool) threaded runner or single runner
-ac     --agent-config        path to the agent config file
-nw     --num-worker          number of threads if the threaded runner is selected
-ep     --epochs              number of epochs
-e      --episodes            number of episodes
-hz     --horizon             investment horizon
-at     --action-type         action type: 'signal', 'signal_softmax', 'direct', 'direct_softmax', 'clipping'
-as     --action-space        action space: 'unbounded', 'bounded', 'discrete'
-na     --num-actions         number of discrete actions given a discrete action space
-mp     --model-path          saving path for the agent model
-eph    --eval-path           saving path for the evaluation files
-v      --verbose             console verbosity level
-l      --load-agent          if given, the agent is loaded from a prior save point (path)
-ds     --discrete_states     (bool) discretization of the state space
-ss     --standardize-state   (bool) standardization or normalization of the state
-rs     --random-starts       (bool) random starts for each new episode
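
For instance, several of these flags can be combined in one call (the values are illustrative):

python ~/path/to/file/run/train.py -at "clipping" -as "bounded" -e 100 -hz 20 -v 1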

Running the test file

Executing the test file is very similar to executing the train file. A checkpoint for the selected agent must exist in the saves folder.

python ~/path/to/project/run/test.py -l /project/model/saves/AgentName

The folder saved_results contains multiple parameter configurations of pretrained agents.

Flags

Flag 1  Flag 2                Meaning
-d      --data                path to the environment .csv file
-ba     --basic-agent         selection of a BasicAgent: 'BuyAndHoldAgent', 'RandomActionAgent'
-sp     --split               train/test split
-ac     --agent-config        path to the agent config file
-e      --episodes            number of episodes
-hz     --horizon             investment horizon
-at     --action-type         action type: 'signal', 'signal_softmax', 'direct', 'direct_softmax', 'clipping'
-as     --action-space        action space: 'unbounded', 'bounded', 'discrete'
-na     --num-actions         number of discrete actions given a discrete action space
-eph    --eval-path           saving path for the evaluation files
-v      --verbose             console verbosity level
-l      --load-agent          if given, the agent is loaded from a prior save point (path)
-ds     --discrete_states     (bool) discretization of the state space
-ss     --standardize-state   (bool) standardization or normalization of the state
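
A baseline comparison can be run in the same way by selecting one of the basic agents, for example:

python ~/path/to/project/run/test.py -ba "BuyAndHoldAgent" -v 1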

TensorBoard

The files predictor.py and train.py integrate TensorBoard. TensorBoard can be launched with:

tensorboard --logdir path/to/project/env/board
tensorboard --logdir path/to/project/run/board

Go to localhost:6006 to view the results.

Credits
