Diversifying Parameters in Reinforcement Learning (Distributional RL)

A package designed to simulate the expected shape of the value distribution learned by an agent under a distributional TD-$\lambda$ learning scheme, an algorithm thought to be reflected in dopamine signaling in the striatum during learning. The following functionality is included to model experiments and agents:

  • Task contingencies (number and timing of cues, reward sizes and delays, etc.)
  • Diversity of parameters in TD error computation ($\alpha$, $\gamma$, $\lambda$)
  • Associated visualizations (learning/value at time of cue over trials, animations, etc.)
  • Saving simulation data/visualizations through function parameters
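
For intuition, below is a minimal, self-contained sketch (plain NumPy, not the drl.py API; all names are illustrative) of a single TD-$\lambda$ update for a population of value predictors, each carrying its own $\alpha$, $\gamma$, and $\lambda$:

```python
import numpy as np

# Illustrative only: one TD-lambda update step for a population of value
# predictors ("channels"), each with its own alpha, gamma, and lambda.
# drl.py may organize this computation differently.
def td_lambda_step(w, e, x_t, x_next, r, alpha, gamma, lam):
    """w, e: (n_channels, n_features); x_t, x_next: (n_features,); r: scalar;
    alpha, gamma, lam: (n_channels,) per-channel parameters."""
    v_t = w @ x_t                              # current value estimate, per channel
    v_next = w @ x_next                        # next-step value estimate, per channel
    delta = r + gamma * v_next - v_t           # per-channel TD error
    e = (gamma * lam)[:, None] * e + x_t       # decay and accumulate eligibility traces
    w = w + (alpha * delta)[:, None] * e       # per-channel weight update
    return w, e, delta
```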

Change the objects/functions as needed, particularly in the drl.py file!

Created by Siddharth Tiwari for the Uchida Lab (2024).

Implementing Simulations

The ability to easily implement new task contingencies, observe diversity in learning, and abstract away the heavy algorithmic machinery was kept at the forefront of the design. Simply follow three steps (after a one-time environment setup) when simulating/visualizing experiments!

  0. Set up Environment
  1. Update Trial Parameters
  2. Specify Parameters, Simulate
  3. Visualize Experiments

0. Set up Environment

Run pip install -r requirements.txt to install the dependencies.

1. Update Trial Parameters

Specify the trial setup (time of stimulus, size/time of reward, number of trials, etc.; see the documentation and examples!) to assign the stimulus and reward schedules for an experiment (simulation). This requires editing the init function within drl.py to update how stimuli and rewards are assigned to tensors. Check out the sample implementations in drl.py and sim.ipynb.
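
As a rough illustration (the array names and layout here are assumptions; the actual tensor representation lives in drl.py), a fixed cue/reward schedule might look like this:

```python
import numpy as np

# Hypothetical schedule construction; variable names and array layout are
# assumptions, not the actual drl.py representation.
n_trials, n_timesteps = 500, 60
cue_time, reward_time, reward_size = 10, 40, 1.0

stimulus = np.zeros((n_trials, n_timesteps))
reward = np.zeros((n_trials, n_timesteps))

stimulus[:, cue_time] = 1.0           # cue presented at the same time step on every trial
reward[:, reward_time] = reward_size  # reward delivered at a fixed delay after the cue
```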

2. Specify Parameters, Simulate

After obtaining the stimulus and reward schedules, simply choose the parameters for the simulation! IMPORTANT: up to 2 parameters can be distributed, but only 1 of them can be distributed uniformly across an interval.

Data from simulation can be saved and used for subsequent visualizations.

Sample implementations in sim.ipynb.
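
For a sense of what "distributing" a parameter means, here is a hedged sketch (NumPy only; how these arrays are passed into the simulation is defined in drl.py/sim.ipynb, not here) that distributes one parameter uniformly across an interval, a second over a discrete set, and holds the third fixed:

```python
import numpy as np

# Illustrative only: how distributed parameters might be specified.
n_channels = 20

# One parameter distributed uniformly across an interval (e.g., alpha)...
alphas = np.linspace(0.02, 0.3, n_channels)

# ...a second parameter distributed over a small discrete set (e.g., gamma)...
gammas = np.random.default_rng(0).choice([0.9, 0.95, 0.99], size=n_channels)

# ...and the remaining parameter held fixed for every predictor.
lam = 0.9
```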

3. Visualize Experiments

Use one of three visualizations for the simulated data:

  • Value at Time (val_at_t): Value at the inputted time, across trials.
  • TD-Error Heatmap (heatmap): TD error across trials and time steps within a given trial.
  • Value over Time (val_over_t): Animation of value over trials for all predictors (at all time steps).

Every visualization has the option to "diversify" parameters, so that we can observe how value predictors with different parameters behave in response to the same reward schedules. The specifications for this are included in sim.ipynb.
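
As a standalone illustration of the heatmap-style output (plain matplotlib on dummy data, not the package's heatmap function; the real visualizations also take parameters for saving figures):

```python
import numpy as np
import matplotlib.pyplot as plt

# Dummy TD errors whose magnitude decays across trials, just to show the
# kind of trials-by-timesteps heatmap the `heatmap` visualization produces.
n_trials, n_timesteps = 500, 60
td_errors = np.random.randn(n_trials, n_timesteps) * np.exp(-np.arange(n_trials) / 100)[:, None]

plt.imshow(td_errors, aspect="auto", cmap="RdBu_r", origin="lower")
plt.xlabel("Time step within trial")
plt.ylabel("Trial")
plt.colorbar(label="TD error")
plt.title("TD-error heatmap (dummy data)")
plt.show()
```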

File Descriptions

  • drl.py: Functions for DRL
  • drl.md: Documentation for functions in drl.py
  • sim.ipynb: Example simulations using DRL Functions
  • requirements.txt: Requirements for drl.py and sim.ipynb
  • figs/: Sample figures produced by the current code (any new figures will be written to this folder if the repo is cloned)
  • exps/: Saved simulation data

About

Distributional reinforcement learning codebase to simulate/model heterogeneity in dopamine signaling