Despite significant progress in deep reinforcement learning across a range of environments, there are still limited tools for understanding why agents make decisions. In particular, we consider how certain actions enable an agent to collect rewards or achieve its goals. Understanding this temporal context for actions is critical to explaining an agent's choices. To date, however, little research has explored such explanations, and the approaches that do exist depend on domain knowledge. We address this by developing three novel types of local temporal explanations, two of which require no domain knowledge, and two novel metrics to evaluate agent skills. In a comprehensive user survey comparing our explanations against two state-of-the-art local non-temporal explanations for Atari environments, users preferred our explanations 80.7% of the time.
The video below shows an example contrastive question from the user survey, using an observation from Breakout, our novel Plan explanation, and, on the right, a perturbation-based saliency map. All observations and explanations used in the user survey are contained in user-survey, along with the survey results and analysis.
contrastive-18.mp4
Click on the following dropdowns to see more examples with all the evaluated explanation mechanisms (Dataset Similarity Explanation, Skill Explanation, Plan Explanation, Grad-CAM and Perturbation-based Saliency Map).
Example observation for Breakout
Dataset Similarity Explanation
dataset-similarity-explanation.mp4
Skill Explanation
skill-explanation.mp4
Plan Explanation
plan-explanation.mp4
Example observation for Space Invaders
Dataset Similarity Explanation
dataset-similarity-explanation.mp4
Skill Explanation
skill-explanation.mp4
Plan Explanation
plan-explanation.mp4
Example observation for Seaquest
Dataset Similarity Explanation
dataset-similarity-explanation.mp4
Skill Explanation
skill-explanation.mp4
Plan Explanation
plan-explanation.mp4
Figure 4 in the paper presents the user ratings for each explanation mechanism across four different questions.
Figure 5 in the paper presents a heatmap of user preferences for each question and each pair of explanation mechanisms. Each grid element gives the percentage of times the row explanation mechanism was preferred over the column explanation mechanism.
All observations and explanations shown to users are provided in user-survey, along with the raw survey data and the analysis notebook.
Python requirements are listed in requirements.txt and can be installed with `pip install -r requirements.txt`. Additionally, using the project may require installing `temporal_explanations_4_drl` with `pip install -e .` in the root directory; no PyPI package currently exists.
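A quick sanity check for the editable install (a minimal sketch, assuming the package is importable as `temporal_explanations_4_drl` from the repository root):

```python
# Verify the editable install; assumes the package is importable as
# `temporal_explanations_4_drl` after `pip install -e .`.
import temporal_explanations_4_drl

print(temporal_explanations_4_drl.__file__)  # should point into this repository
```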
To understand the project structure, we have outlined the purpose of the most important files.
`temporal_explanations_4_drl/explain.py`
- Explanation code for all of our novel explanations, code to save the explanations with the relevant observation (both individually and for comparison), along with implementations of Grad-CAM and Perturbation-based Saliency Maps (a minimal sketch of the perturbation-based technique follows this list).

`temporal_explanations_4_drl/skill.py`
- Skill instance class and implementations of the skill alignment and distribution metrics.

`temporal_explanations_4_drl/plan.py`
- Plan class with methods for computing several metrics across all skills and for each skill individually.

`temporal_explanations_4_drl/graying_the_black_box.py`
- Implementation of Zahavy et al., 2016, "Graying the black box: Understanding DQNs".

`datasets/annotate-domain-knowledge.py`
- A command-line Python script to load pre-defined skills for a set of episodes and provide text-based explanations of the purpose of each skill.

`datasets/hand-label-skills.py`
- A command-line Python script to hand-label skills for individual episodes; each observation can be assigned an individual skill number between 0 and 9.

`datasets/generate_datasets.py`
- A Python script to generate datasets for several environments, with options for the size, agent type, etc.

`datasets/discover_skills.py`
- A Python script that uses pre-generated datasets to discover agent skills with the algorithm proposed by Zahavy et al., 2016.
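For reference, below is a minimal, self-contained sketch of a perturbation-based saliency map in the style commonly used for Atari agents (Greydanus et al., 2018, "Visualizing and Understanding Atari Agents"). It is illustrative only and not the repository's API: the `perturbation_saliency` function, the `policy` callable, and all shapes and parameters here are assumptions.

```python
# Minimal sketch of a perturbation-based saliency map (Greydanus et al., 2018
# style). Illustrative only -- `perturbation_saliency` and `policy` are
# assumptions, not the API of temporal_explanations_4_drl/explain.py.
import numpy as np
from scipy.ndimage import gaussian_filter


def perturbation_saliency(obs, policy, sigma=5.0, stride=5):
    """Score each region by how much blurring it changes the policy's output.

    obs:    (H, W) greyscale observation with float values in [0, 1]
    policy: callable mapping an observation to a vector of action scores
    """
    blurred = gaussian_filter(obs, sigma=3.0)  # information-removed copy
    base = policy(obs)
    saliency = np.zeros_like(obs)
    for i in range(0, obs.shape[0], stride):
        for j in range(0, obs.shape[1], stride):
            # A Gaussian mask centred at (i, j) interpolates the observation
            # towards its blurred copy, removing information in that region.
            mask = np.zeros_like(obs)
            mask[i, j] = 1.0
            mask = gaussian_filter(mask, sigma=sigma)
            mask /= mask.max()
            perturbed = obs * (1.0 - mask) + blurred * mask
            # Saliency = squared change in the policy output under perturbation.
            saliency[i, j] = 0.5 * np.sum((policy(perturbed) - base) ** 2)
    return saliency


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    obs = rng.random((84, 84))  # stand-in for an Atari frame
    policy = lambda o: np.array([o.mean(), o.std()])  # stand-in for a DQN head
    print(perturbation_saliency(obs, policy).shape)  # (84, 84)
```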