Logical Optimal Actions (LOA) is an action-decision architecture for reinforcement learning applications. It uses a neuro-symbolic framework that combines neural networks with symbolic knowledge acquisition for natural language interaction games. This repository contains a Python implementation of the LOA experiments on the TextWorld Commonsense (TWC) game.
- Anaconda 4.10.3
- Tested on Mac and Linux
git clone --recursive git@github.com:IBM/LOA.git loa
cd loa
# Setup games
git clone git@github.com:IBM/commonsense-rl.git
cp -r commonsense-rl/games ./
rm -rf commonsense-rl
# Setup environment
conda create -n loa python=3.8
conda activate loa
conda install pytorch=1.10.0 torchvision torchaudio nltk=3.6.3 -c pytorch
pip install -r requirements.txt
python -m spacy download en
cd third_party/amr-cslogic
# Run the installation scripts in INSTALLATION.md to set up AMR-CSLogic
export FLASK_APP=./amr_verbnet_semantics/web_app/__init__.py
python -m flask run --host=0.0.0.0 --port 5000 &
cd ../../
# If you don't want to run the server
mkdir -p cache
wget -O cache/amr_cache.pkl https://ibm.box.com/shared/static/klsvx54skc5wlf35qg3klo35ex25dbb0.pkl
# Note: This cache only contains sentences for the "easy" game, which is the default in train.py
python train.py
# If you have an AMR server running
python train.py --amr_server_ip localhost --amr_server_port 5000
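The choice between the two `train.py` invocations above can be sketched as a small Python helper. This is only an illustrative sketch, not part of the repository: the function names `amr_server_reachable` and `choose_train_command` are hypothetical, the check is a plain TCP connection test (it does not confirm that the process listening on the port is actually the AMR service), and it assumes that plain `python train.py` picks up `cache/amr_cache.pkl` as the commands above suggest.

```python
import socket
from pathlib import Path


def amr_server_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Note: this only shows that something is listening on the port; it does
    not verify that the listener is the AMR-CSLogic Flask app.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_train_command(host: str = "localhost",
                         port: int = 5000,
                         cache_path: str = "cache/amr_cache.pkl") -> list:
    """Pick a train.py invocation (hypothetical helper, not in the repo).

    Prefers the AMR server when it is reachable, falls back to the
    downloaded cache file, and fails loudly when neither is available.
    """
    if amr_server_reachable(host, port):
        # Flags taken from the README command for the server-backed run.
        return ["python", "train.py",
                "--amr_server_ip", host,
                "--amr_server_port", str(port)]
    if Path(cache_path).is_file():
        # Assumption: train.py uses the cache when no server flags are given.
        return ["python", "train.py"]
    raise RuntimeError(
        "No AMR server at %s:%d and no cache file at %s" % (host, port, cache_path)
    )
```

The returned list can be passed to `subprocess.run(choose_train_command(), check=True)` from the repository root. Keep in mind that the downloaded cache only covers the "easy" game, so other difficulty settings would still need the server.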
This repository provides code for the following papers. Please cite them and give the repository a star if you find the papers and code useful for your work.
- Daiki Kimura, Subhajit Chaudhury, Masaki Ono, Michiaki Tatsubori, Don Joven Agravante, Asim Munawar, Akifumi Wachi, Ryosuke Kohita, and Alexander Gray, "LOA: Logical Optimal Actions for Text-based Interaction Games", ACL-IJCNLP 2021.
Details and bibtex
The paper presents an initial demonstration of Logical Optimal Actions (LOA) on TextWorld (TW) Coin Collector, TW Cooking, TW Commonsense, and Jericho. In this demonstration, a human player can select an action by hand and see the list of actions recommended by LOA, with the acquired knowledge visualized to improve the interpretability of the trained rules.
@inproceedings{kimura-etal-2021-loa,
    title = "{LOA}: Logical Optimal Actions for Text-based Interaction Games",
    author = "Kimura, Daiki and Chaudhury, Subhajit and Ono, Masaki and Tatsubori, Michiaki and Agravante, Don Joven and Munawar, Asim and Wachi, Akifumi and Kohita, Ryosuke and Gray, Alexander",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-demo.27",
    doi = "10.18653/v1/2021.acl-demo.27",
    pages = "227--231"
}
- Subhajit Chaudhury, Sarathkrishna Swaminathan, Daiki Kimura, Prithviraj Sen, Keerthiram Murugesan, Rosario Uceda-Sosa, Michiaki Tatsubori, Achille Fokoue, Pavan Kapanipathi, Asim Munawar, and Alexander Gray, "Learning Symbolic Rules over Abstract Meaning Representations for Textual Reinforcement Learning", ACL 2023.
Details and bibtex
Text-based reinforcement learning agents have predominantly been neural network-based models with embeddings-based representation, learning uninterpretable policies that often do not generalize well to unseen games. On the other hand, neuro-symbolic methods, specifically those that leverage an intermediate formal representation, are gaining significant attention in language understanding tasks. This is because of their advantages ranging from inherent interpretability, the lesser requirement of training data, and being generalizable in scenarios with unseen data. Therefore, in this paper, we propose a modular, NEuro-Symbolic Textual Agent (NESTA) that combines a generic semantic parser with a rule induction system to learn abstract interpretable rules as policies. Our experiments on established text-based game benchmarks show that the proposed NESTA method outperforms deep reinforcement learning-based techniques by achieving better generalization to unseen test games and learning from fewer training interactions.
- Daiki Kimura, Masaki Ono, Subhajit Chaudhury, Ryosuke Kohita, Akifumi Wachi, Don Joven Agravante, Michiaki Tatsubori, Asim Munawar, and Alexander Gray, "Neuro-Symbolic Reinforcement Learning with First-Order Logic", EMNLP 2021.
Details and bibtex
The paper presents an initial LOA experiment on TextWorld Coin Collector that extracts first-order logical facts from text observations and an external word-meaning network. The results show that RL training with the proposed method converges significantly faster than other state-of-the-art neuro-symbolic methods on a TextWorld benchmark.
@inproceedings{kimura-etal-2021-neuro,
    title = "Neuro-Symbolic Reinforcement Learning with First-Order Logic",
    author = "Kimura, Daiki and Ono, Masaki and Chaudhury, Subhajit and Kohita, Ryosuke and Wachi, Akifumi and Agravante, Don Joven and Tatsubori, Michiaki and Munawar, Asim and Gray, Alexander",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.283",
    pages = "3505--3511"
}
- Subhajit Chaudhury, Prithviraj Sen, Masaki Ono, Daiki Kimura, Michiaki Tatsubori, and Asim Munawar, "Neuro-Symbolic Approaches for Text-Based Policy Learning", EMNLP 2021.
Details and bibtex
The paper presents the SymboLic Action policy for Textual Environments (SLATE) method, which is based on the same concept as LOA. The method outperforms previous state-of-the-art methods on the Coin Collector game while learning from 5-10x fewer training games.
@inproceedings{chaudhury-etal-2021-neuro,
    title = "Neuro-Symbolic Approaches for Text-Based Policy Learning",
    author = "Chaudhury, Subhajit and Sen, Prithviraj and Ono, Masaki and Kimura, Daiki and Tatsubori, Michiaki and Munawar, Asim",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.245",
    pages = "3073--3078"
}
- Sarathkrishna Swaminathan, Dmitry Zubarev, Subhajit Chaudhury, and Asim Munawar, "Reinforcement Learning with Logical Action-Aware Features for Polymer Discovery", Reinforcement Learning for Real Life Workshop 2021.
Details and bibtex
The paper presents the first application of reinforcement learning in the materials discovery domain that explicitly considers the logical structure of the interactions between the RL agent and the environment.
@conference{swaminathan-etal-2021-reinforcement,
    title = "Reinforcement Learning with Logical Action-Aware Features for Polymer Discovery",
    author = "Swaminathan, Sarathkrishna and Zubarev, Dmitry and Chaudhury, Subhajit and Munawar, Asim",
    booktitle = "Reinforcement Learning for Real Life Workshop",
    year = "2021"
}
MIT License