A task-oriented chatbot with reinforcement learning (DQN) and LSTM.
This chatbot is based on the paper 'end-to-end task completion neural dialogue system', but I rewrote the code in Keras and adapted it to our application domain: Singapore event recommendation. In my opinion, it is not suitable to call it an end-to-end system, because the system described in the paper consists of several parts, and the NLU and NLG parts are not trainable end-to-end with the other parts. The components are:
- User simulator and a real user interface
- Natural language generator (can be sentence template filling or seq-to-seq model)
- Natural language understanding model
- Dialogue manager
- Dialogue state tracker
- An agent with Deep Q learning
The workflow is:
- Sample and clean data from the database using Elasticsearch.
- Simulate the user with the sampled data: generate a user agenda and randomly generate a user state (slot-value pairs).
- Parse natural language from a real user into slot-value pairs using an LSTM.
- Update the dialogue state.
- Predict the best action from the dialogue state using Deep Q-Learning and update the rewards.
- Generate natural language based on the agent action.
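
The action-prediction step can be illustrated with a minimal sketch of the two core pieces of Deep Q-Learning: epsilon-greedy action selection and the one-step Q-learning target. This is not the code from this repository. The function names, the `gamma` default, and the plain-list representation of Q-values are assumptions made for illustration; in the real system the Q-values would come from the Keras network.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon,
    otherwise the index of the largest Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def q_target(reward, next_q_values, done, gamma=0.9):
    """One-step Q-learning target.

    If the episode is over, the target is just the reward.
    Otherwise it is reward + gamma * max over next-state Q-values.
    gamma=0.9 is an assumed discount factor, not a value from the source.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)

# Hypothetical Q-values for three agent actions in some dialogue state.
q = [0.1, 0.7, 0.2]
greedy_action = epsilon_greedy(q, epsilon=0.0)   # always exploits
final_target = q_target(100, [0.0, 0.0], done=True)
```

With `epsilon=0.0` the agent always picks the highest-valued action. During training a larger epsilon lets it explore other actions.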
User Goal: {'request_slots': ['duration', 'is_weekend'], 'inform_slots': {'region': 'other', 'name': u'Tots Mind & Movement - $20 Trial Promo', 'event_host': u'Mums, Babies and Kids Activities (Inspire Mum & Baby)'}}
User State: {'request_slots': ['event'], 'history_slots': {}, 'turn': 1, 'inform_slots': {'name': u'Tots Mind & Movement - $20 Trial Promo', 'event_host': u'Mums, Babies and Kids Activities (Inspire Mum & Baby)'}, 'rest_slots': ['region', 'duration', 'is_weekend'], 'act': 1}
Agent State: {'request_slots': ['region'], 'turn': 1332, 'sentence': '', 'inform_slots': {}, 'act': 0}
Episode over: False, Reward: 10
User State: {'request_slots': ['event'], 'history_slots': {'name': u'Tots Mind & Movement - $20 Trial Promo', 'event_host': u'Mums, Babies and Kids Activities (Inspire Mum & Baby)'}, 'turn': 2, 'inform_slots': {'region': 'other'}, 'rest_slots': ['duration', 'is_weekend'], 'act': 1}
Agent State: {'request_slots': {}, 'turn': 1333, 'sentence': '', 'inform_slots': {'event': "I'm able to find the event.", 'name': u'Tuesday Night Badminton Game, 4 Aug, 8-10pm @ Geh Poh Ville Community Club'}, 'act': 1}
Episode over: False, Reward: -10
User State: {'request_slots': ['event'], 'history_slots': {'region': 'other', 'name': u'Tots Mind & Movement - $20 Trial Promo', 'event_host': u'Mums, Babies and Kids Activities (Inspire Mum & Baby)'}, 'turn': 3, 'inform_slots': {}, 'rest_slots': ['duration', 'is_weekend'], 'act': 5}
Agent State: {'request_slots': {}, 'turn': 1334, 'sentence': '', 'inform_slots': {'event': "I'm able to find the event.", 'name': u'Speed Dating Event\u2605\u2605\u260524-36F\u2605\u2605\u2605\xad28-39M'}, 'act': 1}
Episode over: False, Reward: 50
User State: {'request_slots': ['is_weekend'], 'history_slots': {'region': 'other', 'name': u'Tots Mind & Movement - $20 Trial Promo', 'event_host': u'Mums, Babies and Kids Activities (Inspire Mum & Baby)'}, 'turn': 4, 'inform_slots': {}, 'rest_slots': ['duration'], 'act': 0}
Agent State: {'request_slots': {}, 'turn': 1335, 'sentence': '', 'inform_slots': {'is_weekend': False}, 'act': 1}
Episode over: False, Reward: 50
User State: {'request_slots': ['duration'], 'history_slots': {'region': 'other', 'name': u'Tots Mind & Movement - $20 Trial Promo', 'event_host': u'Mums, Babies and Kids Activities (Inspire Mum & Baby)'}, 'turn': 5, 'inform_slots': {}, 'rest_slots': [], 'act': 0}
Agent State: {'request_slots': {}, 'turn': 1336, 'sentence': '', 'inform_slots': {'duration': "I also don't know"}, 'act': 1}
Episode over: False, Reward: 100
User State: {'request_slots': [], 'history_slots': {'region': 'other', 'name': u'Tots Mind & Movement - $20 Trial Promo', 'event_host': u'Mums, Babies and Kids Activities (Inspire Mum & Baby)'}, 'turn': 6, 'inform_slots': {}, 'rest_slots': [], 'act': 4}
Agent State: {'request_slots': {}, 'turn': 1337, 'sentence': '', 'inform_slots': {}, 'act': 4}
Episode over: True, Reward: 100
************* simulation episode 211: Success, score: 300
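
In the log above, the per-turn rewards (10, -10, 50, 50, 100, 100) add up to the reported score of 300, which suggests that the episode score is the sum of the turn rewards. A minimal sketch of that bookkeeping follows. `episode_score` is a hypothetical helper written for this example, not a function from this repository.

```python
def episode_score(turn_rewards):
    """Sum the per-turn rewards of one simulated dialogue episode."""
    return sum(turn_rewards)

# Rewards copied from the "Episode over: ..., Reward: ..." lines of episode 211.
rewards_211 = [10, -10, 50, 50, 100, 100]
score_211 = episode_score(rewards_211)
print(score_211)  # 300
```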