
Make metadata available in NLU pipeline #6090

Closed
1 task
cajoek opened this issue Jun 29, 2020 · 22 comments
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework area:rasa-oss/training-data Issues focused around Rasa training data (stories, NLU, domain, etc.) type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR

Comments

@cajoek

cajoek commented Jun 29, 2020

Note: I know that I'm using Rasa in an unusual way.

Description of Problem:

I have a website with some tools, and I'm trying to educate and guide my users on how to use those tools. For that I have many interactive tutorials, which in turn have many steps. During a tutorial, users can ask the AI tutor questions about what they see, where they are supposed to click, how things work, in which order to do things, why they should do something, and many other kinds of questions. This chat functionality is powered by Rasa. This setup means that there are very many similar but different intents, and each intent may only be valid on a few pages or during a few steps. I would therefore like to pass a key in the metadata that indicates where in the tutorial the user currently is, and use this key in my custom NLU component to filter out the relevant intents/responses for that part of the tutorial before predicting the correct one.

In a sense, I want the contextual assistant's context to also include what the user sees in the current tutorial.
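A filtering component along these lines could look roughly like the plain-Python sketch below. Everything here (the step keys, intent names, and function names) is hypothetical and illustrative, not part of Rasa's API:

```python
# Hypothetical sketch, not Rasa API: restrict intent candidates using a
# tutorial-step key passed in the message metadata.

# Assumed data: which intents are valid on which tutorial step.
INTENTS_BY_STEP = {
    "tutorial_1/step_2": {"ask_where_to_click", "ask_why_this_order"},
    "tutorial_1/step_3": {"ask_how_export_works"},
}

# Intents that make sense anywhere in the tutorial.
ALWAYS_VALID = {"greet", "goodbye", "ask_help"}

def candidate_intents(metadata: dict) -> set:
    """Return the intents worth considering, given the message metadata."""
    step = (metadata or {}).get("tutorial_step")
    return ALWAYS_VALID | INTENTS_BY_STEP.get(step, set())

def filter_ranking(intent_ranking: list, metadata: dict) -> list:
    """Drop intents that cannot occur on the user's current tutorial step."""
    allowed = candidate_intents(metadata)
    return [item for item in intent_ranking if item["name"] in allowed]

ranking = [
    {"name": "ask_where_to_click", "confidence": 0.40},
    {"name": "ask_how_export_works", "confidence": 0.35},
    {"name": "greet", "confidence": 0.10},
]
filtered = filter_ranking(ranking, {"tutorial_step": "tutorial_1/step_2"})
```

The classifier would then only rank the filtered candidates, which is why the metadata has to reach the pipeline before prediction.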

Relevant topics:
https://forum.rasa.com/t/pass-along-users-current-location-on-website-as-a-paramter-to-rasa/25908
https://forum.rasa.com/t/metadata-in-message-in-nlu-pipeline-component/26767

Overview of the Solution:

Metadata for each message in the conversation needs to be available in the NLU pipeline.
Currently, metadata is only available in `rasa.core.channels.UserMessage`, not in `rasa.nlu.training_data.Message`.

In this post https://forum.rasa.com/t/new-training-data-format-ideas/29687 it is mentioned that "support for optional custom metadata in training examples" may come in an upcoming update. Will this also allow providing metadata to the NLU during inference?

Definition of Done:

  • Chat metadata for all messages is available to NLU components.
@cajoek cajoek added area:rasa-oss 🎡 Anything related to the open source Rasa framework type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR labels Jun 29, 2020
@sara-tagger
Collaborator

Thanks for submitting this feature request 🚀 @wochinge will get back to you about it soon! ✨

@cristianmtr
Contributor

My team is also in need of this.

Quoting my colleague, @nbeuchat :

One thing we want to do is pass some geographic context to the bot so that location entities can be properly disambiguated.
For example, if a user is looking for a place in NYC, we want to be able to tell "Downtown" in NY apart from "Downtown" in other cities, so that we get the proper ID for the location. Up until now, we've done this mapping in actions, so it was easy to use slots for the context. The problem is that we want to extend our location parsing using a custom pipeline component that uses spaCy's pattern matcher, but we need the disambiguation before applying the patterns.
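Concretely, the disambiguation described above might be sketched as follows. The place table, IDs, and the metadata key are illustrative assumptions, not a real API:

```python
# Illustrative only: resolve an ambiguous location mention using a
# geographic context key from message metadata. All IDs are made up.

PLACES = [
    {"id": "nyc-downtown", "name": "Downtown", "city": "New York"},
    {"id": "chi-downtown", "name": "Downtown", "city": "Chicago"},
]

def resolve_location(mention: str, metadata: dict):
    """Pick the place whose city matches the user's geographic context."""
    matches = [p for p in PLACES if p["name"].lower() == mention.lower()]
    city = (metadata or {}).get("user_city")
    for place in matches:
        if place["city"] == city:
            return place
    # no contextual match: fall back to the first candidate, if any
    return matches[0] if matches else None

place = resolve_location("Downtown", {"user_city": "New York"})
```

Without the metadata reaching the NLU component, the `user_city` context would have to come from a slot, which is exactly the limitation being discussed.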

@wochinge
Contributor

wochinge commented Jul 8, 2020

Sorry, I was convinced I had already answered 😬

Thanks for bringing up all these different use cases. The one with geographic context in particular is very valid.
I think we should probably pass not just the metadata but the whole conversation tracker (previous messages, bot actions / utterances) into the NLU pipeline.

What do you think @Ghostvv ?

@wochinge wochinge added the type:discussion 👨‍👧‍👦 Early stage of an idea or validation of thoughts. Should NOT be closed by PR. label Jul 8, 2020
@wochinge wochinge changed the title Make metadata aviable in NLU pipline Make metadata available in NLU pipeline Jul 8, 2020
@cristianmtr
Contributor

Yes, having access to the tracker and sender_id would also be very nice! We have added a mechanism for logging by sender_id to our logging system, and seeing which sender has triggered a specific error in a pipeline component would be very useful :)

@Ghostvv
Contributor

Ghostvv commented Jul 8, 2020

The NLU pipeline should be context agnostic; it is used to classify text into abstract notions. Disambiguation should happen through Rasa Core, e.g. disambiguating entities can be done inside a custom action or a form.
The same goes for the website problem: I'd recommend merging similar intents into one, but using a dialogue story that receives different predefined external intents when the user switches pages. Then different actions can be predicted even though the NLU intent is the same.

@cajoek
Author

cajoek commented Jul 9, 2020

Does the NLU pipeline have to be context agnostic? Why?
The ResponseSelector is not so abstract.

@Ghostvv
Contributor

Ghostvv commented Jul 9, 2020

This is the separation of NLU and Core: Core is responsible for context. The ResponseSelector is still context agnostic.
I think if we keep it agnostic, we maintain clarity about what is happening when, which makes it more scalable.

We will fully integrate NLU inside Core in a future update. Then you could easily access NLU components inside Core, and therefore condition their output on the context.

@nbeuchat
Contributor

Disambiguation should happen through rasa core. E.g. disambiguating entities can be done inside custom action or a form.

Correct me if I'm wrong, but custom actions and forms are not really part of Rasa Core, are they? Rasa Core just predicts the next action/form/utterance to use, and that's it. For the disambiguation case, that means you need to either:

  • Have a specific action do the disambiguation and put it before any action that needs the entity to be correctly disambiguated (or linked). You then rely on Rasa Core to predict your disambiguation action before any other action that actually needs the entity, or
  • Do the disambiguation in every single action that wants to use it (this is what we chose to do). That also forces you to use actions instead of utterances in some cases, and you cannot simply use the automatic slot filling from entities.

To me, it forces us to do some simple entity processing way later in the NLU > Core > Action pipeline, which makes it very unclear what is happening when.

I think if we keep it agnostic, we maintain clarity about what is happening when, which makes it more scalable.

I would agree with this statement for the conversation part (i.e., it shouldn't really matter where you are in the conversation from an NLU perspective). However, I feel that entity disambiguation/linking should be done within the NLU pipeline, and some limited context can be needed (geography for entity disambiguation, user locale for date parsing, user country for phone number parsing, etc.).
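As a toy example of that kind of limited context: parsing a bare national phone number needs a country hint, which could come from message metadata. The metadata key and prefix table below are made up (a real implementation would use a proper library such as phonenumbers):

```python
# Toy example of "limited context" parsing: qualify a bare national phone
# number with a country code taken from message metadata. The prefix table
# is a tiny made-up subset, not real telephony data.

COUNTRY_PREFIX = {"US": "+1", "SE": "+46", "CH": "+41"}

def normalize_phone(number: str, metadata: dict) -> str:
    """Return a rough E.164-style string for a user-entered number."""
    digits = "".join(ch for ch in number if ch.isdigit())
    if number.strip().startswith("+"):
        # already has an explicit country code
        return "+" + digits
    prefix = COUNTRY_PREFIX.get((metadata or {}).get("country"), "")
    # drop the national trunk "0" before prepending the country code
    return prefix + digits.lstrip("0")

e164 = normalize_phone("079 123 45 67", {"country": "CH"})
```

The point is that this context is per-user rather than per-conversation-turn, which is why doing it in an NLU component (with metadata) feels more natural than in every action.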

@Ghostvv
Contributor

Ghostvv commented Jul 15, 2020

custom actions and forms are not really part of Rasa core

What do you mean? They are part of Rasa Core, but you need to write them using the SDK.

Have a specific action doing this disambiguation and put it before any action that needs the entity to be correctly disambiguated (or linked).

Why can't you do it inside the action that needs to use it?

geography for entity disambiguation, user locale for date parsing, user country for phone number parsing

Is it always the same within a conversation, but changes from conversation to conversation?
Sorry, then I misunderstood what you mean by context. If it changes within one conversation (for example, a number entity can mean different things in different parts of the conversation), then I think it is better to handle it in Core; otherwise, how do you create training data?
If it is set at the beginning of the conversation and doesn't change throughout, then I agree that predicting a custom action every time is a tedious process. I can envision some default action, similar to action_listen, that is not predicted but is always called after the utterance, executing a custom action from the SDK that is basically responsible for auto slot filling but with additional customizable logic. Would this solve your issue?

@wochinge
Contributor

@nbeuchat What do you think of the proposed solution?

@nbeuchat
Contributor

@nbeuchat What do you think of the proposed solution?

I love it! Very much waiting to have it in Rasa 2.0. I just implemented something last week that would have greatly benefited from the custom auto slot filling. It's also great combined with the new entity groups (for example, to pick or ignore a group when auto-filling a slot).

@wochinge wochinge removed the type:discussion 👨‍👧‍👦 Early stage of an idea or validation of thoughts. Should NOT be closed by PR. label Sep 17, 2020
@wochinge
Contributor

wochinge commented Sep 17, 2020

Very much waiting to have it in Rasa 2.0.

Ehm, about this 🙄 🙈 It's probably going to land in a minor release after 2.0. We are currently focusing on releasing a production-ready Rasa Open Source 2.0 with the features we committed to, and we want to avoid further delays due to newly added features. Getting 2.0 ready, documented, and smooth to use is currently our number one priority. We will return to our regular release cycle (every 4 weeks) once 2.0 is released, so it shouldn't take too long 🤞 I'm sorry to keep you waiting longer.

@nbeuchat
Contributor

No worries at all 😄 In any case, we'll have some work to do to migrate to 2.0 in the first place so I wasn't expecting to have that new feature up and running directly :-) Thanks for the update!

@alwx alwx added the area:rasa-oss/training-data Issues focused around Rasa training data (stories, NLU, domain, etc.) label Jan 28, 2021
@TyDunn
Contributor

TyDunn commented Jan 28, 2021

@wochinge Can we close this after 2.0?

@wochinge
Contributor

wochinge commented Feb 1, 2021

Yes, fixed by #6275

@wochinge wochinge closed this as completed Feb 1, 2021
@dingusagar

The default interpreter in `Agent` is `RasaNLUInterpreter`, but `RasaNLUInterpreter` doesn't have this feature of passing on the metadata in its `parse` method.

@goxiaoy
Contributor

goxiaoy commented Jul 24, 2021

Why is this issue closed?

The metadata is currently not passed from Core to NLU:

```python
async def parse(
    self,
    text: Text,
    message_id: Optional[Text] = None,
    tracker: Optional[DialogueStateTracker] = None,
    metadata: Optional[Dict] = None,
) -> Dict[Text, Any]:
    """Parse a text message.

    Return a default value if the parsing of the text failed."""
    if self.lazy_init and self.interpreter is None:
        self._load_interpreter()
    result = self.interpreter.parse(text)
    return result
```

The current NLU interpreter cannot accept metadata:

rasa/rasa/nlu/model.py

Lines 444 to 449 in 2039cef

```python
def parse(
    self,
    text: Text,
    time: Optional[datetime.datetime] = None,
    only_output_properties: bool = True,
) -> Dict[Text, Any]:
```

Related active forum questions:

https://forum.rasa.com/t/passing-metadata-to-custom-components-nlu/45198
https://forum.rasa.com/t/metadata-in-message-in-nlu-pipeline-component/26767/9

@indam23
Contributor

indam23 commented Nov 24, 2021

The changes in the linked PR don't send metadata all the way down to the method where NLU components would need to access it.
Here in `processor.parse_message`, metadata is available:

rasa/rasa/core/processor.py

Lines 572 to 574 in 7c8aa9f

```python
parse_data = await self.interpreter.parse(
    text, message.message_id, tracker, metadata=message.metadata
)
```

But here in `RasaNLUInterpreter.parse`, which it calls, only the text of the message actually gets passed on:

```python
result = self.interpreter.parse(text)
```

(This is also true of `RasaNLUHttpInterpreter.parse`.)

Eventually this calls `rasa.nlu.model.Interpreter.parse`, which does not accept metadata at all:

rasa/rasa/nlu/model.py

Lines 444 to 449 in 7c8aa9f

```python
def parse(
    self,
    text: Text,
    time: Optional[datetime.datetime] = None,
    only_output_properties: bool = True,
) -> Dict[Text, Any]:
```

This means the `Message` object available to NLU components is not the same object that the processor has access to:

rasa/rasa/nlu/model.py

Lines 467 to 470 in 7c8aa9f

```python
message = Message(data=data, time=timestamp)
for component in self.pipeline:
    component.process(message, **self.context)
```

@indam23 indam23 reopened this Nov 24, 2021
@indam23
Contributor

indam23 commented Nov 24, 2021

The original issue description seems to refer to training-data metadata, which is quite different from incoming-message metadata. However, the linked PR seems to make incoming-message metadata available to the interpreter (although not quite, as above).

@indam23
Contributor

indam23 commented Nov 24, 2021

Apologies, the above is true for Rasa 2.8.x; this is resolved entirely in Rasa 3.0. Metadata is available in each component's `process` method in `message.data["metadata"]`.
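For illustration, reading that metadata inside a component's `process` method could look like the sketch below. `Message` here is a minimal stand-in class, not the real Rasa class, and the metadata key is invented:

```python
# Sketch only: Message is a minimal stand-in for Rasa's Message class,
# and process mimics a component step that reads the incoming metadata.

class Message:
    def __init__(self, data=None):
        self.data = data or {}

def process(messages):
    """Copy a metadata field onto each message for downstream components."""
    for message in messages:
        metadata = message.data.get("metadata") or {}
        message.data["tutorial_step"] = metadata.get("tutorial_step")
    return messages

msgs = process([Message({"text": "Where do I click?",
                         "metadata": {"tutorial_step": "step_2"}})])
```

A downstream component (e.g. a candidate filter like the one described at the top of this issue) could then read the copied field instead of needing the tracker.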

@indam23 indam23 closed this as completed Nov 24, 2021
@mattyd2

mattyd2 commented Apr 20, 2022

@melindaloubser1 Can you provide a link to the docs or code showing where `message.data["metadata"]` is implemented in 3.0?

@indam23
Contributor

indam23 commented Apr 20, 2022

See here: the `UserMessage` object, which includes metadata, is now passed down through to `processor._handle_message_with_tracker`. That allows parsing to access the whole message object as well.
