This repository has been archived by the owner on Jun 4, 2021. It is now read-only.

2.2 NLP

Selene Baez edited this page Sep 29, 2020 · 2 revisions

Overview

When Leolani is turned on, a new Context object is instantiated; it holds the datetime, the location, and the people and objects present. When a nearby person is seen and recognized, a new Chat object is initialized, and each time the person speaks an Utterance object is created and added to the Chat. Each Chat is linked to the person Leolani is speaking to and to the Context in which the conversation takes place, and consists of a list of Utterances. Upon initialization, an Utterance is parsed with the context-free grammar (CFG) and analyzed by the Analyzer class. If an Utterance is a statement, it also has a Perspective object, which consists of polarity, certainty and sentiment values.

The Analyzer is a hierarchy of classes. The topmost class is the abstract class Analyzer, which has two abstract subclasses, StatementAnalyzer and QuestionAnalyzer. The concrete classes are GeneralStatementAnalyzer, WhQuestionAnalyzer and VerbQuestionAnalyzer. These three take the utterance constituents (as parsed with the CFG) as input and map them to intermediate triples, which are then passed on to the Brain and stored as RDF triples.
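The hierarchy can be pictured with a minimal Python sketch. Only the class names come from this page. The `analyze` method, its dictionary argument, the placeholder bodies, and the assignment of each concrete class to a parent (inferred from the class names) are assumptions made for illustration, not the repository's actual code.

```python
from abc import ABC, abstractmethod


class Analyzer(ABC):
    """Abstract root of the hierarchy."""

    @abstractmethod
    def analyze(self, constituents):
        """Hypothetical method: map parsed constituents to a triple dict."""


class StatementAnalyzer(Analyzer):
    """Still abstract, because analyze() is not implemented here."""


class QuestionAnalyzer(Analyzer):
    """Still abstract, because analyze() is not implemented here."""


class GeneralStatementAnalyzer(StatementAnalyzer):
    def analyze(self, constituents):
        # Placeholder: a real analyzer would apply the extraction rules.
        return dict(constituents)


class WhQuestionAnalyzer(QuestionAnalyzer):
    def analyze(self, constituents):
        # A wh-question yields an incomplete triple; the missing
        # element is marked with "?" as in the examples on this page.
        triple = dict(constituents)
        for key in ("subject", "predicate", "complement"):
            triple.setdefault(key, "?")
        return triple


class VerbQuestionAnalyzer(QuestionAnalyzer):
    def analyze(self, constituents):
        # Placeholder: a yes/no question carries all three elements.
        return dict(constituents)
```

Instantiating `Analyzer`, `StatementAnalyzer` or `QuestionAnalyzer` directly raises a `TypeError`, which mirrors the abstract/concrete split described above.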

Triple extraction

The intermediate triples consist of a subject, a predicate and a complement along with their semantic types, plus a Perspective object in the case of statements. For a question, the triple is incomplete. Below are a few examples of intermediate triples as output by the analyzers:

  • “My sister enjoys eating cakes” = lenka-sister_enjoy_eating-cakes
  • “What does my sister enjoy?” = lenka-sister_enjoy_?

As you can see, the elements of the intermediate triple are separated by underscores, while dashes separate the parts of multiword expressions. When a multiword expression is actually a collocation, it is enclosed in quotation marks (e.g. selene_be-from_“mexico-city”), so that the subparts of the collocation are not analyzed separately.
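The notation can be read back mechanically. The following is a small sketch, not the project's parser: the function names `parse_triple` and `split_element` are hypothetical, and it assumes that collocations are wrapped in straight double quotes and that a triple always has exactly three underscore-separated elements.

```python
def split_element(element):
    """Split one triple element into its dash-separated parts."""
    # A quoted collocation is kept whole so its subparts are not
    # analyzed separately.
    if len(element) >= 2 and element[0] == '"' and element[-1] == '"':
        return [element[1:-1]]
    return element.split("-")


def parse_triple(text):
    """Split 'subject_predicate_complement' into three lists of parts."""
    elements = text.split("_")
    if len(elements) != 3:
        raise ValueError("expected exactly three elements: %r" % text)
    return [split_element(element) for element in elements]


print(parse_triple('selene_be-from_"mexico-city"'))
# [['selene'], ['be', 'from'], ['mexico-city']]
print(parse_triple("lenka-sister_enjoy_?"))
# [['lenka', 'sister'], ['enjoy'], ['?']]
```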

The basic rules that the analyzer follows are:

  • predicates are lemmatized verbs, with possible prepositions connected to the verb
    • (“live-in”, “come-from”,…)
  • modal verbs are analyzed using the lexicon and their modality is stored as one of the perspective values
    • “might-come” - {'polarity': 1, 'certainty': '0.5', 'sentiment': 0}
  • negation is removed after processing and stored within the perspective object as polarity
    • I think selene doesn't like cheese = “selene_like_cheese” - {'polarity': -1, 'certainty': '0.75', 'sentiment': '0.75'}
    • I think selene hates cheese = “selene_hate_cheese” - {'polarity': 1, 'certainty': '0.75', 'sentiment': '-1'}
  • properties end with “-is”
    • My favorite color is green = lenka_favorite-color-is_green (this way it is quite easy for NLG)
  • words that refer to a person are grouped together in the subject, unless the verb is just “be”; in that case they are processed like properties (“sister-is”)
    • My best friend is Selene = lenka_best-friend-is_selene
    • My best friend’s name is Selene = lenka-best-friend_name-is_selene
  • adjectives, determiners and numbers are joined with the noun
    • “a-dog”, “the-blue-shirt”, etc.
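As an illustration of the negation rule, here is a minimal sketch that removes the negation from a token list and records it as polarity. The function name `extract_polarity` and the one-word negation list are assumptions; the sketch also assumes contractions such as "doesn't" have already been expanded to "does not", and it does not handle double negation.

```python
NEGATION_TOKENS = {"not"}  # assumed list, for illustration only


def extract_polarity(tokens):
    """Return (tokens without negation, polarity), polarity being 1 or -1."""
    kept = [token for token in tokens if token.lower() not in NEGATION_TOKENS]
    # If any token was dropped, the statement was negated.
    polarity = -1 if len(kept) != len(tokens) else 1
    return kept, polarity


print(extract_polarity(["selene", "does", "not", "like", "cheese"]))
# (['selene', 'does', 'like', 'cheese'], -1)
print(extract_polarity(["selene", "hates", "cheese"]))
# (['selene', 'hates', 'cheese'], 1)
```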

Perspective

Each perspective value is mapped to a label; the conditions are checked in order and the first match wins:

  • certainty_value: between 0 and 1
      if certainty_value > .90: 'CERTAIN'
      elif certainty_value >= .50: 'PROBABLE'
      elif certainty_value > 0: 'POSSIBLE'
      else: 'UNDERSPECIFIED'
  • polarity_value: between -1 and 1
      if polarity_value > 0: 'POSITIVE'
      elif polarity_value < 0: 'NEGATIVE'
      else: 'UNDERSPECIFIED'
  • sentiment_value: between -1 and 1
      if sentiment_value > 0: 'POSITIVE'
      elif sentiment_value < 0: 'NEGATIVE'
      else: 'UNDERSPECIFIED'
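The thresholds above can be written as two small functions. This is a sketch under stated assumptions: the names `certainty_label` and `signed_label` are hypothetical, the conditions are read as a first-match chain, and values are passed through `float()` because some examples on this page store them as strings such as '0.75'.

```python
def certainty_label(value):
    """Map a certainty value in [0, 1] to its label."""
    value = float(value)
    if value > 0.90:
        return "CERTAIN"
    if value >= 0.50:
        return "PROBABLE"
    if value > 0:
        return "POSSIBLE"
    return "UNDERSPECIFIED"


def signed_label(value):
    """Map a polarity or sentiment value in [-1, 1] to its label."""
    value = float(value)
    if value > 0:
        return "POSITIVE"
    if value < 0:
        return "NEGATIVE"
    return "UNDERSPECIFIED"


print(certainty_label("0.75"))  # PROBABLE
print(signed_label(-1))         # NEGATIVE
print(signed_label(0))          # UNDERSPECIFIED
```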

Pipeline

Below is a short summary of the NLP steps performed during utterance analysis:

  1. Removing common openers such as “excuse me” or “leolani, can you tell me”, etc.
  2. Tokenization and replacing contractions with the full forms of auxiliary verbs
  3. POS tagging with NLTK and Stanford (it would be good to add a third tagger to use when the two disagree)
  4. CFG parsing using a manually designed grammar
  5. The Analyzer class maps the output of CFG parsing to the subject-predicate-complement triple, following the rules mentioned above
  6. Lemmatization using NLTK
  7. Modal verbs are analyzed using the lexicon and the resulting modality is stored in the Perspective
  8. Checking whether any multi-word elements are actually collocations such as New York or ice-cream (these should be processed as one word)
  9. Getting the semantic types of each element of the triple, and of its subparts, using the manually built lexicon, WordNet lexnames and Stanford NER
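The first two steps can be illustrated without any NLP library. The sketch below is not the project's code: the function names, the contraction table and the whitespace tokenizer are simplifying assumptions, and only the two openers are taken from this page.

```python
OPENERS = ("excuse me", "leolani, can you tell me")  # examples from step 1

# Assumed contraction table, for illustration only.
CONTRACTIONS = {
    "don't": "do not",
    "doesn't": "does not",
    "isn't": "is not",
    "i'm": "I am",
}


def strip_opener(text):
    """Remove a leading opener, keeping the original casing of the rest."""
    text = text.strip()
    for opener in OPENERS:
        if text.lower().startswith(opener):
            text = text[len(opener):].lstrip(" ,")
    return text


def expand_and_tokenize(text):
    """Expand known contractions, then split on whitespace."""
    expanded = " ".join(
        CONTRACTIONS.get(token.lower(), token) for token in text.split()
    )
    return expanded.split()


tokens = expand_and_tokenize(
    strip_opener("Excuse me, I think Selene doesn't like cheese")
)
print(tokens)
# ['I', 'think', 'Selene', 'does', 'not', 'like', 'cheese']
```

The later steps (POS tagging, CFG parsing, lemmatization, type lookup) depend on NLTK, Stanford tools and the project's own grammar and lexicon, so they are left out of this sketch.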

Sample output

Here is a sample output for the sentence “I have three white cats”:

  • subject: {"text": "Lenka", "type": ["person"]}
  • predicate: {"text": "have", "type": ["verb.possession"]}
  • object: {"text": "three-white-cats", "type": ["adj.all", "noun.animal", "numeral:3"]}
  • utterance type: STATEMENT
  • perspective: {'polarity': 1, 'certainty': 1, 'sentiment': 0}
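To connect this output with the underscore notation used earlier, the sample can be held in a plain dictionary and joined into an intermediate triple string. The function name `to_intermediate_string` and the lowercasing step are assumptions, the latter based on the lowercase examples on this page.

```python
def to_intermediate_string(subject, predicate, complement):
    """Join three element dicts into 'subject_predicate_complement' form."""
    parts = (subject, predicate, complement)
    return "_".join(part["text"].lower() for part in parts)


sample = {
    "subject": {"text": "Lenka", "type": ["person"]},
    "predicate": {"text": "have", "type": ["verb.possession"]},
    "object": {
        "text": "three-white-cats",
        "type": ["adj.all", "noun.animal", "numeral:3"],
    },
    "utterance type": "STATEMENT",
    "perspective": {"polarity": 1, "certainty": 1, "sentiment": 0},
}

triple = to_intermediate_string(
    sample["subject"], sample["predicate"], sample["object"]
)
print(triple)  # lenka_have_three-white-cats
```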

Testing

The best way to see how the NLP works is to run the test_with_triples function, which uses the scenario datasets stored in language/data. If you also want to test perspective extraction, use the language/data/perspective.txt test file. Another option, which also covers reply generation, is the test_scenarios function, which uses scenarios.txt as its test file. There is currently no gold standard that includes types.
