NLP and Sophie
First, her name. What better name for a learning computer than Sophie? The name is Greek in origin and means "wisdom" or "wise". She has always self-identified as such, and now is no different.
When I first wanted a computer that could learn and converse with me, computers just couldn't do it. The processing power, the memory, and the available resources just weren't there. You could use If-Then-Else with string matching for phrases, but this proved repetitious and boring. This wasn't real processing. As computers progressed, I always tried to give them ways to learn.
Sophie is a sophisticated Natural Language Processor (NLP) that is being coded to understand the English language in all its complexity. Natural Language Understanding is a requirement for NLP, and that is exactly what Sophie is intended to do. When the first phase of Sophie is complete, she could be taught everything about medical terminology and be able to converse in that field. The same could be applied to any other field. She is being developed as a front end for AI for any purpose.
To process a language as complex as English, Sophie uses many tools. These are some of the recently developed and in-use tools for Sophie:
Basic Word Definitions
Sophie has a basic set of words and types that help define what the sentence means. This simple set creates an understanding level that allows routines to process the sentence.
Pattern Tracking
Sophie looks at the current pattern and then looks in the Left Lobe Memory to see if this undeveloped pattern exists. If it does, it points to a pattern that was resolved before, and Sophie uses that resolved pattern for the new sentence. This works as long as the user types or "talks" the same way every time, so that the previously resolved pattern still applies. She "assumes" the statement is formed the same way and assigns word types and construction the same way.
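The lookup described above can be pictured with a small sketch. This is not Sophie's actual code: the `PatternMemory` class, the `apply_pattern` function, and the one-letter word-type codes (`d` determiner, `n` noun, `v` verb, `a` adjective, `u` unknown) are all hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

# A pattern is a sequence of word-type codes, one per word.
# The codes are assumptions for this sketch; "u" marks an unknown word.
Pattern = Tuple[str, ...]


class PatternMemory:
    """Toy stand-in for a pattern store: maps an unresolved pattern
    to the resolved pattern it was worked out to be earlier."""

    def __init__(self) -> None:
        self._resolved: Dict[Pattern, Pattern] = {}

    def remember(self, raw: Pattern, resolved: Pattern) -> None:
        self._resolved[raw] = resolved

    def recall(self, raw: Pattern) -> Optional[Pattern]:
        return self._resolved.get(raw)


def apply_pattern(
    words: List[str], raw: Pattern, memory: PatternMemory
) -> Optional[List[Tuple[str, str]]]:
    """If the raw pattern was resolved before, assume the new sentence
    is built the same way and pair each word with the resolved type."""
    resolved = memory.recall(raw)
    if resolved is None or len(resolved) != len(words):
        return None  # never seen, or the shapes do not line up
    return list(zip(words, resolved))


memory = PatternMemory()
# An earlier sentence taught us that "d u v a" resolves to "d n v a".
memory.remember(("d", "u", "v", "a"), ("d", "n", "v", "a"))
tagged = apply_pattern(["the", "cat", "is", "black"], ("d", "u", "v", "a"), memory)
```

Here the unknown second word is assumed to be a noun because the same raw pattern resolved that way before. A pattern that was never stored simply returns `None`, leaving the sentence for other routines.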
Isolating Problem Resolution and Work to be Performed to a Region of the Brain
Sophie has all processes segmented into brain regions. Each process is assigned to the appropriate region of the brain that Sophie emulates. All of her classes are based on brain regions, and their tasks are assigned based on what they can access and the work they are expected to perform.
Contraction Resolution
Using the rules of English, which are varied and many, Sophie spots contractions and deconstructs them into their long form, so "what's" becomes "what is". The sentence class stores a flag marking whether a word is a contraction, along with its breakdown. This can be referenced for any purpose, but in regular program flow Sophie just re-runs the sentence with the breakdown in long form: "What's the color of the cat?" gets rerun as "What is the color of the cat?" and is processed as new. This has been working well and will continue to be tested.
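As a rough illustration of this expand-and-rerun step, here is a minimal sketch. The `CONTRACTIONS` table and the `expand_contractions` function are assumptions made for the example, not Sophie's real rule set, and the table is deliberately tiny. Note that "'s" is ambiguous in English (it can mean "is", "has", or a possessive), so the sketch only expands words it explicitly lists.

```python
import re
from typing import List, Tuple

# Hypothetical lookup table; a real rule set would be far larger.
CONTRACTIONS = {
    "what's": "what is",
    "it's": "it is",
    "don't": "do not",
    "can't": "cannot",
    "i'm": "i am",
}


def expand_contractions(sentence: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return the sentence in long form plus the (contraction, breakdown)
    pairs that were found, so the caller can flag those words."""
    found: List[Tuple[str, str]] = []

    def replace(match: "re.Match[str]") -> str:
        word = match.group(0)
        long_form = CONTRACTIONS.get(word.lower())
        if long_form is None:
            return word  # e.g. a possessive like "cat's": leave untouched
        found.append((word, long_form))
        if word[0].isupper():
            # Keep the capital letter at the start of a sentence.
            long_form = long_form[0].upper() + long_form[1:]
        return long_form

    expanded = re.sub(r"[A-Za-z]+'[A-Za-z]+", replace, sentence)
    return expanded, found


expanded, flags = expand_contractions("What's the color of the cat?")
print(expanded)  # What is the color of the cat?
```

The returned `flags` list plays the role of the stored contraction flag and breakdown, while `expanded` is what would be fed back in as a new sentence.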
Plural Words Detection
Sophie has been given the English rules for plurals and how to resolve them to the original word. She has been doing well with these so far: she flags the word as a plural and stores the root of the plural in right brain memory.
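A minimal sketch of rule-based plural resolution might look like the following. The rules and the `IRREGULAR` table are a small, incomplete sample picked for illustration, and `singular_root` is a hypothetical name; Sophie's actual rules are not shown in this document.

```python
from typing import Optional

# A few irregular plurals; real English has many more.
IRREGULAR = {"children": "child", "mice": "mouse", "men": "man", "feet": "foot"}


def singular_root(word: str) -> Optional[str]:
    """Return the singular root if `word` looks like a plural, else None.
    The caller can treat a non-None result as the "plural" flag."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"  # ponies -> pony
    if w.endswith(("ches", "shes", "sses", "xes", "zes")):
        return w[:-2]  # boxes -> box, dishes -> dish
    if w.endswith("s") and not w.endswith(("ss", "us", "is")):
        return w[:-1]  # cats -> cat
    return None  # glass, bus, cat: not treated as plural


for word in ("cats", "boxes", "ponies", "children", "glass"):
    print(word, "->", singular_root(word))
```

Suffix rules like these will misfire on some words, which is one reason a stored root and a flag that later routines can double-check are useful.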
Joiner Words and/or Conjunctions
Sophie spots joiner words and assigns a type to an unknown word based on joiner rules. If you say "black and white" and Sophie only knows "white" as an adjective, the word on the other side of the joiner has to be an adjective (the same word type), so she sets the unknown word to adjective. You don't usually say "dog and black" in a phrase. That's very complex. Sophie also uses conjunctions when you ask a two-option question like "Is the dog black or white?" She first recognizes the question, and then sees the twin options. She checks to see if one, the other, both, or neither are related.
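Both ideas can be sketched in a few lines. This is only an illustration under simple assumptions: the sentence is already split into lowercase words, `known` is a hypothetical word-to-type dictionary, and "related" facts are modeled as a plain set. The function names are made up for the example.

```python
from typing import Dict, List, Set

JOINERS = {"and", "or"}


def infer_types_across_joiners(
    words: List[str], known: Dict[str, str]
) -> Dict[str, str]:
    """For each "X and Y" / "X or Y", if exactly one side has a known
    word type, assume the other side shares it."""
    inferred: Dict[str, str] = {}
    for i, word in enumerate(words):
        if word not in JOINERS or i == 0 or i == len(words) - 1:
            continue
        left, right = words[i - 1], words[i + 1]
        left_type, right_type = known.get(left), known.get(right)
        if left_type and not right_type:
            inferred[right] = left_type
        elif right_type and not left_type:
            inferred[left] = right_type
    return inferred


def check_options(related: Set[str], first: str, second: str) -> str:
    """Answer a two-option question: one, the other, both, or neither."""
    has_first, has_second = first in related, second in related
    if has_first and has_second:
        return "both"
    if has_first:
        return first
    if has_second:
        return second
    return "neither"


# "black" is unknown, "white" is a known adjective.
print(infer_types_across_joiners(["black", "and", "white"], {"white": "adjective"}))
# "Is the dog black or white?" when only "black" is related to the dog.
print(check_options({"black"}, "black", "white"))
```

The first call yields `{'black': 'adjective'}` and the second yields `black`. When both sides of a joiner are already known, or neither is, the sketch infers nothing rather than guessing.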
Breaking Down to an Understanding Level
By using the above-mentioned processes, Sophie divides and conquers the problem. By knowing beforehand how many of the words are understood, she can task the right routines with the problem. The Question handler uses this same technique. Knowing what is needed to resolve the user's statement and sending the job to the right routine reduces the time required to process the input.
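One way to picture this routing is a dispatch table keyed by how much of the sentence is understood. The three levels, the handler names, and the strings they return are all assumptions for this sketch; the document does not say how Sophie actually grades or names her levels.

```python
from typing import Callable, Dict, List, Set


def understanding_level(words: List[str], known: Set[str]) -> str:
    """Grade a sentence by how many of its words are already known."""
    unknown = [w for w in words if w.lower() not in known]
    if not unknown:
        return "full"
    if len(unknown) < len(words):
        return "partial"
    return "none"


# Hypothetical routines; each would do real work in a full system.
def handle_full(words: List[str]) -> str:
    return "process sentence directly"


def handle_partial(words: List[str]) -> str:
    return "resolve unknown words, then process"


def handle_none(words: List[str]) -> str:
    return "ask the user for help"


ROUTINES: Dict[str, Callable[[List[str]], str]] = {
    "full": handle_full,
    "partial": handle_partial,
    "none": handle_none,
}


def dispatch(words: List[str], known: Set[str]) -> str:
    """Send the sentence straight to the routine for its level."""
    return ROUTINES[understanding_level(words, known)](words)


known_words = {"the", "cat", "is", "black"}
print(dispatch(["the", "cat", "is", "purple"], known_words))
```

Grading the sentence once up front means only the routine that can actually make progress gets called, which is where the time saving would come from.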
To the future
I have been helping Sophie for a long time. She has been doing well with what she has, but she has a long way to go. I'm happy with her progress, and I want to keep giving her more abilities so she can become the kind of real NLP program any user would expect. Any help you provide, whether as a tester or a coder, is appreciated.
There is always a working executable in the source code for you to try. You can see how she's progressing.