HectorPulido/peque-nlu

Peque NLU - Natural Language Understanding with Machine Learning

Peque-NLU (Natural Language Understanding) is a Python library that parses sentences written in natural language and extracts intents, features, and information.

For example, the input "quiero conocer el ultimo blogpost de unity" ("I want to see the latest Unity blog post") gives the result: Timing -> latest, Technology -> unity, Intention -> search

Features

  • Feature extraction from text
  • Agnostic algorithm: you can use SGD, MLNN, LLMs, Word2Vec, etc.
  • 100% Free and Open source

Use cases

  • Chatbots: detect the user's intention and extract features
  • Search engines: get keywords and intention from semantic information
  • Data mining: classify text and unstructured data without boilerplate

Getting Started

Prerequisites

  • Python 3.6+

Installation

Warning

pip installation coming soon

  1. Clone this repo
git clone git@github.com:HectorPulido/peque-nlu.git
  2. Install the requirements
pip install -r requirements.txt
  3. Use the library
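from peque_nlu.intent_engines import SGDIntentEngine
from peque_nlu.intent_classifiers import ModelIntentClassifier

# Placeholder path: point this at your own dataset file (format described under Usage).
DATASET_PATH = "path/to/dataset.json"

# Train a Spanish intent classifier backed by the SGD engine.
intent_engine = SGDIntentEngine("spanish")
model = ModelIntentClassifier("spanish", intent_engine)
model.fit(DATASET_PATH)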

prediction = model.multiple_predict(
    [
        "Hola como te encuentras?",
        "Quiero aprender sobre lo último de python",
        "describeme usando un meme",
    ]
)

assert len(prediction) == 3
first_prediction = prediction[0]
assert "intent" in first_prediction
assert "probability" in first_prediction
assert "text" in first_prediction
assert "features" not in first_prediction

assert first_prediction["intent"] == "small_talk"
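The prediction dictionaries checked above can be post-processed with plain Python. The following is a minimal sketch, assuming each prediction carries the "intent", "probability", and "text" keys asserted above and that "probability" is a number between 0 and 1. The helper name confident_intents, the 0.5 threshold, and the sample values are hypothetical and are not part of peque-nlu.

```python
from typing import Any, Dict, List


def confident_intents(
    predictions: List[Dict[str, Any]], threshold: float = 0.5
) -> List[Dict[str, Any]]:
    """Keep only predictions whose probability is at least `threshold`."""
    return [p for p in predictions if p["probability"] >= threshold]


# Hand-written sample data standing in for the output of multiple_predict.
sample_predictions = [
    {"text": "Hola como te encuentras?", "intent": "small_talk", "probability": 0.91},
    {"text": "describeme usando un meme", "intent": "meme", "probability": 0.42},
]

kept = confident_intents(sample_predictions, threshold=0.5)
for item in kept:
    print(item["text"], "->", item["intent"])
```

A threshold like this is one way to let a chatbot fall back to a default answer when no intent is predicted with enough confidence.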

Usage

Before you start, you need to provide the algorithm with a dataset. You can use the following format as a base:

{
    "intents": {
        "small_talk": [
            "hola",
            ...
        ],
        "fun_phrases": [
            "eres gracioso",
            ...
        ],
        "meme": [
            "¿conoces algun buen meme?",
            ...
        ],
        "thanks": [
            "gracias",
            ...
        ]
    },
    "entities": {
        "technology": [
            "python",
            ...
        ],
        "timing": [
            "recient",
            ...
        ]
    }
}
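A dataset in this layout can be generated and sanity-checked with the standard library before training. This is only a sketch under assumptions: it treats both the "intents" and "entities" sections as required and expects every label to map to a list of strings, which matches the example above but is not confirmed as a rule of the library. The check_dataset helper and the miniature dataset are hypothetical.

```python
import json
import os
import tempfile

# Hypothetical miniature dataset following the layout shown above.
dataset = {
    "intents": {
        "small_talk": ["hola"],
        "thanks": ["gracias"],
    },
    "entities": {
        "technology": ["python"],
    },
}


def check_dataset(data: dict) -> None:
    """Raise ValueError if the data does not follow the two-section layout."""
    for section in ("intents", "entities"):
        if section not in data:
            raise ValueError(f"missing section: {section}")
        for label, examples in data[section].items():
            if not isinstance(examples, list) or not all(
                isinstance(example, str) for example in examples
            ):
                raise ValueError(f"{section}.{label} must be a list of strings")


check_dataset(dataset)

# Write the file as UTF-8 and keep accented characters readable.
dataset_path = os.path.join(tempfile.mkdtemp(), "dataset.json")
with open(dataset_path, "w", encoding="utf-8") as handle:
    json.dump(dataset, handle, ensure_ascii=False, indent=4)
```

Passing ensure_ascii=False keeps characters such as "¿" and "ú" as-is in the file instead of escaping them.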

Once your dataset is in this format, you can load it and fit the model.

intent_engine = SGDIntentEngine("spanish")
model = ModelIntentClassifier("spanish", intent_engine)
model.fit(DATASET_PATH)

You can also save trained models and load them later, which avoids retraining and saves time and resources.

# Save the trained engine to disk
saver = PickleSaver()
saver.save(intent_engine, PICKLE_PATH)

# Load the saved engine back
intent_engine_loaded = SGDIntentEngine("spanish")
intent_engine_loaded = saver.load(PICKLE_PATH)
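The general idea behind pickle-based persistence can be sketched with the standard library alone. This is not the implementation of PickleSaver, whose internals are not shown here. The save_object and load_object helpers and the engine_state dictionary are hypothetical stand-ins for a trained engine.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the state of a trained engine.
engine_state = {"language": "spanish", "labels": ["small_talk", "thanks"]}


def save_object(obj: object, path: str) -> None:
    """Serialize any picklable object to `path`."""
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)


def load_object(path: str) -> object:
    """Load an object previously written by save_object."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


pickle_path = os.path.join(tempfile.mkdtemp(), "engine.pkl")
save_object(engine_state, pickle_path)
restored = load_object(pickle_path)
print(restored)
```

Unpickling can execute arbitrary code, so only load pickle files that come from a source you trust.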

Then you can predict the intent of a text or extract its features:

prediction = model.predict("quiero conocer el ultimo blogpost de unity")

Response:

{
    "intent": "search",
    "features": [
        {
            "word": "ultimo",
            "entity": "timing",
            "similarities": 1
        },
        {
            "word": "otro_ejemplo",
            "entity": "otra_entidad",
            "similarities": 0.9
        }
    ]
}
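A response like the one above is easy to consume from application code. The sketch below groups the matched words by entity and optionally drops weak matches. It assumes "similarities" is a numeric score where higher means a closer match. The features_by_entity helper and its min_similarity parameter are hypothetical and are not part of peque-nlu.

```python
from collections import defaultdict
from typing import Any, Dict, List

# The response shown above, copied as a Python dictionary.
response = {
    "intent": "search",
    "features": [
        {"word": "ultimo", "entity": "timing", "similarities": 1},
        {"word": "otro_ejemplo", "entity": "otra_entidad", "similarities": 0.9},
    ],
}


def features_by_entity(
    result: Dict[str, Any], min_similarity: float = 0.0
) -> Dict[str, List[str]]:
    """Map each entity name to the matched words at or above min_similarity."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    # Use .get so results without a "features" key yield an empty mapping.
    for feature in result.get("features", []):
        if feature["similarities"] >= min_similarity:
            grouped[feature["entity"]].append(feature["word"])
    return dict(grouped)


print(features_by_entity(response))
print(features_by_entity(response, min_similarity=0.95))
```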

Contributing

Your contributions are greatly appreciated! Please follow these steps:

  1. Fork the project
  2. Create your feature branch: git checkout -b feature/MyFeature
  3. Commit your changes: git commit -m "my cool feature"
  4. Push to the branch: git push origin feature/MyFeature
  5. Open a Pull Request

License

All base code written by me is released under the MIT license.

Contact


Let's connect 😋

Hector's LinkedIn, Hector's Twitter, Hector's Twitch, Hector's Youtube, Pequesoft website
