
2.1 Framework


Applications & Events

This package revolves around Applications, which program some robot behaviour. Applications provide Event callbacks for each sensory experience the robot 'perceives'. These Events include:

  • on_image for every frame captured by robot camera
  • on_object for each object detected in a camera frame
  • on_face for each face detected in a camera frame
  • on_person when detected face is 'known' to the robot
  • on_new_person when detected face is 'new' to the robot
  • on_transcript for every utterance that can be resolved into text

Not every Application needs all available features, and some features have multiple implementations: for example, an open-source Speech-to-Text engine instead of Google's. For this reason, features are packaged as Components. When you create an Application, you specify which Components you need by inheriting from them, and the corresponding features are provided to you (see the sketch after the list below). Available Components are:

  • FaceDetection: exposes the on_face, on_person & on_new_person events
  • ObjectDetection: exposes the on_image & on_object events
  • SpeechRecognition: exposes the on_transcript event
  • Statistics: shows live statistics in terminal
  • VideoDisplay: shows video feed + Object & Person overlay in browser
  • etc.
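
For example, an Application that only needs speech input can mix in just the SpeechRecognitionComponent. The following is a minimal sketch based on the examples further down this page (the hypotheses[0].transcript access follows the Intention example below); adapt the Component list to your own needs:

from pepper.framework import *
from pepper import config


class SpeechOnlyApplication(Application, SpeechRecognitionComponent):
    def on_transcript(self, hypotheses, audio):
        # React to the most likely transcription of the heard utterance
        print(hypotheses[0].transcript)


if __name__ == '__main__':
    SpeechOnlyApplication(config.get_backend()).run()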

In order to run Applications on multiple Backends, abstractions have been made for all Backend-specific Devices:

  • AbstractCamera
  • AbstractMicrophone
  • AbstractTextToSpeech

Backend implementations have been made for:

  1. Naoqi Backend: using the Camera / Microphone / AnimatedSpeech from Pepper / Nao
  2. Laptop / PC Backend: using the built-in Webcam / Microphone / Google Text-to-Speech.

Being able to run Applications on your local PC means you can test them without needing a robot, which speeds up development.

Adding new Backends

Any Hardware that can provide a:

  • Video Feed
  • Audio Feed
  • Speaker/Speech Output

can be a potential Backend for our Pepper Package! Please delve into the source code and implement AbstractCamera, AbstractMicrophone & AbstractTextToSpeech for your favourite hardware. Also, please send us a pull request if you do!
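
A skeleton of such a Backend implementation might look like the sketch below. The class names are placeholders, and importing the Abstract* classes via pepper.framework is an assumption; the exact abstract methods to implement are defined by AbstractCamera, AbstractMicrophone and AbstractTextToSpeech in the source code:

from pepper.framework import *  # assumed to expose the Abstract* device classes


class MyHardwareCamera(AbstractCamera):
    # Implement the abstract methods here: grab frames from your video feed
    # and hand them to the framework.
    pass


class MyHardwareMicrophone(AbstractMicrophone):
    # Implement the abstract methods here: stream audio samples from your
    # audio feed into the framework.
    pass


class MyHardwareTextToSpeech(AbstractTextToSpeech):
    # Implement the abstract methods here: synthesize the given text and
    # play it through the speaker/speech output.
    pass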

In Practice

Writing an Application is straightforward and requires just a few steps:

  1. Create an Application that inherits from pepper.framework.Application
  2. Add required Components by inheriting from them
    • Order matters here, because of Component dependencies
  3. Run the Application with a specific Backend

See pepper/test/app/verbose.py for a minimal working example.

from pepper.framework import *
from pepper import config


class MyApplication(Application, ObjectDetectionComponent, FaceDetectionComponent, SpeechRecognitionComponent):
    def on_image(self, image):
        # Called for every frame captured by the robot camera
        pass

    def on_object(self, image, objects):
        # Called for each object detected in a camera frame
        pass

    def on_face(self, faces):
        # Called for each face detected in a camera frame
        pass

    def on_person(self, persons):
        # Called when a detected face is 'known' to the robot
        pass

    def on_new_person(self, persons):
        # Called when a detected face is 'new' to the robot
        pass

    def on_transcript(self, hypotheses, audio):
        # Called for every utterance that can be resolved into text
        pass


if __name__ == '__main__':
    # Run the Application on the Backend configured in pepper.config
    MyApplication(config.get_backend()).run()

Intentions

When Applications get bigger, the need for more structure arises. That is where Intentions come in. Within each Application, the user programs one or more Intentions (the 'I' in BDI), which act as subgoals within the Application. Constructing an Intention with the running Application makes it the active handler of events, so Intentions can hand control over to one another. An example is demonstrated below.

See pepper/test/app/intention.py for a minimal working example.

from pepper.framework import *
from pepper import config


class MyApplication(Application, StatisticsComponent, FaceDetectionComponent, SpeechRecognitionComponent):
    pass


class IdleIntention(Intention, MyApplication):
    def on_face(self, faces):
        # A face has been seen: hand control over to the TalkIntention
        TalkIntention(self.application)


class TalkIntention(Intention, MyApplication):
    def __init__(self, application):
        super(TalkIntention, self).__init__(application)

        # Greet the person as soon as this Intention becomes active
        self.say("Hello, Human!")

    def on_transcript(self, hypotheses, audio):
        # Take the most likely transcription of the heard utterance
        utterance = hypotheses[0].transcript

        if utterance == "bye bye":
            # Say goodbye and fall back to the IdleIntention
            self.say("Goodbye, Human!")
            IdleIntention(self.application)
        else:
            self.say("How interesting!")


if __name__ == '__main__':

    # Initialize Application
    application = MyApplication(config.get_backend())

    # Run Intention
    IdleIntention(application)

    # Run Application
    application.run()