This is the code for the paper "Perception-Action-Learning System for Mobile Social-Service Robots using Deep Learning" (AAAI 2018).
This code was used in RoboCup@Home 2017, where team AUPAIR won 1st place in the Social Standard Platform League (SSPL) - https://www.robocup2017.org/file/awards/Awards_RoboCup_athome.pdf
AUPAIR team website: https://bi.snu.ac.kr/Robocup/2017/index.html
Youtube link: https://goo.gl/Pxnf1n
If you find this code useful in your research, please cite (available after Feb 06, 2018):
@inproceedings{PALs2018,
title={Perception-Action-Learning System for Mobile Social-Service Robots using Deep Learning},
author={Lee, Beom-Jin and Choi, Jinyoung and Lee, Chung-Yeon and Park, Kyung-Wha and Choi, Sungjun and Han, Cheolho and Han, Dong-Sig and Baek, Christina and Emaase, Patrick and Zhang, Byoung-Tak },
booktitle={AAAI},
year={2018}
}
Upgraded version of perception module is on: https://github.com/gliese581gg/IPSRO
install tmux
sudo apt-get update
sudo apt-get install -y python-software-properties software-properties-common
sudo add-apt-repository -y ppa:pi-rho/dev
sudo apt-get update
sudo apt-get install -y tmux=2.0-1~ppa1~t
- UBUNTU 14.04 for ROS-INDIGO
- After cloning this repository
-- download https://pjreddie.com/media/files/yolo.weights
-- put it into python_script/bin
-- download http://posefs1.perception.cs.cmu.edu/Users/ZheCao/pose_iter_440000.caffemodel
-- put it into python_script/pose_model/model/_trained_COCO
-- download http://posefs1.perception.cs.cmu.edu/Users/ZheCao/pose_iter_146000.caffemodel
-- put it in src/pose_model/model/_trained_MPI/
-- download https://www.dropbox.com/s/ae071mfm2qoyc8v/pose_model.pth?dl=0
-- put it in src/pose_model/
-- download http://cs.stanford.edu/people/jcjohns/densecap/densecap-pretrained-vgg16.t7.zip
-- unzip it
-- put unzipped file in src/captioning_model/data/models/densecap/
Install the following from their websites:
- NVIDIA driver 375.20
- CUDA 8.0
- CUDNN 5.1
- Tensorflow r1.1
Install the following:
- Torch from http://torch.ch/docs/getting-started.html (choose 'yes' when the installer asks about updating your PATH)
- SpeechRecognition
pip install SpeechRecognition
- ROS Indigo desktop-full from http://blog.naver.com/gliese581gg/220645607537 or http://wiki.ros.org/indigo/Installation/Ubuntu
- naoqi (add the line below to ~/.bashrc)
export PYTHONPATH=${PYTHONPATH}:~/'YOUR_PATH'/naoqi/lib/python2.7/site-packages
- sshpass
apt-get install sshpass
- others
sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose
pip install Cython
sudo apt-get install libhdf5-dev libblas-dev liblapack-dev gfortran
pip install h5py
pip install keras
darkflow from https://github.com/thtrieu/darkflow (install option 3)
apt-get install ros-indigo-navigation
apt-get install ros-indigo-gmapping
apt-get install ros-indigo-pepper-*
luarocks install nn
luarocks install image
luarocks install lua-cjson
luarocks install https://raw.githubusercontent.com/qassemoquab/stnbhwd/master/stnbhwd-scm-1.rockspec
luarocks install https://raw.githubusercontent.com/jcjohnson/torch-rnn/master/torch-rnn-scm-1.rockspec
luarocks install cutorch
luarocks install cunn
luarocks install cudnn
luarocks install md5
luarocks install --server=http://luarocks.org/dev torch-ros
pip install http://download.pytorch.org/whl/cu80/torch-0.1.12.post2-cp27-none-linux_x86_64.whl
pip install torchvision
pip install nltk
python -c "import nltk; nltk.download('punkt')"
pip install pattern
pip install google-cloud
- tmux
sudo apt-get update
sudo apt-get install -y python-software-properties software-properties-common
sudo add-apt-repository -y ppa:pi-rho/dev
sudo apt-get update
sudo apt-get install -y tmux=2.0-1~ppa1~t
Merge the included catkin_ws src into your own catkin workspace, then build it:
catkin_make
import pepper_io
pio = pepper_io.pepper_io()
- Note that only one pio instance should be created. Do not make multiple instances.
- You must put your script in the included 'python_script' folder
- Run every command with sudo, or become root with sudo -s.
- Launch modules using Tmux
# 1. launch modules
cd {YOUR_PATH}/python_scripts
./run_pepper.sh {pepper's ip}
#(example : ./run_pepper.sh 192.168.1.176)
# 2. run your script in the right-most tmux pane
python ~~.py --ip 192.168.1.176
- Manually run each module
#1. Start Driver
roslaunch pal_pepper pepper_start_jy.launch nao_ip:=192.168.1.~
#2. Start navigation module
roslaunch pal_pepper pepper_navigation.launch map_file:=full/path/to/your_yaml_file
or edit pepper_navigation.launch's default map loading file
roslaunch pal_pepper pepper_navigation.launch
#Use '2D Pose Estimate' to set the robot's initial location, then move it somewhere with '2D Nav Goal'.
#Pepper will localize itself.
#3. run deep learning modules
cd {YOUR_PATH}/python_scripts
python obj_detector.py --ip 192.168.1.~~
python reid_module.py
python pose_detector.py
cd captioning
th run_ros2.lua
- See objs.msg for the supported fields. For example, int32 x is the x coordinate of the bounding-box center.
string class_string : object class
int32 x : center X of the bounding box
int32 y : center Y of the bounding box
int32 h : height of the bounding box
int32 w : width of the bounding box
float64 confidence : class score
sensor_msgs/Image cropped : cropped image
sensor_msgs/Image cropped_cloud : cropped point cloud
geometry_msgs/Pose pose_wrt_robot : position of the object wrt the robot
geometry_msgs/Pose pose_wrt_map : position of the object wrt the map
geometry_msgs/Pose pose_wrt_odom : position of the object wrt the odometry
int32 person_id : id of a person after person identification
string person_name : (deprecated) name of a person
float64 reid_score : confidence of the re-identification of a person
int32 isWaving : (deprecated) 1 if the person is waving his/her hands, 0 otherwise
int32 isSitting : (deprecated) 1 if the person is sitting
int32 isRwaving : (deprecated) 1 if the person is waving the right hand
int32 isLwaving : (deprecated) 1 if the person is waving the left hand
int32 isLying : (deprecated) 1 if the person is lying down
int32 isLpointing : (deprecated) 1 if the person is pointing with the left hand
int32 isRpointing : (deprecated) 1 if the person is pointing with the right hand
string ucolor : (deprecated) color of the person's upper clothing
string lcolor : (deprecated) color of the person's lower clothing
float64[] joints : list of floats representing the positions of the joints; the index pairs are 0-1 nose, 2-3 neck, 4-5 r_shoulder, 6-7 r_elbow, 8-9 r_wrist, 10-11 l_shoulder, 12-13 l_elbow, 14-15 l_wrist, 16-17 r_pelvis, 18-19 r_knee, 20-21 r_ankle, 22-23 l_pelvis, 24-25 l_knee, 26-27 l_ankle, 28-29 r_eye, 30-31 l_eye, 32-33 r_ear, 34-35 l_ear
string[] captions : captions generated by the captioning module
string[] tags : useful tags for the object; in the current version, tags contain the object class, 'waving', 'rwaving', 'lwaving', 'man', 'woman', 'sitting', 'lying', 'lpointing', 'rpointing', 'blue', 'green', 'red', 'white', 'black', and the person's name
The deprecated variables still work, but I highly recommend using 'tags' instead.
- See obj_array.msg.
Header header : contains the timestamp etc.; no need to modify
objs[] objects : list of objs.msg
int32 msg_idx : index of the message (used for perception integration)
sensor_msgs/Image scene_rgb : RGB image of the whole scene
sensor_msgs/Image scene_cloud : point cloud of the whole scene
- And the pose class, which is returned by pose_wrt_map, pose_wrt_robot and pose_wrt_odom:
# geometry_msgs/Pose
Point position
Quaternion orientation
position consists of float64 x, float64 y, float64 z
orientation consists of float64 x, float64 y, float64 z, float64 w
def main():
    pio = pepper_io(ip='192.168.1.176')

    test_objects = pio.get_perception()
    header = test_objects.header
    objects = test_objects.objects

    for obj in objects:
        print obj.class_string                              # class of the object
        pose = obj.pose_wrt_map                             # position wrt the map
        position_x = pose.position.x
        orientation_x = pose.orientation.x
        # obj.cropped is a ROS image; convert it to a numpy image.
        cropped_image = pio.rosimg_to_numpyimg(obj.cropped)
        tags = obj.tags

    return None
pio.get_perception(fil=None, reid=True, pose=True, captioning=True, time_limit=3.0)
- fil : list of classes to find; finds all classes if None (example : fil=['person','chair'])
- reid, pose, captioning : if True, the output contains that information
- time_limit : returns an empty obj_array if the latest information is older than this many seconds
Get the integrated perception (objects, people's names, poses, captions). Returns an objs_array instance.
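As a quick illustration, a minimal sketch of filtering the integrated perception for people; it assumes a pio instance was created as in the example above and only uses fields documented in objs.msg:

```python
# Sketch: ask the integrated perception for people only.
people = pio.get_perception(fil=['person'], reid=True, pose=True, captioning=True)
for person in people.objects:
    print person.person_id, person.tags   # id and tags filled in by the re-id / pose modules
    print person.pose_wrt_map.position.x  # where the person is on the map
```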
pio.get_objects(fil=None, time_limit=3.0)
pio.get_people_identified(waving_only=False, time_limit=3.0)
pio.get_people_wavings(name=None, time_limit=3.0)
- fil : list of classes to find; finds all classes if None (example : fil=['person','chair'])
- waving_only : if True, only waving people are returned
- name : list of strings; if not None, only the specified people are found
- time_limit : returns an empty obj_array if the latest information is older than this many seconds
Get object information from the individual modules (object detection, identification, pose detection).
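A hedged sketch of the per-module getters; the class 'chair' and the name 'John' are placeholders, and the pio instance is assumed to exist as above:

```python
# Sketch: query the individual modules instead of the integrated perception.
chairs = pio.get_objects(fil=['chair'], time_limit=3.0)        # object detection only
waving_people = pio.get_people_identified(waving_only=True)    # identified people who are waving
johns = pio.get_people_wavings(name=['John'], time_limit=3.0)  # pose module, restricted to 'John'
print len(chairs.objects), len(waving_people.objects), len(johns.objects)
```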
pio.get_captions()
Get captions of the whole scene. Captions for people are automatically stored in obj.captions, so do not use this to extract people's captions.
pio.save_waypoint(filename)
- filename : a file name without a path, e.g. tour_guid.txt
Save the waypoints to a file.
pio.load_waypoints(filename)
- filename : a file name without a path, e.g. tour_guid.txt
Reset the current waypoints and load waypoints from a file.
pio.add_waypoint(name, location=None)
- name : string
- location : a tuple, list or numpy array consisting of three floats for x, y, direction in the map. The default value is the robot's current location.
Add a waypoint.
pio.go_to_waypoint(name, wait=True, clear_costmap=False)
- name : string
- wait : the function returns immediately if False, else it blocks until the robot reaches the goal.
- clear_costmap : reset temporarily detected obstacles if True, else use the previous costmap.
Go to a waypoint.
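Putting the waypoint calls together, a minimal sketch (the waypoint names and coordinates are placeholders; the file name follows the example above):

```python
# Sketch: record, persist and reuse waypoints.
pio.add_waypoint('kitchen')                         # remember the robot's current location
pio.add_waypoint('door', location=(1.0, 2.0, 0.0))  # x, y, direction in the map (placeholder values)
pio.save_waypoint('tour_guid.txt')                  # save all waypoints to a file
pio.load_waypoints('tour_guid.txt')                 # reset and reload them later
pio.go_to_waypoint('kitchen', wait=True, clear_costmap=True)
```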
pio.get_loc(p=np.array([0,0,0]), o=np.array([0,0,0,0]), source='CameraTopFrame', target='map')
- p : position tuple, list or numpy array consisting of three floats for x, y, z wrt the robot. z must always be 0.
- o : orientation tuple, list or numpy array consisting of four floats for x, y, z, w wrt the robot; this is a quaternion.
- source : source frame
- target : target frame
Convert a location in the source coordinate system to the corresponding location in the target coordinate system.
The orientation must be a quaternion (see also pio.yaw_to_quat and pio.quat_to_yaw).
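For example, a sketch of converting a point in the camera frame into map coordinates; the one-meter offset and the identity quaternion are placeholder values:

```python
import numpy as np

# Sketch: express a point given in the source frame in map coordinates.
map_pose = pio.get_loc(p=np.array([1.0, 0.0, 0.0]),       # 1 m ahead (placeholder), z must be 0
                       o=np.array([0.0, 0.0, 0.0, 1.0]),  # identity quaternion
                       source='CameraTopFrame', target='map')
```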
pio.global_localization()
Stabilize the robot's position estimate by rotating several times.
pio.init_speech_recognition(sensitivity=0.3)
- sensitivity : 0 ~ 1
Initialize speech recognition. After this, speech is detected continuously.
See also pio.set_sound_sensitivity(sensitivity=0.9).
pio.start_recording(reset=False, base_duration=3.0)
- reset : start a new recording session if True, or add to the existing one if False
- base_duration : minimum duration of the recording; only used when reset is True.
Manually detect a speech starting from now.
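A small sketch of the speech pipeline; it assumes the Google credentials described at the end of this document are set up, that 'hello' is just a placeholder keyword, and that pio.find_ord returns a truthy value when the word was heard:

```python
# Sketch: listen continuously and react to a keyword.
pio.init_speech_recognition(sensitivity=0.3)        # speech is now detected continuously
pio.start_recording(reset=True, base_duration=3.0)
if pio.find_ord('hello'):                           # placeholder keyword
    pio.say('Hello! Nice to meet you.')
```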
pio.get_rgb()
Returns the camera image as a numpy array.
See also pio.get_depth() and pio.get_point_cloud().
The size of the image is always 320 (width) x 240 (height).
pio.get_depth()
Returns the depth image as a numpy array.
pio.get_point_cloud()
Returns the point cloud as a numpy array.
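A minimal sketch of grabbing the camera data; the expected array shape is an assumption based on the 320x240 image size stated above:

```python
# Sketch: fetch the current RGB image, depth image and point cloud as numpy arrays.
rgb = pio.get_rgb()
depth = pio.get_depth()
cloud = pio.get_point_cloud()
print rgb.shape, depth.shape  # presumably 240 rows x 320 columns
```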
pio.get_pos_wrt_robot(x, y, size=10, scan_len=50, scan='point')
- x, y : position of the pixel of interest
- size : if scan is 'point', the function returns the nearest point within the surrounding area of size size around x, y
- scan_len : if scan is 'line', the function returns the nearest point in the vertical line from x, y to x+scan_len, y
- scan : 'point' or 'line'
Get the position of a specific pixel in meters, scanning either the surrounding area or a vertical line.
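For instance, a sketch querying the metric position around the image center (the pixel coordinates are placeholders for the 320x240 image):

```python
# Sketch: nearest 3D point around the center pixel, and along a vertical scan line.
center_point = pio.get_pos_wrt_robot(160, 120, size=10, scan='point')
line_point = pio.get_pos_wrt_robot(160, 120, scan_len=50, scan='line')
```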
pio.do_anim(command)
#example : pio.do_anim('Gestures/Hey_1')
- command : the name of the gesture as a string.
Play a gesture animation. The available gestures are listed at http://doc.aldebaran.com/2-4/naoqi/motion/alanimationplayer-advanced.html#alanimationplayer-advanced.
pio.do_pose(command)
- command : the name of the posture as a string, one of Crouch, LyingBack, LyingBelly, Sit, SitRelax, Stand, StandInit, StandZero.
Change the posture of the robot.
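A two-line sketch combining both calls, using a gesture and a posture named above:

```python
# Sketch: greet with a gesture, then return to a standing posture.
pio.do_anim('Gestures/Hey_1')
pio.do_pose('Stand')
```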
pio.go_to_goal(x, y, theta, wait=True, clear_costmap=False)
- x, y, theta : the position of the goal wrt the map; theta is in radians.
- wait : the function returns immediately if False, else it blocks until the robot reaches the goal.
- clear_costmap : reset temporarily detected obstacles if True, else use the previous costmap.
Move the robot to a location.
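For example, a sketch with placeholder map coordinates:

```python
import math

# Sketch: drive to (1.5, 0.5) on the map, facing 90 degrees, and wait until the goal is reached.
pio.go_to_goal(1.5, 0.5, math.pi / 2, wait=True, clear_costmap=True)
```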
pio.set_velocity(x, y, theta, duration=-1.)
- x, y : the velocities in the forward-backward and left-right directions.
- theta : the angular velocity of rotation.
- duration : the duration of the movement. NOTE: if not specified, the robot keeps moving in the given direction forever. Collision hazard!
Move the robot in a specific direction at the given velocity for a duration.
pio.stop()
Stop ongoing movement.
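A cautious sketch; the velocity value is a placeholder, and a duration is always passed to avoid the endless-motion case warned about above:

```python
# Sketch: move forward slowly for two seconds, then make sure the robot is stopped.
pio.set_velocity(0.1, 0.0, 0.0, duration=2.0)
pio.stop()
```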
pio.activate_keyboard_contol()
Control the robot using the keyboard.
Usage: type one of the following commands and press enter.
The robot maintains its velocity until you give another command.
Commands:
w : forward
s : stop
x : backward
a : strafe left
d : strafe right
q : turn left
e : turn right
say : say next input
waypoint : add current location to waypoints
save_waypoint : save all waypoints
add_waypoint : add current location to waypoints
go_waypoint : go to waypoint
add_reid_target : add reid target
save_reid_targets : save reid target
sr : speech recognition
c : exit
pio.say(text)
- text : string to say.
Say a text. Speech recognition is paused while speaking.
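A small sketch combining perception and speech output; it only uses calls documented above:

```python
# Sketch: announce how many people the robot currently perceives.
people = pio.get_perception(fil=['person'])
pio.say('I can see %d people.' % len(people.objects))
```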
pio.rosimg_to_numpyimg(img_msg)
- img_msg : the ROS image to convert.
Convert a ROS image to a numpy image.
See also pio.numpyimg_to_rosimg(npimg).
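A sketch of the round trip between ROS and numpy images, reusing the cropped image from a detected object:

```python
# Sketch: ROS image -> numpy image -> ROS image.
objects = pio.get_objects(fil=['person'])
if objects.objects:
    crop_np = pio.rosimg_to_numpyimg(objects.objects[0].cropped)  # ROS image to numpy array
    crop_ros = pio.numpyimg_to_rosimg(crop_np)                    # and back to a ROS image
```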
pio.find_ord(word, source=None)
- word : the word to find.
- source : the speech source to search; uses the latest recognized speech if not specified.
Find a word in a recorded speech.
pio.follow_person(target, target_dist=1.0, fail_dist_threshold=1.0, dist_strict=False, timeout=60, stop_criterion='dist', use_reid=False, reid_name=None, stop_word='stop', reid_strict=False, reid_add=False, score_threshold=-10, max_fail_count=20, short_mode_thr=1.5)
- target : the target to follow, as an objs instance.
- target_dist : following distance in meters.
- timeout : fails if the stop criterion is not satisfied within timeout seconds.
- stop_criterion : 'dist' for distance or 'speech' for a word.
- use_reid : follow the closest person by location if False, follow the identified person if True.
- reid_name : follow the person whose name is reid_name; otherwise the person is named 'follow'.
- stop_word : the word that makes the robot stop following if stop_criterion is 'speech'.
Follow a person until the stop criterion is satisfied.
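A sketch of a typical following call; the thresholds are placeholders and the target is taken from the integrated perception:

```python
# Sketch: follow the first perceived person until he/she says 'stop'.
people = pio.get_perception(fil=['person'])
if people.objects:
    pio.follow_person(people.objects[0], target_dist=1.0,
                      stop_criterion='speech', stop_word='stop', timeout=60)
```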
To use Google Speech Recognition, you need a Google account.
Link: https://cloud.google.com/speech/docs/auth
Add the authorized JSON file to the speech_auth folder.
Unindent self.speech_client = speech.Client.from_service_account_json('speech_auth/xxx.json') in pepper_io.py.
The most recently recognized speech is stored as a string.
A list of strings is used as hints for speech recognition.