Multi-task Instruction Agent

Introduction

Reinforcement learning techniques have improved robotic manipulation. Our research introduces an agent that navigates a robotic manipulator through diverse tasks. Unlike traditional approaches, our method transitions seamlessly between activities using visual input and linguistic instructions.

(Figure: base)

Contact: Ahmed Mansour - ahmed_salah1996@yahoo.com

Components

Meta-World Environment

  • Description: An open-source benchmark for meta-reinforcement learning, consisting of 50 distinct robotic manipulation tasks.
  • Use case modification: 10 tasks were selected, with the simulator modified to handle three tasks concurrently.
  • Data Generation: Utilized SAC agents for generating datasets encompassing visual information, state observations, actions, rewards, and success flags.
  • Tasks: button-press-topdown-v2, button-press-v2, door-lock-v2, door-open-v2, drawer-open-v2, window-open-v2, faucet-open-v2, faucet-close-v2, handle-press-v2, coffee-button-v2.
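
For readers unfamiliar with Meta-World, the snippet below is a minimal sketch of loading one of the listed tasks with the public Meta-World API; it uses the stock single-task benchmark, not this repo's modified 3-task simulator, and the exact return signature of step()/reset() depends on the installed Meta-World and gym/gymnasium versions.

```python
import random
import metaworld

# Minimal sketch: load one of the listed tasks with the public Meta-World API.
# This is the stock single-task benchmark, not this repo's modified 3-task simulator.
ml1 = metaworld.ML1("button-press-v2")
env = ml1.train_classes["button-press-v2"]()      # instantiate the manipulation environment
env.set_task(random.choice(ml1.train_tasks))      # pick a concrete goal configuration

obs = env.reset()
for _ in range(100):
    action = env.action_space.sample()            # random actions, just to exercise the env
    obs, reward, done, info = env.step(action)    # newer gymnasium builds return 5 values instead
    if info.get("success", 0.0):                  # Meta-World reports task success via info
        break
```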

Algorithms Overview

Results:

Models Success Rates Comparison

| Model | Success Rate | Parameters |
| --- | --- | --- |
| Dataset (SAC generated) | 84.0% ± 6.6% | N/A |
| BC CLIP Model | 74.0% ± 9.0% | 152 M |
| BC CLIP Model with Cross-Attention Neck | 82.9% ± 6.2% | 160 M |
| BC CLIP Model with FiLM Neck | 87.2% ± 10.0% | 170 M |
| IDT model with CLIP FiLM | 89.4% ± 7.0% | 170.81 M |
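
The numbers above are reported as mean ± standard deviation of success rates. Purely as a reading aid, this is the kind of aggregation involved; the task names and values below are placeholders, not results from this repo.

```python
import numpy as np

# Illustrative only: aggregate per-task success rates into a "mean ± std" figure.
per_task_success = {          # placeholder values, not actual results
    "button-press-v2": 0.90,
    "door-open-v2": 0.85,
    "drawer-open-v2": 0.78,
}
rates = np.array(list(per_task_success.values()))
print(f"{100 * rates.mean():.1f}% ± {100 * rates.std():.1f}%")
```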

Task-wise comparison

(Figure: multi-env)

What is provided in the repo:

  • Modified Metaworld environment: the environment holds 3 tasks at a time; this applies only to the rendered visual observation, while the vector observation still describes a single task. To try it, run: python test_single.py

  • Training script for Stable-Baselines3 SAC:
    The train_sac_on.py script trains a SAC agent on a single task (see the minimal Stable-Baselines3 sketch after this list). Usage:

    python train_sac_on.py <task_name> <task_pos>

    • task_name, e.g. button-press-v2
    • task_pos, e.g. 0
    • task_pos can be 0, 1, 2, or 3, corresponding to right, middle, left, and Mix, i.e. where the target task is placed on the table.
    • the training configs can be found under configs/sac_configs/ as per-task .json files; note: if no config exists for the given task name, default.json will be used.
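
For orientation, the core Stable-Baselines3 call that such a script builds on looks roughly like the sketch below. The environment construction and hyperparameters here are placeholders; the real values come from the task/position arguments and the JSON configs described above, and depending on the installed Meta-World and gym/gymnasium versions a compatibility wrapper around the environment may be needed.

```python
import random
import metaworld
from stable_baselines3 import SAC

# Sketch only: train SAC on a single Meta-World task with Stable-Baselines3.
# Hyperparameters are placeholders; the actual script reads them from configs/sac_configs/.
ml1 = metaworld.ML1("button-press-v2")
env = ml1.train_classes["button-press-v2"]()
env.set_task(random.choice(ml1.train_tasks))

model = SAC("MlpPolicy", env, learning_rate=3e-4, buffer_size=1_000_000, verbose=1)
model.learn(total_timesteps=1_000_000)
model.save("sac_button-press-v2")   # checkpoint later used for data generation
```
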
  • dataset generation: after training the SAC agents, you can use them to generate a dataset with the generate_data.py script (a rollout sketch follows this item); the script reads its arguments from train_utils/args.py and uses:

    1. configs/general_model_configs/agents_dict.json to configure the best agent for every task
    2. to generate the data, run: sh experiments/generate_dataset.sh
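
Conceptually, data generation amounts to rolling out the trained SAC agents and recording what the downstream models consume: rendered images, state observations, actions, rewards, and success flags. The sketch below shows that idea for a single episode; it is not the repo's generate_data.py, and the checkpoint name, render call, and output format are assumptions.

```python
import random
import numpy as np
import metaworld
from stable_baselines3 import SAC

# Sketch of one data-collection episode (not the repo's generate_data.py).
ml1 = metaworld.ML1("button-press-v2")
env = ml1.train_classes["button-press-v2"]()
env.set_task(random.choice(ml1.train_tasks))

model = SAC.load("sac_button-press-v2")           # e.g. the agent selected via agents_dict.json

episode = {"images": [], "observations": [], "actions": [], "rewards": [], "successes": []}
obs = env.reset()
for _ in range(env.max_path_length):              # Meta-World's per-episode horizon
    action, _ = model.predict(obs, deterministic=True)
    next_obs, reward, done, info = env.step(action)
    episode["images"].append(env.render(offscreen=True))   # assumption: offscreen RGB rendering
    episode["observations"].append(obs)
    episode["actions"].append(action)
    episode["rewards"].append(reward)
    episode["successes"].append(info.get("success", 0.0))
    obs = next_obs
    if done:
        break

np.savez("episode_000.npz", **{k: np.asarray(v) for k, v in episode.items()})
```
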
  • general model training and RL fine-tuning: the experiments directory contains example scripts for training and evaluating the different models (a generic FiLM sketch follows this list), for example:

    • train_base.sh for training the base model (CLIP + linear head)
    • train_film.sh for training the FiLM model (CLIP + FiLM layers + linear head)
    • train_dt.sh for training the decision-transformer model (CLIP + FiLM layers + decision transformer)
    • finetune_dt.sh for fine-tuning the decision-transformer model (CLIP + FiLM layers + decision transformer + DT LoRA layers)
    • train_dt_obs.sh for training the decision transformer using only the vector observation, without images
    • finetune_base_rl.sh for fine-tuning a pretrained model with PPO using the Stable-Baselines3 framework (we have not obtained promising results with this yet)
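
Several of these scripts revolve around FiLM conditioning, so here is a generic PyTorch sketch of a FiLM layer: an instruction embedding (for example from CLIP's text encoder) predicts per-channel scale and shift parameters that modulate the visual features. Dimensions and placement are illustrative assumptions, not taken from this repo's models.

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Feature-wise Linear Modulation: scale and shift visual features
    with parameters predicted from a conditioning (text) embedding."""

    def __init__(self, text_dim: int, num_channels: int):
        super().__init__()
        # One linear layer predicts gamma (scale) and beta (shift) per channel.
        self.to_gamma_beta = nn.Linear(text_dim, 2 * num_channels)

    def forward(self, visual_feats: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        # visual_feats: (batch, channels, H, W); text_emb: (batch, text_dim)
        gamma, beta = self.to_gamma_beta(text_emb).chunk(2, dim=-1)
        gamma = gamma[:, :, None, None]           # broadcast over spatial dimensions
        beta = beta[:, :, None, None]
        return (1 + gamma) * visual_feats + beta  # identity-centered modulation

# Example: modulate a 256-channel feature map with a 512-d text embedding (illustrative sizes).
film = FiLM(text_dim=512, num_channels=256)
feats = torch.randn(2, 256, 7, 7)
text = torch.randn(2, 512)
out = film(feats, text)                           # shape: (2, 256, 7, 7)
```
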
  • Installation

  • References: