Release Training - Transfer grasp to stack - any toy - check_z_height - Situation Removal - EfficientNet-B0

This release records training progress for transferring "grasp any toy" weights to "stack any toy" weights with situation removal enabled, where any action that undoes past progress is given a reward of 0.
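The situation removal rule described above can be sketched as a small reward filter. This is a minimal illustration, not the repository's implementation: the function name and the `prev_progress` / `new_progress` arguments are hypothetical, and the progress measure (for example, stack height in blocks) is an assumption.

```python
def situation_removal_reward(base_reward, prev_progress, new_progress):
    """Return the reward for one action under a situation-removal rule.

    prev_progress and new_progress are hypothetical progress measures
    (for example, stack height in blocks) taken before and after the action.
    """
    if new_progress < prev_progress:
        # The action undid past progress (e.g. toppled the stack): reward 0.
        return 0.0
    # Progress was kept or increased: pass the usual reward through unchanged.
    return base_reward


# A placement that raises the stack keeps its reward.
print(situation_removal_reward(1.0, prev_progress=2, new_progress=3))
# An action that knocks a block off the stack is zeroed out.
print(situation_removal_reward(1.0, prev_progress=2, new_progress=1))
```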

This release also supports continuous z-height stacking, based on depth image data projected onto a height map. The stacking progress height is the highest value in the height map after a 5x5 median blur is applied.
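The height measurement can be illustrated with a short NumPy sketch. It is an approximation under stated assumptions, not the code in this release: the function names are hypothetical, the median blur is written out with NumPy using replicated edges rather than calling an image library, and the example height map is made up.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_blur(height_map, ksize=5):
    """Apply a ksize x ksize median filter, replicating edge values."""
    pad = ksize // 2
    padded = np.pad(height_map, pad, mode="edge")
    # One (ksize, ksize) window per pixel of the original map.
    windows = sliding_window_view(padded, (ksize, ksize))
    return np.median(windows, axis=(-2, -1))


def stack_progress_height(height_map, ksize=5):
    """Highest value of the height map after the median blur."""
    return float(median_blur(height_map, ksize).max())


# Made-up height map: a flat 7x7 surface plus a one-pixel depth spike.
height_map = np.zeros((20, 20))
height_map[5:12, 5:12] = 0.15   # e.g. the top of a stack
height_map[15, 15] = 0.90       # isolated noisy pixel

print(height_map.max())                   # raw maximum is dominated by the spike
print(stack_progress_height(height_map))  # blurred maximum ignores the spike
```

Taking the maximum after the blur, rather than the raw maximum, keeps a single noisy depth pixel from being counted as stacking progress.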

Here is a placement action which failed, but it looks good visually and is a nice example of the training process:

Training images of successful stacks:

Status printout:

                                                                                                                                                                                                                           
Training iteration: 7244                                                                                                              
Change detected: True (value: 2160)                                                                                                                       
Primitive confidence scores: 1.819177 (push), 1.674433 (grasp), 2.063278 (place)                                                                                                                                                                                      
Strategy: exploit (exploration probability: 0.117432)                                                                                 
Action: push at (8, 140, 132)                                                                                                         
Executing: push at (-0.460000, 0.056000, 0.001002)                                                                                    
Trainer.get_label_value(): Current reward: 0.337089 Future reward: 1.960318 Expected reward: 0.337089 + 0.650000 x 1.960318 = 1.611296
Training loss: 0.004335                                                                                                                                   
Experience replay 16755: history timestep index 775, action: place, surprise value: 0.590137                                                                                                                                                                          
prev_height: 0.6741789133816113 max_z: 0.6746745066568761 goal_success: False <<<<<<<<<<<                                                                                                                                                                             
check_stack() stack_height: 0.6746745066568761 stack matches current goal: False partial_stack_success: False Does the code think a reset is needed: False                                                                                                            
Push motion successful (no crash, need not move blocks): True                                                                         
STACK:  trial: 630 actions/partial: 14.261811023622048  actions/full stack: 1035.0 (lower is better)  Grasp Count: 2838, grasp success rate: 0.715292459478506 place_on_stack_rate: 0.25024630541871923 place_attempts: 2030  partial_stack_successes: 508  stack_successes: 7 trial_success_rate: 0.011111111111111112 stack goal: None
Trainer.get_label_value(): Current reward: 0.781250 Future reward: 2.106640 Expected reward: 0.781250 + 0.650000 x 2.106640 = 2.150566                                                                                                                             
Training loss: 0.002624                                                                                                                                                                                                                                               
Time elapsed: 5.915041
Trainer iteration: 7245.000000
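
The numbers in the printout above can be reproduced with a few lines of arithmetic. The sketch below recomputes the two "Expected reward" lines as current reward plus discount times future reward, and the ratios on the STACK line. The function and variable names are hypothetical, and the use of the trainer iteration count (7245) as the action count is inferred from the printed values rather than taken from the code.

```python
def expected_reward(current_reward, future_reward, discount=0.65):
    """Current reward plus the discounted future reward."""
    return current_reward + discount * future_reward


# The two Trainer.get_label_value() lines from the printout.
print(f"{expected_reward(0.337089, 1.960318):.6f}")
print(f"{expected_reward(0.781250, 2.106640):.6f}")

# Counts copied from the STACK line; 7245 is the trainer iteration (assumed action count).
actions = 7245
trials = 630
place_attempts = 2030
partial_stack_successes = 508
stack_successes = 7

stats = {
    "actions/partial": actions / partial_stack_successes,
    "actions/full stack": actions / stack_successes,
    "place_on_stack_rate": partial_stack_successes / place_attempts,
    "trial_success_rate": stack_successes / trials,
}
for name, value in stats.items():
    print(f"{name}: {value:.6f}")
```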

Training images showing a sequence of actions which lead to a successful stack:

Initial command run, with no situation removal:

export CUDA_VISIBLE_DEVICES="0" && python3 main.py --is_sim --obj_mesh_dir 'objects/toys' --num_obj 10  --push_rewards --experience_replay --explore_rate_decay --place --check_z_height --future_reward_discount 0.65 --transfer_grasp_to_place --load_snapshot --snapshot_file '/home/costar/src/costar_visual_stacking/logs/2019-08-17.20:54:32-train-grasp-place-split-efficientnet-21k-acc-0.80/models/snapshot.reinforcement.pth'

Second command run, which resumes training. It starts from the weights of a training run that did not have situation removal, while this run has situation removal enabled:


export CUDA_VISIBLE_DEVICES="0" && python3 main.py --is_sim --obj_mesh_dir 'objects/toys' --num_obj 10  --push_rewards --experience_replay --explore_rate_decay --place --check_z_height --future_reward_discount 0.65 --load_snapshot --snapshot_file '/home/costar/src/costar_visual_stacking/logs/2019-09-08.18:13:13/models/snapshot.reinforcement.pth'

Training - Transfer grasp to stack - any toy - check_z_height - Situation Removal - EfficientNet-B0 - V0.6