Training - Transfer grasp to stack - any toy - check_z_height - Situation Removal - EfficientNet-B0 - V0.6

Pre-release
@ahundt ahundt released this 10 Sep 15:08
· 90 commits to grasp_pytorch0.4+ since this release
1911fc2

This release records training progress for transferring the grasp any toy weights to the stack any toy task with situation removal enabled: any action that undoes past progress receives a reward of 0.
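The zero-reward rule can be illustrated with a minimal sketch. The function name, the `tolerance` parameter, and the use of a scalar "progress" value are all hypothetical and chosen only for illustration; the actual check in the training code may differ.

```python
def situation_removal_reward(reward, prev_progress, new_progress, tolerance=0.0):
    """Hypothetical helper: zero the reward when an action undoes progress.

    prev_progress / new_progress are any scalar measure of task progress
    (for example, the current stack height) before and after the action.
    """
    if new_progress < prev_progress - tolerance:
        # The action reversed earlier progress, so it earns nothing.
        return 0.0
    # Otherwise the reward passes through unchanged.
    return reward


# An action that knocks a 2-block stack down to 1 block gets reward 0.
toppled = situation_removal_reward(0.78, prev_progress=2, new_progress=1)
# An action that grows the stack keeps its reward.
stacked = situation_removal_reward(0.78, prev_progress=1, new_progress=2)
```

Zeroing the reward, rather than making it negative, keeps the reward scale unchanged for all other actions.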

This release also supports continuous z height stacking, based on depth image data projected onto a height map. The stacking progress height is the highest value in the height map after a 5x5 median blur is applied.
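As a rough sketch of the height measurement described here, the example below applies a 5x5 median filter to a synthetic height map and takes the maximum. It uses plain NumPy rather than whatever image library the project calls, and the function names, the edge padding, and the example map are assumptions made for illustration.

```python
import numpy as np


def median_blur(height_map, kernel_size=5):
    """Median-filter a 2D height map with a square window (edge-padded)."""
    pad = kernel_size // 2
    padded = np.pad(height_map, pad, mode="edge")
    # Shape (H, W, k, k): one k x k window centered on every pixel.
    windows = np.lib.stride_tricks.sliding_window_view(
        padded, (kernel_size, kernel_size)
    )
    return np.median(windows, axis=(-2, -1))


def stack_height(height_map, kernel_size=5):
    """Highest value of the blurred height map, used as the progress height."""
    return float(median_blur(height_map, kernel_size).max())


# Synthetic height map in meters: a 7x7 block top plus one noisy pixel.
heightmap = np.zeros((20, 20))
heightmap[5:12, 5:12] = 0.10   # flat top of a block
heightmap[15, 15] = 0.50       # single-pixel depth noise spike

measured = stack_height(heightmap)
```

The median blur is what keeps the isolated spike from being reported as the stack height; taking the raw maximum would return the noise value instead of the block top.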

Here is a placement action which failed, but it looks good visually and is a nice example of the training process:
Screenshot from 2019-09-09 20-01-15

Training images of successful stacks:
000777 1 color
000529 1 color
000228 1 color
000463 1 color

Status printout:

                                                                                                                                                                                                                           
Training iteration: 7244                                                                                                              
Change detected: True (value: 2160)                                                                                                                       
Primitive confidence scores: 1.819177 (push), 1.674433 (grasp), 2.063278 (place)                                                                                                                                                                                      
Strategy: exploit (exploration probability: 0.117432)                                                                                 
Action: push at (8, 140, 132)                                                                                                         
Executing: push at (-0.460000, 0.056000, 0.001002)                                                                                    
Trainer.get_label_value(): Current reward: 0.337089 Future reward: 1.960318 Expected reward: 0.337089 + 0.650000 x 1.960318 = 1.611296
Training loss: 0.004335                                                                                                                                   
Experience replay 16755: history timestep index 775, action: place, surprise value: 0.590137                                                                                                                                                                          
prev_height: 0.6741789133816113 max_z: 0.6746745066568761 goal_success: False <<<<<<<<<<<                                                                                                                                                                             
check_stack() stack_height: 0.6746745066568761 stack matches current goal: False partial_stack_success: False Does the code think a reset is needed: False                                                                                                            
Push motion successful (no crash, need not move blocks): True                                                                         
STACK:  trial: 630 actions/partial: 14.261811023622048  actions/full stack: 1035.0 (lower is better)  Grasp Count: 2838, grasp success rate: 0.715292459478506 place_on_stack_rate: 0.25024630541871923 place_attempts: 2030  partial_stack_successes: 508  stack_successes: 7 trial_success_rate: 0.011111111111111112 stack goal: None
Trainer.get_label_value(): Current reward: 0.781250 Future reward: 2.106640 Expected reward: 0.781250 + 0.650000 x 2.106640 = 2.150566                                                                                                                             
Training loss: 0.002624                                                                                                                                                                                                                                               
Time elapsed: 5.915041
Trainer iteration: 7245.000000
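The arithmetic in the printout can be checked with a short script. The expected reward formula is taken directly from the `Trainer.get_label_value()` lines; the way the STACK ratios are derived from the counters is inferred from the numbers matching and is an assumption, not something read from the source code.

```python
# Values copied from the status printout above.
FUTURE_REWARD_DISCOUNT = 0.65


def expected_reward(current_reward, future_reward, discount=FUTURE_REWARD_DISCOUNT):
    """Expected reward = current reward + discount x future reward."""
    return current_reward + discount * future_reward


first = expected_reward(0.337089, 1.960318)    # printout reports 1.611296
second = expected_reward(0.781250, 2.106640)   # printout reports 2.150566

# Counters from the STACK line and the final trainer iteration.
trainer_iteration = 7245
trials = 630
place_attempts = 2030
partial_stack_successes = 508
stack_successes = 7

# Assumed derivations of the ratios shown on the STACK line.
place_on_stack_rate = partial_stack_successes / place_attempts
actions_per_partial = trainer_iteration / partial_stack_successes
actions_per_full_stack = trainer_iteration / stack_successes
trial_success_rate = stack_successes / trials

print(f"expected rewards: {first:.6f}, {second:.6f}")
print(f"place_on_stack_rate: {place_on_stack_rate}")
print(f"actions/partial: {actions_per_partial}")
print(f"actions/full stack: {actions_per_full_stack}")
print(f"trial_success_rate: {trial_success_rate}")
```

Under these assumptions, all four ratios agree with the values in the printout, which suggests the action counts are measured in trainer iterations.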

Training images showing a sequence of actions which lead to a successful stack:
000223 0 color
000224 0 color
000225 0 color
000226 0 color
000227 0 color
000228 0 color
000228 1 color

Initial command run, with no situation removal:

export CUDA_VISIBLE_DEVICES="0" && python3 main.py --is_sim --obj_mesh_dir 'objects/toys' --num_obj 10  --push_rewards --experience_replay --explore_rate_decay --place --check_z_height --future_reward_discount 0.65 --transfer_grasp_to_place --load_snapshot --snapshot_file '/home/costar/src/costar_visual_stacking/logs/2019-08-17.20:54:32-train-grasp-place-split-efficientnet-21k-acc-0.80/models/snapshot.reinforcement.pth' 

Second command run, a resume: this run had situation removal enabled, but it started from the weights of a training run that did not:


export CUDA_VISIBLE_DEVICES="0" && python3 main.py --is_sim --obj_mesh_dir 'objects/toys' --num_obj 10  --push_rewards --experience_replay --explore_rate_decay --place --check_z_height --future_reward_discount 0.65 --load_snapshot --snapshot_file '/home/costar/src/costar_visual_stacking/logs/2019-09-08.18:13:13/models/snapshot.reinforcement.pth'