Create script policy to automatically generate demos for LiftCube Env #24
Conversation
Thanks very much @Ke-Wang1017 for your help in improving these parts (IK and robot modelization) :) Maybe we should see how this can be propagated to the other envs. For info, we apply the one-file philosophy, which explains the code redundancy in some places.
```python
ee_to_cube = np.linalg.norm(ee_pos - cube_pos)
print(f"Cube position: {cube_pos}, EE position: {ee_pos}, Distance: {ee_to_cube}")
```
Drop the print (prefer logging outside the environment instead)
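A minimal sketch of the suggested pattern (the `rollout_step` helper and the info-dict keys are hypothetical, not the env's actual API): the environment exposes the positions, and the calling script decides whether and how to log the distance.

```python
import logging

import numpy as np

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("lift_cube_demo")

def rollout_step(step_info):
    # The env reports positions through the info dict instead of printing;
    # logging (and its verbosity) is controlled by the caller.
    cube_pos = step_info["cube_pos"]
    ee_pos = step_info["ee_pos"]
    ee_to_cube = np.linalg.norm(ee_pos - cube_pos)
    logger.debug("cube=%s ee=%s dist=%.4f", cube_pos, ee_pos, ee_to_cube)
    return ee_to_cube

info = {"cube_pos": np.array([0.1, 0.1, 0.015]),
        "ee_pos": np.array([0.1, 0.1, 0.115])}
dist = rollout_step(info)
```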
sure
```diff
@@ -238,14 +265,21 @@ def step(self, action):
     cube_pos = self.data.qpos[:3]
     cube_z = cube_pos[2]
     ee_id = self.model.body("moving_side").id
-    ee_pos = self.data.geom_xpos[ee_id]
+    ee_pos = self.data.xpos[ee_id]
```
Can you elaborate on this? I find it very hard to understand what the difference is.
Yes! I discovered that the previous lookup does not actually return the updated end-effector position, while the new one does.
```diff
@@ -221,6 +246,8 @@ def reset(self, seed=None, options=None):
     cube_rot = np.array([1.0, 0.0, 0.0, 0.0])
     robot_qpos = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
     self.data.qpos[:] = np.concatenate([cube_pos, cube_rot, robot_qpos])
+    self.done = False
```
For lift you don't want to terminate the episode early, otherwise the robot will just find a way to throw the cube. Not ending the episode early forces the agent to learn how to stabilise the cube.
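A minimal sketch of this idea, with assumed names: `step()` always returns `terminated=False`, so the episode only ends via truncation from an external `TimeLimit`-style wrapper.

```python
def step_reward_only(reward, observation, info):
    # Never terminate early: the agent must keep the cube lifted until the
    # time limit, rather than ending the episode by throwing the cube.
    terminated = False
    truncated = False  # left to a TimeLimit wrapper outside the env
    return observation, reward, terminated, truncated, info

obs, r, term, trunc, info = step_reward_only(1.0, [0.0], {})
```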
That is a good catch, I will modify it.
```diff
@@ -221,6 +246,8 @@ def reset(self, seed=None, options=None):
     cube_rot = np.array([1.0, 0.0, 0.0, 0.0])
     robot_qpos = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
     self.data.qpos[:] = np.concatenate([cube_pos, cube_rot, robot_qpos])
+    self.done = False
+    self.success = False
```
Use `is_success` instead (standard). Btw, I don't think it's necessary to make it an attribute; just return it as part of the info dict.
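A hedged sketch of this suggestion (the reward and observation here are placeholders, not the env's actual computation): success is reported through the info dict under the standard `is_success` key rather than stored as an attribute.

```python
import numpy as np

def step_info(cube_z, threshold_height=0.5):
    # Hypothetical helper: success flows through the info dict, so no
    # self.success / self.done attributes are needed on the env.
    is_success = bool(cube_z > threshold_height)
    reward = float(cube_z)  # placeholder reward, assumed
    observation = np.array([cube_z])
    return observation, reward, False, False, {"is_success": is_success}

_, _, _, _, info = step_info(0.6)
```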
```python
return observation, reward, False, False, {}
# reward = reward_height + reward_distance
reward = reward_distance
if ee_to_cube < 0.049:
```
I don't understand that. Why would proximity to the cube imply success? And where does 0.049 come from?
Just to add some context: I tried a full grasp, but the issue is that the gripper mesh is too complex and the grasp always fails. So I use a heuristic and say that if the gripper is very close to the cube, the trial counts as a success. I know RL will somehow exploit this, but for now I just want to do imitation learning with diffusion/ACT, so it should not be a problem.
Ok, for the mesh we need to work on this. But the goal here is to lift the cube, not to reach it, right?
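One way to make success depend on lifting rather than proximity, as a sketch with assumed numbers (rest height 0.015 m, matching the z component of `cube_low`, plus an illustrative 4 cm lift margin):

```python
def lift_success(cube_z, rest_z=0.015, lift_margin=0.04):
    # Success only once the cube is actually lifted above its resting height,
    # so grazing the cube with the gripper does not count.
    return bool(cube_z > rest_z + lift_margin)
```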
```python
self.threshold_height = 0.5
self.cube_low = np.array([-0.15, 0.10, 0.015])
self.cube_high = np.array([0.15, 0.25, 0.015])
self.target_low = np.array([-3.14159, -1.5708, -1.48353, -1.91986, -2.96706, -1.74533])
self.target_high = np.array([3.14159, 1.22173, 1.74533, 1.91986, 2.96706, 0.0523599])
self.q0 = (self.target_high + self.target_low) / 2  # home position
```
No, it's not the home (or neutral) position. The neutral position is 0 in joint space, so it corresponds to `-(target_high + target_low) / (target_high - target_low)` in action space.
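To illustrate, assuming the usual linear mapping from actions in [-1, 1] to the joint limits (the mapping itself is an assumption here, not taken from the code): the action that lands on 0 in joint space is exactly `-(high + low) / (high - low)`, not the midpoint of the limits.

```python
import numpy as np

target_low = np.array([-3.14159, -1.5708, -1.48353, -1.91986, -2.96706, -1.74533])
target_high = np.array([3.14159, 1.22173, 1.74533, 1.91986, 2.96706, 0.0523599])

def action_to_joint(a):
    # Assumed linear mapping: a = -1 -> target_low, a = +1 -> target_high
    return target_low + (a + 1.0) / 2.0 * (target_high - target_low)

# Neutral joint position (all zeros) expressed in action space
neutral_action = -(target_high + target_low) / (target_high - target_low)
```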
This is used in 'ee' mode when computing the inverse kinematics, so that the solution stays close to the home position. Maybe it would be a good idea to put it inside the inverse kinematics function?
```diff
@@ -38,7 +38,7 @@
 <body name="cube" pos="0.1 0.1 0.01">
   <freejoint name="cube"/>
   <inertial pos="0 0 0" mass="0.1" diaginertia="0.00001125 0.00001125 0.00001125"/>
-  <geom friction="0.5" condim="3" pos="0 0 0" size="0.015 0.015 0.015" type="box" name="cube" rgba="0.5 0 0 1" priority="1"/>
+  <geom friction="2.0 0.8 0.8" condim="4" pos="0 0 0" size="0.01 0.01 0.01" type="box" name="cube" rgba="0.5 0 0 1" priority="1"/>
```
Can you motivate this?
I increased the friction of the cube to make the grasp more stable, but for now it still does not work.
Why do you need 3 components in friction? Is the friction different depending on the direction?
And condim=4 is for adding rotational friction, right?
Why did you shrink the cube?
Hi, sorry, I did some trials and forgot to revert them. I find a smaller cube easier to grasp. Of all these changes, I think only condim=4 looks useful. Thanks for pointing it out.
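If only the contact-dimensionality change is kept, the geom could revert to its original size and sliding friction while adding torsional friction. In MJCF, `friction` is ordered `sliding torsional rolling`, and `condim="4"` enables the torsional component. A sketch, with illustrative (not tuned) torsional and rolling coefficients:

```xml
<!-- Keep the original 1.5 cm cube and sliding friction; condim="4" adds
     torsional friction. friction = "sliding torsional rolling". -->
<geom friction="0.5 0.01 0.0001" condim="4" pos="0 0 0"
      size="0.015 0.015 0.015" type="box" name="cube"
      rgba="0.5 0 0 1" priority="1"/>
```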
Thanks for your contribution @Ke-Wang1017. I've made a few remarks; it would help a lot if you have some time to review them.
I have updated the inverse kinematics of LiftCube and created a script in examples/script_policy.py to generate an automated policy for demos. I also tuned the XML to make the grasp more stable. However, the grasp is still shaky, so for now I mark a trial as a success if the gripper is very close to the cube.