Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create script policy to automatic generate demos for LiftCube Env #24

Merged
merged 2 commits into from
Jun 18, 2024

Conversation

Ke-Wang1017
Copy link
Contributor

I have updated the inverse kinematics of LiftCube and created a script in examples/script_policy.py to generate automated policy for demos. Also tuned the XML to make the grasp more stable. However the grasp is still shaky and for now I will mark the trial success if the gripper is very close to the cube.

@perezjln
Copy link
Owner

Thanks very much @Ke-Wang1017 for your help in improving this parts (invk and robot modelization) :) maybe we should see how one can propagate to other envs. For info, we apply the one file philosophy which explain the presence of code redundancy at some places.

@perezjln perezjln merged commit a81f2d5 into perezjln:main Jun 18, 2024
0 of 4 checks passed
ee_to_cube = np.linalg.norm(ee_pos - cube_pos)
print(f"Cube position: {cube_pos}, EE position: {ee_pos}, Distance: {ee_to_cube}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop the print (prefer logging outside the environment instead)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

@@ -238,14 +265,21 @@ def step(self, action):
cube_pos = self.data.qpos[:3]
cube_z = cube_pos[2]
ee_id = self.model.body("moving_side").id
ee_pos = self.data.geom_xpos[ee_id]
ee_pos = self.data.xpos[ee_id]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on this? I find it very hard to understand what the difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I discovered that the previous function does not actually update the end-effector position, and the new one actually updates

@@ -221,6 +246,8 @@ def reset(self, seed=None, options=None):
cube_rot = np.array([1.0, 0.0, 0.0, 0.0])
robot_qpos = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
self.data.qpos[:] = np.concatenate([cube_pos, cube_rot, robot_qpos])
self.done = False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For lift you don't want to terminate the episode early, otherwise the robot will just find a way to throw the cube. By not finishing the early episode, it forces the agent to learn how to stabilise the cube.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good catch, will modify it

@@ -221,6 +246,8 @@ def reset(self, seed=None, options=None):
cube_rot = np.array([1.0, 0.0, 0.0, 0.0])
robot_qpos = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
self.data.qpos[:] = np.concatenate([cube_pos, cube_rot, robot_qpos])
self.done = False
self.success = False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use is_success instead (standard).
btw I don't think it's necessary to make it an attribute. Just return it as part of the info dict.

return observation, reward, False, False, {}
# reward = reward_height + reward_distance
reward = reward_distance
if ee_to_cube < 0.049:
Copy link
Collaborator

@qgallouedec qgallouedec Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand that. Why would proximity to the cube implies success? And where does 0.049 comes from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just add some context, I tried full grasp but the issue is that the mesh of gripper is too complex and the grasp always fails. Therefore I have a heuristic value and says if the gripper is very close to the cube, it can be said success. I know RL will somehow exploit this, for now I just want to do imitation learning with diffusion/ACT, so it should not be a problem.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok for the mesh we need to work on this. But the goal here is to lift the cube, not to reach it, right?

self.threshold_height = 0.5
self.cube_low = np.array([-0.15, 0.10, 0.015])
self.cube_high = np.array([0.15, 0.25, 0.015])
self.target_low = np.array([-3.14159, -1.5708, -1.48353, -1.91986, -2.96706, -1.74533])
self.target_high = np.array([3.14159, 1.22173, 1.74533, 1.91986, 2.96706, 0.0523599])
self.q0 = (self.target_high + self.target_low) / 2 # home position
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's not the home (or neutral) position. The neutral position is 0 in joint space, so it corresponds to -(target_high + target_low) / (target_high - target_low) in action space.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used in 'ee' mode for computing the inverse kinematics, so that the solution is close to the home position. maybe it is a good idea to put it inside the inverse kinematics function?

@@ -38,7 +38,7 @@
<body name="cube" pos="0.1 0.1 0.01">
<freejoint name="cube"/>
<inertial pos="0 0 0" mass="0.1" diaginertia="0.00001125 0.00001125 0.00001125"/>
<geom friction="0.5" condim="3" pos="0 0 0" size="0.015 0.015 0.015" type="box" name="cube" rgba="0.5 0 0 1" priority="1"/>
<geom friction="2.0 0.8 0.8" condim="4" pos="0 0 0" size="0.01 0.01 0.01" type="box" name="cube" rgba="0.5 0 0 1" priority="1"/>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you motivate this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Increase the friction of the cube to make the grasp more stable, but for now it still does not work

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need 3 components in friction? The friction is different depending on the direction?
And condim=4 is for adding rotational friction, right?
Why did you shrink the cube?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi sorry I did some trials and forget to recover, for a smaller size of cube I find is easier to grasp, within all changes I think only condim=4 looks useful. Thanks for pointing out

@qgallouedec
Copy link
Collaborator

Thanks for you contribution @Ke-Wang1017 I've made a few remarks, I would help a lot if you've some time to review them

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants