Create script policy to automatically generate demos for LiftCube Env #24
Conversation
Thanks very much @Ke-Wang1017 for your help in improving these parts (IK and robot modelization) :) Maybe we should see how this can be propagated to the other envs. For info, we apply the one-file philosophy, which explains the code redundancy in some places.
```python
ee_to_cube = np.linalg.norm(ee_pos - cube_pos)
print(f"Cube position: {cube_pos}, EE position: {ee_pos}, Distance: {ee_to_cube}")
```
Drop the print (prefer logging outside the environment instead)
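A minimal sketch of the suggested pattern (the `rollout_step` helper and the info-dict keys are hypothetical, not the env's actual API): the environment exposes the positions, and the calling script decides whether and how to log the distance.

```python
import logging

import numpy as np

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("lift_cube_demo")

def rollout_step(step_info):
    # The env reports positions through the info dict instead of printing;
    # logging (and its verbosity) is controlled by the caller.
    cube_pos = step_info["cube_pos"]
    ee_pos = step_info["ee_pos"]
    ee_to_cube = np.linalg.norm(ee_pos - cube_pos)
    logger.debug("cube=%s ee=%s dist=%.4f", cube_pos, ee_pos, ee_to_cube)
    return ee_to_cube

info = {"cube_pos": np.array([0.1, 0.1, 0.015]),
        "ee_pos": np.array([0.1, 0.1, 0.115])}
dist = rollout_step(info)
```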
sure
```diff
@@ -238,14 +265,21 @@ def step(self, action):
     cube_pos = self.data.qpos[:3]
     cube_z = cube_pos[2]
     ee_id = self.model.body("moving_side").id
-    ee_pos = self.data.geom_xpos[ee_id]
+    ee_pos = self.data.xpos[ee_id]
```
Can you elaborate on this? I find it very hard to understand what the difference is.
Yes! I discovered that the previous lookup does not actually return the updated end-effector position, while the new one does.
```diff
@@ -221,6 +246,8 @@ def reset(self, seed=None, options=None):
     cube_rot = np.array([1.0, 0.0, 0.0, 0.0])
     robot_qpos = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
     self.data.qpos[:] = np.concatenate([cube_pos, cube_rot, robot_qpos])
+    self.done = False
```
For lift you don't want to terminate the episode early, otherwise the robot will just find a way to throw the cube. Not ending the episode early forces the agent to learn how to stabilise the cube.
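A minimal sketch of this idea, with assumed names: `step()` always returns `terminated=False`, so the episode only ends via truncation from an external `TimeLimit`-style wrapper.

```python
def step_reward_only(reward, observation, info):
    # Never terminate early: the agent must keep the cube lifted until the
    # time limit, rather than ending the episode by throwing the cube.
    terminated = False
    truncated = False  # left to a TimeLimit wrapper outside the env
    return observation, reward, terminated, truncated, info

obs, r, term, trunc, info = step_reward_only(1.0, [0.0], {})
```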
That is a good catch, I will modify it.
```diff
@@ -221,6 +246,8 @@ def reset(self, seed=None, options=None):
     cube_rot = np.array([1.0, 0.0, 0.0, 0.0])
     robot_qpos = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
     self.data.qpos[:] = np.concatenate([cube_pos, cube_rot, robot_qpos])
+    self.done = False
+    self.success = False
```
Use `is_success` instead (standard). Btw, I don't think it's necessary to make it an attribute; just return it as part of the info dict.
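A hedged sketch of this suggestion (the reward and observation here are placeholders, not the env's actual computation): success is reported through the info dict under the standard `is_success` key rather than stored as an attribute.

```python
import numpy as np

def step_info(cube_z, threshold_height=0.5):
    # Hypothetical helper: success flows through the info dict, so no
    # self.success / self.done attributes are needed on the env.
    is_success = bool(cube_z > threshold_height)
    reward = float(cube_z)  # placeholder reward, assumed
    observation = np.array([cube_z])
    return observation, reward, False, False, {"is_success": is_success}

_, _, _, _, info = step_info(0.6)
```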
```python
return observation, reward, False, False, {}
# reward = reward_height + reward_distance
reward = reward_distance
if ee_to_cube < 0.049:
```
I don't understand that. Why would proximity to the cube imply success? And where does 0.049 come from?
Just to add some context: I tried a full grasp, but the issue is that the gripper mesh is too complex and the grasp always fails. So I use a heuristic and say that if the gripper is very close to the cube, the trial counts as a success. I know RL will somehow exploit this, but for now I just want to do imitation learning with diffusion/ACT, so it should not be a problem.
Ok, for the mesh we need to work on this. But the goal here is to lift the cube, not to reach it, right?
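One way to make success depend on lifting rather than proximity, as a sketch with assumed numbers (rest height 0.015 m, matching the z component of `cube_low`, plus an illustrative 4 cm lift margin):

```python
def lift_success(cube_z, rest_z=0.015, lift_margin=0.04):
    # Success only once the cube is actually lifted above its resting height,
    # so grazing the cube with the gripper does not count.
    return bool(cube_z > rest_z + lift_margin)
```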
```python
self.threshold_height = 0.5
self.cube_low = np.array([-0.15, 0.10, 0.015])
self.cube_high = np.array([0.15, 0.25, 0.015])
self.target_low = np.array([-3.14159, -1.5708, -1.48353, -1.91986, -2.96706, -1.74533])
self.target_high = np.array([3.14159, 1.22173, 1.74533, 1.91986, 2.96706, 0.0523599])
self.q0 = (self.target_high + self.target_low) / 2  # home position
```
No, it's not the home (or neutral) position. The neutral position is 0 in joint space, so it corresponds to `-(target_high + target_low) / (target_high - target_low)` in action space.
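To illustrate, assuming the usual linear mapping from actions in [-1, 1] to the joint limits (the mapping itself is an assumption here, not taken from the code): the action that lands on 0 in joint space is exactly `-(high + low) / (high - low)`, not the midpoint of the limits.

```python
import numpy as np

target_low = np.array([-3.14159, -1.5708, -1.48353, -1.91986, -2.96706, -1.74533])
target_high = np.array([3.14159, 1.22173, 1.74533, 1.91986, 2.96706, 0.0523599])

def action_to_joint(a):
    # Assumed linear mapping: a = -1 -> target_low, a = +1 -> target_high
    return target_low + (a + 1.0) / 2.0 * (target_high - target_low)

# Neutral joint position (all zeros) expressed in action space
neutral_action = -(target_high + target_low) / (target_high - target_low)
```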
This is used in 'ee' mode when computing the inverse kinematics, so that the solution stays close to the home position. Maybe it would be a good idea to put it inside the inverse kinematics function?
```diff
@@ -38,7 +38,7 @@
 <body name="cube" pos="0.1 0.1 0.01">
   <freejoint name="cube"/>
   <inertial pos="0 0 0" mass="0.1" diaginertia="0.00001125 0.00001125 0.00001125"/>
-  <geom friction="0.5" condim="3" pos="0 0 0" size="0.015 0.015 0.015" type="box" name="cube" rgba="0.5 0 0 1" priority="1"/>
+  <geom friction="2.0 0.8 0.8" condim="4" pos="0 0 0" size="0.01 0.01 0.01" type="box" name="cube" rgba="0.5 0 0 1" priority="1"/>
```
Can you motivate this?
I increased the friction of the cube to make the grasp more stable, but for now it still does not work.
Why do you need 3 components in friction? Is the friction different depending on the direction?
And condim=4 is for adding rotational friction, right?
Why did you shrink the cube?
Hi, sorry, I did some trials and forgot to revert them. I find a smaller cube easier to grasp. Of all these changes, I think only condim=4 looks useful. Thanks for pointing it out.
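If only the contact-dimensionality change is kept, the geom could revert to its original size and sliding friction while adding torsional friction. In MJCF, `friction` is ordered `sliding torsional rolling`, and `condim="4"` enables the torsional component. A sketch, with illustrative (not tuned) torsional and rolling coefficients:

```xml
<!-- Keep the original 1.5 cm cube and sliding friction; condim="4" adds
     torsional friction. friction = "sliding torsional rolling". -->
<geom friction="0.5 0.01 0.0001" condim="4" pos="0 0 0"
      size="0.015 0.015 0.015" type="box" name="cube"
      rgba="0.5 0 0 1" priority="1"/>
```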
Thanks for your contribution @Ke-Wang1017. I've made a few remarks; it would help a lot if you have some time to review them.
I have updated the inverse kinematics of LiftCube and created a script in examples/script_policy.py to generate an automated policy for demos. I also tuned the XML to make the grasp more stable. However, the grasp is still shaky, so for now I mark a trial as a success if the gripper is very close to the cube.