Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Visualize Hydroelastic and Point Contacts (continued from #25) #133

Merged
merged 91 commits into from
Apr 12, 2023

Conversation

sloretz
Copy link
Collaborator

@sloretz sloretz commented Sep 13, 2022

This brings @gbalke's PR #25 up to date with both the latest changes in Drake and Drake-ROS.

Quite a bit has changed, so I changed the demo to get contact information from a multibody plant instead of the scene graph. I added the ability to visualize point contacts (with inspiration from gazebo it uses a ball for the contact point and a line for the contact normal). I mimicked ContactResultsToLcmSystem and ConnectContactResultsToDrakeVisualizer() as ContactMarkersSystem and ConnectContactResultsToRviz().

The demo is two balls rolling down a slope. The lower ball is very soft and compliant and the upper ball is rigid. The slope is also rigid. Hydroelastic collisions are present in two places: between the compliant and rigid balls, and between the compliant ball and the slope. A point contact is present between the rigid ball and the slope.

Screenshot from 2022-09-16 14-32-53

ros2 run drake_ros_examples collisions
rviz2 -d src/drake-ros/drake_ros_examples/examples/collisions/hcollisions.rviz

Issues

  • The texture on the hydroelastic collision mesh is only visible in one direction. On the other side it's all black. I don't know why.

This change is Reviewable

@sloretz sloretz changed the title Sloretz gbalke drake ros viz hydroelastic [WIP] Visualizing Hydroelastic Collisions (continued from #25) Sep 13, 2022
@sloretz sloretz changed the title [WIP] Visualizing Hydroelastic Collisions (continued from #25) Visualizing Hydroelastic Collisions (continued from #25) Sep 16, 2022
Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+@EricCousineau-TRI for review. I think this is ready for feedback.

Reviewable status: 0 of 16 files reviewed, all discussions resolved (waiting on @EricCousineau-TRI)

@sloretz sloretz changed the title Visualizing Hydroelastic Collisions (continued from #25) Visualize Hydroelastic and Point Contacts (continued from #25) Sep 16, 2022
@EricCousineau-TRI
Copy link
Collaborator

Which OS / ROS distro should I use for this? Focal+Galactic, Jammy+Humble, or Jammy+Rolling?

Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend Humble and the latest version of Drake. Either Focal with drake-ros-underlay (It's a Humble tarbal) or Jammy with Humble from debs.

Reviewable status: 0 of 16 files reviewed, all discussions resolved (waiting on @EricCousineau-TRI)

@sloretz sloretz marked this pull request as ready for review September 16, 2022 23:24
Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 9 files at r1, 4 of 13 files at r2, 6 of 11 files at r3, 3 of 3 files at r4, all commit messages.
Reviewable status: 13 of 17 files reviewed, 3 unresolved discussions (waiting on @EricCousineau-TRI and @sloretz)

a discussion (no related file):
When running using Drake install in /opt/drake, I had to do this:

export LD_LIBRARY_PATH=/opt/drake/lib:${LD_LIBRARY_PATH}

Otherwise, I got the following error:

... libdrake.so: cannot open shared object file: No such file or directory

Is that expected?



drake_ros_examples/package.xml line 15 at r4 (raw file):

  <depend>drake_ros_core</depend>
  <depend>drake_ros_viz</depend>
  <depend>libgflags-dev</depend>

BTW Thanks!


drake_ros_examples/examples/collisions/collisions.cc line 22 at r4 (raw file):

#include <utility>

#include "drake/multibody/plant/multibody_plant.h"

BTW At some point, we should start using better header file sorting order.

@EricCousineau-TRI
Copy link
Collaborator

Noted the following on 90a770b (slightly older rev)

  • The rate of update between sphere and contact patch is different - why?
  • On restart, the spheres don't move?
  • Ignore comment about only centroids moving, didn't observe that correctly 🤦
2022-09-16-drake-ros-90a770b.mp4

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and woot on running! thanks for the help!

Reviewable status: 13 of 17 files reviewed, 5 unresolved discussions (waiting on @EricCousineau-TRI and @sloretz)

a discussion (no related file):
Why do spheres update at rate different than contact patch?


a discussion (no related file):
Why does restarting sim only reset contact patch, but not spheres?


Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 13 of 17 files reviewed, 4 unresolved discussions (waiting on @EricCousineau-TRI and @sloretz)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

When running using Drake install in /opt/drake, I had to do this:

export LD_LIBRARY_PATH=/opt/drake/lib:${LD_LIBRARY_PATH}

Otherwise, I got the following error:

... libdrake.so: cannot open shared object file: No such file or directory

Is that expected?

It sounds like this might be related to the first bullet point in #135 (comment)


Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 13 of 17 files reviewed, 4 unresolved discussions (waiting on @cottsay, @EricCousineau-TRI, and @sloretz)

a discussion (no related file):

Previously, sloretz (Shane Loretz) wrote…

It sounds like this might be related to the first bullet point in #135 (comment)

Ah yeah; was fuzzily remembering something along those lines 🤦

Is there any way we can add this to LD_LIBRARY_PATH via drake_ros_core?

I remember in catkin you could generate / install *.sh files to be incorporated into the setup.sh
Yeah, it was catkin_add_env_hooks:
https://github.com/eacousineau/amber_developer_stack/blob/068acd416b2929ffd4fa561cc100be3eef90ad44/common_scripts/CMakeLists.txt#L7

I think we could use the drake find results, configure a file, and then have that put Drake on LD_LIBRARY_PATH for CMake env?

\cc @cottsay


Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 13 of 17 files reviewed, 4 unresolved discussions (waiting on @cottsay, @EricCousineau-TRI, and @sloretz)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Ah yeah; was fuzzily remembering something along those lines 🤦

Is there any way we can add this to LD_LIBRARY_PATH via drake_ros_core?

I remember in catkin you could generate / install *.sh files to be incorporated into the setup.sh
Yeah, it was catkin_add_env_hooks:
https://github.com/eacousineau/amber_developer_stack/blob/068acd416b2929ffd4fa561cc100be3eef90ad44/common_scripts/CMakeLists.txt#L7

I think we could use the drake find results, configure a file, and then have that put Drake on LD_LIBRARY_PATH for CMake env?

\cc @cottsay

(the test in this case would be to then remove the LD_LIBRARY_PATH changes that were added into the GitHub workflows)


@cottsay
Copy link
Collaborator

cottsay commented Sep 17, 2022

I think we could use the drake find results, configure a file, and then have that put Drake on LD_LIBRARY_PATH for CMake env?

We should be able to make an ament hook for this.

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sweet! will make a quick issue and assign it to ya

Reviewable status: 13 of 17 files reviewed, 4 unresolved discussions (waiting on @cottsay, @EricCousineau-TRI, and @sloretz)

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 13 of 17 files reviewed, 4 unresolved discussions (waiting on @cottsay, @EricCousineau-TRI, and @sloretz)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

(the test in this case would be to then remove the LD_LIBRARY_PATH changes that were added into the GitHub workflows)

OK Deferred to #139.


Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 19 files reviewed, 3 unresolved discussions (waiting on @EricCousineau-TRI)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Why do spheres update at rate different than contact patch?

The trigger types were different between the ContactMarkersSystem and the SceneMarkersSystem. The former was publishing at an unlimited rate. I was seeing contacts markers published at 200-300Hz, while the visual markers were published at around 6Hz. I updated how parameters are passed to ConnectContactResultsToRviz and make the default trigger types Forced and Periodic with a default rate of 32Hz.

The system seems to run slower than realtime on my machine while the two balls are touching. I see 18Hz now for both until the top ball rolls off the compliant ball.


a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Why does restarting sim only reset contact patch, but not spheres?

This appears to be a mix of time, transforms, and "frame_locked" markers.

First the fixed frame in the RViz config is "world", meaning everything we display has to be transformed to that frame. All of the markers are set to "frame locked", which means they are expected to move with the frame they're declared in. When a marker is "frame locked" RViz will recompute the pose of the marker in the fixed frame using the "latest" transform. "Latest" here means timestamp with the highest value.

The contact patch and points are already declared to be in "world" frame, so no transform is needed. The spheres are declared to be in their own tf frames, so their transform will be calculated from the "latest" TF data.

The marker systems get the current time from Drake. Since this is a simulated system, Drake sets time to zero every time the simulation is restarted. RViz would normally reset itself when time jumps backwards, but it doesn't know simulated time is being used. Instead it transforms the spheres using the tf data with the latest timestamp, and the latest time stamp comes from the longest previous run of the simulation. The spheres are frozen in place until time advances farther than that, at which point they move as expected.

There are a few ways we could fix this. One option would be to make the Drake-ROS systems ignore Drake's time and always use the system time. New runs would always have greater timestamp values as long as the system clock didn't jump backwards. A problem with this option occurs when a simulation is running at slower than real time. Say one uses ros2 bag to record the simulation, and they want to play it back at "real time". That becomes impossible because the timestamps were set to their slower-than-realtime values.

Another option would be to make a ROSClockSystem that published Drake's time on a /clock topic so RViz could use it. This seems like the most "correct" option, but it requires remembering to launch RViz with the "use_sim_time" option. We had this before in a past prototype, but it looks like we haven't added it drake_ros_core here.

A third option might be to specify the initial time to be something other than 0. That would make new runs have newer time stamps than old runs. Unfortunately this doesn't seem to work in Drake at the moment.

I tried setting the time before the simulator's Initialize() call in the collisions example.

simulator_context.SetTime(time(nullptr));

The first step fails with this error.

terminate called after throwing an instance of 'std::runtime_error'
  what():  Error control wants to select step smaller than minimum allowed (1.66371e-05)
[ros2run]: Aborted


drake_ros_examples/examples/collisions/collisions.cc line 22 at r4 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

BTW At some point, we should start using better header file sorting order.

I think we copied this order from drake

IncludeCategories:

https://github.com/RobotLocomotion/drake/blob/35e6290f1b126a7bd19f406a2ac674745e12360c/.clang-format#L20

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 19 files reviewed, 3 unresolved discussions (waiting on @EricCousineau-TRI and @sloretz)

a discussion (no related file):

Previously, sloretz (Shane Loretz) wrote…

This appears to be a mix of time, transforms, and "frame_locked" markers.

First the fixed frame in the RViz config is "world", meaning everything we display has to be transformed to that frame. All of the markers are set to "frame locked", which means they are expected to move with the frame they're declared in. When a marker is "frame locked" RViz will recompute the pose of the marker in the fixed frame using the "latest" transform. "Latest" here means timestamp with the highest value.

The contact patch and points are already declared to be in "world" frame, so no transform is needed. The spheres are declared to be in their own tf frames, so their transform will be calculated from the "latest" TF data.

The marker systems get the current time from Drake. Since this is a simulated system, Drake sets time to zero every time the simulation is restarted. RViz would normally reset itself when time jumps backwards, but it doesn't know simulated time is being used. Instead it transforms the spheres using the tf data with the latest timestamp, and the latest time stamp comes from the longest previous run of the simulation. The spheres are frozen in place until time advances farther than that, at which point they move as expected.

There are a few ways we could fix this. One option would be to make the Drake-ROS systems ignore Drake's time and always use the system time. New runs would always have greater timestamp values as long as the system clock didn't jump backwards. A problem with this option occurs when a simulation is running at slower than real time. Say one uses ros2 bag to record the simulation, and they want to play it back at "real time". That becomes impossible because the timestamps were set to their slower-than-realtime values.

Another option would be to make a ROSClockSystem that published Drake's time on a /clock topic so RViz could use it. This seems like the most "correct" option, but it requires remembering to launch RViz with the "use_sim_time" option. We had this before in a past prototype, but it looks like we haven't added it drake_ros_core here.

A third option might be to specify the initial time to be something other than 0. That would make new runs have newer time stamps than old runs. Unfortunately this doesn't seem to work in Drake at the moment.

I tried setting the time before the simulator's Initialize() call in the collisions example.

simulator_context.SetTime(time(nullptr));

The first step fails with this error.

terminate called after throwing an instance of 'std::runtime_error'
  what():  Error control wants to select step smaller than minimum allowed (1.66371e-05)
[ros2run]: Aborted

Ah, this sounds like a systemic issue that may require some more thinking outside scope of this PR.

Can you make an issue for this, perhaps copying+pasting your above investigation + pointer to this PR + perhaps subset of video that shows restart behavior?


Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 19 files reviewed, 4 unresolved discussions (waiting on @EricCousineau-TRI and @sloretz)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Ah, this sounds like a systemic issue that may require some more thinking outside scope of this PR.

Can you make an issue for this, perhaps copying+pasting your above investigation + pointer to this PR + perhaps subset of video that shows restart behavior?

(at which point, we can resolve this discussion item and move this PR forward)



drake_ros_examples/examples/collisions/collisions.cc line 22 at r4 (raw file):

Previously, sloretz (Shane Loretz) wrote…

I think we copied this order from drake

IncludeCategories:

https://github.com/RobotLocomotion/drake/blob/35e6290f1b126a7bd19f406a2ac674745e12360c/.clang-format#L20

Sorry, didn't fully specify.

I meant that we should improve our include ordering for "party-ness", e.g. grouping of "STL, third-party non-Drake, Drake, then first-party"


drake_ros_examples/examples/collisions/collisions.cc line 22 at r5 (raw file):

#include <utility>

#include "drake/multibody/plant/multibody_plant.h"

nit Inconsistent usage of quotes vs. <> includes.

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 19 files reviewed, 8 unresolved discussions (waiting on @EricCousineau-TRI and @sloretz)


drake_ros_viz/src/contact_markers_system.cc line 237 at r5 (raw file):

    const auto& color = impl_->params.default_color;
    ball_msg.color.r = color.r();

nit Can you move color conversion to separate function?

We can then eventually incorporate this into public converters like @gbiggs's PR.


drake_ros_viz/src/contact_markers_system.cc line 272 at r5 (raw file):

    const auto& nhat_BA_W = contact_info.point_pair().nhat_BA_W;
    // C = Center of contact, l = one of the end points of the normal

I'm having trouble figuring out what this physically means, as well as how it relates to below math.

Concretely:

  • I can't tell if l is qualifying C or a distinct entitty
  • I don't know what "end point of normal" means (physically)

drake_ros_viz/src/contact_markers_system.cc line 275 at r5 (raw file):

    const auto p_Cl_W = kPointNormalLength / 2.0 * nhat_BA_W;

    normal_msg.points.resize(2);

nit resize(2) + front(), back() feels too "cute".

Can you instead make a free function for converting (w/ TODO to leverage #141), just do `.push_back(Vector3dToRosPoint(p_CStart_W


drake_ros_viz/src/contact_markers_system.cc line 276 at r5 (raw file):

    normal_msg.points.resize(2);
    normal_msg.points.front().x = p_Cl_W.x() + contact_info.contact_point().x();

We should not mix the role of conversion and math transform.

Can you make a separate var for math, e.g. `, then relegate this to purely conversion?
My guess is something like:

p_WC = contact_info.contact_point()
p_WStart = p_WC + p_CL_W
p_WEnd = p_WC - p_CL_W

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 19 files reviewed, 9 unresolved discussions (waiting on @EricCousineau-TRI and @sloretz)

a discussion (no related file):

Previously, sloretz (Shane Loretz) wrote…

The trigger types were different between the ContactMarkersSystem and the SceneMarkersSystem. The former was publishing at an unlimited rate. I was seeing contacts markers published at 200-300Hz, while the visual markers were published at around 6Hz. I updated how parameters are passed to ConnectContactResultsToRviz and make the default trigger types Forced and Periodic with a default rate of 32Hz.

The system seems to run slower than realtime on my machine while the two balls are touching. I see 18Hz now for both until the top ball rolls off the compliant ball.

Cool, same rate now, excellent!

Regarding update rate itself, yeah, that does seem a bit slow. Making separate discussion item, resolving this one!


a discussion (no related file):

The system seems to run slower than realtime on my machine while the two balls are touching. I see 18Hz now for both until the top ball rolls off the compliant ball.

It'd be nice to know if that's intrinsic to our simulation, or the converters.
Can you add a command-line options to enable/disable both DrakeVisualizer and the ROS-specific visualization?


a discussion (no related file):

I see 18Hz now for both until the top ball rolls off the compliant ball.

Can I ask how you observed this? ros2 topic hz?


Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 19 files reviewed, 9 unresolved discussions (waiting on @EricCousineau-TRI and @sloretz)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

I see 18Hz now for both until the top ball rolls off the compliant ball.

Can I ask how you observed this? ros2 topic hz?

Correct. I compared ros2 topic hz /contacts and ros2 topic hz /scene_markers/visual



drake_ros_viz/src/contact_markers_system.cc line 272 at r5 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

I'm having trouble figuring out what this physically means, as well as how it relates to below math.

Concretely:

  • I can't tell if l is qualifying C or a distinct entitty
  • I don't know what "end point of normal" means (physically)

I'm not 100% sure normal is the right thing to call it because the visualization goes both directions from the contact point rather than one direction from a surface. It's visualizing nhat_BA_W and nhat_AB_W as a line segment with length kPointNormalLength centered at point C (I think drake calls this p_WC). l is one of the points of that line segment. I see you're using p_CL_W below.

Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 20 files reviewed, 9 unresolved discussions (waiting on @EricCousineau-TRI and @sloretz)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

The system seems to run slower than realtime on my machine while the two balls are touching. I see 18Hz now for both until the top ball rolls off the compliant ball.

It'd be nice to know if that's intrinsic to our simulation, or the converters.
Can you add a command-line options to enable/disable both DrakeVisualizer and the ROS-specific visualization?

It looks like it's the simulation. I added a command line option ros2 run drake_ros_examples collisions --use_drake_visualizer that uses DrakeVisualizer instead of RViz. Drake visualizer reports a real time factor of 0.55ish which is very close to to 18 out of 32 Hz I see with Rviz.


Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 20 files reviewed, 8 unresolved discussions (waiting on @EricCousineau-TRI and @sloretz)

a discussion (no related file):

Previously, sloretz (Shane Loretz) wrote…

It looks like it's the simulation. I added a command line option ros2 run drake_ros_examples collisions --use_drake_visualizer that uses DrakeVisualizer instead of RViz. Drake visualizer reports a real time factor of 0.55ish which is very close to to 18 out of 32 Hz I see with Rviz.

Note that DrakeVisualizer only works on Focal. It seems to have been removed upstream in Jammy.



drake_ros_examples/examples/collisions/collisions.cc line 22 at r5 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Inconsistent usage of quotes vs. <> includes.

Done.

Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 20 files reviewed, 6 unresolved discussions (waiting on @EricCousineau-TRI and @sloretz)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

(at which point, we can resolve this discussion item and move this PR forward)

Opened issue with video in #144



drake_ros_viz/src/contact_markers_system.cc line 237 at r5 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Can you move color conversion to separate function?

We can then eventually incorporate this into public converters like @gbiggs's PR.

Done.


drake_ros_viz/src/contact_markers_system.cc line 272 at r5 (raw file):

Previously, sloretz (Shane Loretz) wrote…

I'm not 100% sure normal is the right thing to call it because the visualization goes both directions from the contact point rather than one direction from a surface. It's visualizing nhat_BA_W and nhat_AB_W as a line segment with length kPointNormalLength centered at point C (I think drake calls this p_WC). l is one of the points of that line segment. I see you're using p_CL_W below.

Updated comments and code.


drake_ros_viz/src/contact_markers_system.cc line 276 at r5 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

We should not mix the role of conversion and math transform.

Can you make a separate var for math, e.g. `, then relegate this to purely conversion?
My guess is something like:

p_WC = contact_info.contact_point()
p_WStart = p_WC + p_CL_W
p_WEnd = p_WC - p_CL_W

Done.

Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requires ros/rosdistro#36474

Reviewable status: 38 of 42 files reviewed, 2 unresolved discussions (waiting on @EricCousineau-TRI and @jwnimmer-tri)

@jwnimmer-tri
Copy link
Contributor

Sorry :(. We haven't finished vendoring VTK yet in Drake (RobotLocomotion/drake#16502). Hopefully it'll land within a month or two.

Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 33 of 46 files reviewed, 2 unresolved discussions (waiting on @EricCousineau-TRI and @jwnimmer-tri)


drake_ros_examples/examples/hydroelastic/hydroelastic.cc line 109 at r11 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

This branching should ideally not pollute example code.

Can we carve this out into rmw_isolation, or at least encapsulate it somewhere else so users don't get hung up on this detail?

My main suggestion is, let's show ament index and Bazel playing well together.

Can we add functionality in drake_ros to update the ament index appropriately?
Maybe you can leverage your Bazel ament index logic used in scraping, or we can do it in code.

There's now no more branching for bazel 🎉 . I added ament_index_share_files which creates an ament index with a package resource marker, and the given files symlinked into a share directory (matching ament_index's expectation of a FHS folder structure). We no longer need to use bazel's runfiles API (as long as we're not on Windows).

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 13 of 13 files at r17, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @sloretz)


bazel_ros2_rules/ros2/ament_index.bzl line 69 at r17 (raw file):

    output_to_genfiles = True,
    provides = [AmentIndex],
)

nit There's a chance we didn't resolve the "first one isn't the one I intended in Bazel".

What happens if a user declares an ament index for two different packages in Bazel? Should we fail fast, warn, or do nothing?
Consider adding a TODO to that effect.


bazel_ros2_rules/ros2/ament_index.bzl line 83 at r17 (raw file):

    file = join(d, "short_path of file")

Note that this works only when bazel is able to create a real runfiles

nit I'm confused on this, specifically "able to create a real runfiles folder [...] where bazel creates a runfiles manifest, the ament index won't be accessible"

Can you clarify wording here?


bazel_ros2_rules/ros2/tools/dload_shim.cc line 83 at r17 (raw file):

          // Rlocation returns an empty string if the path cannot be found,
          // or contains "./", "../", etc.
          value_stream << runfiles->Rlocation(actions[i][j]) << ":";

nit Why not wrap this in a function, and fail fast if the key returns empty?


drake_ros_examples/examples/hydroelastic/BUILD.bazel line 10 at r17 (raw file):

ament_index_share_files(
    name = "hydroelastic_share",

nit Can you add a comment to clarify motivation for this mechanism? e.g.

# We use `ament_index_share_files` so we can easily write code that can run in both CMake / ament and Bazel.

drake_ros_examples/examples/hydroelastic/hydroelastic.cc line 98 at r11 (raw file):

Previously, sloretz (Shane Loretz) wrote…

I'm fine with ament generating an install tree, and I'm fine with Bazel during crazy runfiles merging, but I'm fine b/c those are at build time.

I was thinking maybe it was possible at build time. I was thinking do_dload_shim would look at all the AmentIndex providers from the :data dependencies of the target it was given, and it would construct the ament index. Currently it sets AMENT_PREFIX_PATH but expects those paths already exist. Does that change your thoughts on what path we should take?

I think that would mean declaring package names like (a), but multiple ament indexes with different resources could be merged into the same path.

Gotcha, to summarize, ament doesn't allow for shadowing? (that does seem like a good thing!)

Shadowing meaning, a package can't exist in two ament indexes? I think in the ROS world we call this overriding. We allow it. The location of an installed package is a type of resource you can query from the ament index. When you query a for "where is the package drake_ros_examples installed", the ament index code picks the first matching resource it finds. Workspaces are prepend toAMENT_PREFIX_PATH so that resources in a workspace sourced after an earlier one can override resources in an earlier sourced workspace. If we have two rules that each create an ament index with a package marker for drake_ros_examples, then there are two AMENT_PREFIX_PATHs to check, and the ament index code will say "drake_ros_examples" is installed only to the first one. I think the limitation is the ament index doesn't allow a package to be split across multiple indexes.

OK Yeah, that explanation helps!
I think there's still a chance we didn't resolve the "first one isn't the one I intended in Bazel", but will comment above to that effect.


drake_ros_examples/examples/hydroelastic/hydroelastic.cc line 109 at r11 (raw file):

Previously, sloretz (Shane Loretz) wrote…

There's now no more branching for bazel 🎉 . I added ament_index_share_files which creates an ament index with a package resource marker, and the given files symlinked into a share directory (matching ament_index's expectation of a FHS folder structure). We no longer need to use bazel's runfiles API (as long as we're not on Windows).

OK Sweet!


drake_ros/viz/contact_markers_system.cc line 106 at r17 (raw file):

}

std::vector<uint8_t> GenerateHeatmapPng() {

BTW Yeah, given this setup, I think we should eventually just specify an image on disk rather than generate it... Or we should make this more dynamic if we care about that.

Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 44 of 46 files reviewed, 2 unresolved discussions (waiting on @EricCousineau-TRI and @sloretz)


bazel_ros2_rules/ros2/ament_index.bzl line 69 at r17 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit There's a chance we didn't resolve the "first one isn't the one I intended in Bazel".

What happens if a user declares an ament index for two different packages in Bazel? Should we fail fast, warn, or do nothing?
Consider adding a TODO to that effect.

I think two different packages is a valid use case. For example, a target might need an arm model from ur_description and an end-effector model from robotiq_description. Two ament indexes specifying the same package could be problematic.

I tried it out here: sloretz@b373cf4

  • Two different packages works fine, as expected
  • The same package twice surprised me. If they have the same prefix, then the runfiles are merged. I would have thought bazel would notice and error about two different rules creating the same file (the package marker for the foo package).
  • The same package twice with a different prefix is the problematic case. Changing the order of the deps doesn't seem to change which prefix is listed first, meaning I'm not sure we have any control of what prefix is used when "overriding" or "shadowing". This would be a good case to raise an error, though it would require the dload shim to look at the files that are going to be generated in an AmentIndex, and I'm not sure how to do that. For now I left a note and a TODO in the ament_index_share_files comment about avoiding overriding in 44f8be5.

bazel_ros2_rules/ros2/ament_index.bzl line 83 at r17 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit I'm confused on this, specifically "able to create a real runfiles folder [...] where bazel creates a runfiles manifest, the ament index won't be accessible"

Can you clarify wording here?

I think this might apply only to Windows (where it's not always possible to create symlinks as a non-admin user), but I'm not 100% sure that's the only place it applies. When bazel can't create a runfiles directory it creates a *.runfiles_manifest file instead. This file says what runfiles folder structure bazel would create if it could. If bazel were to do that, thenAMENT_PREFIX_PATH we set won't exist, and so ament_index_cpp calls aren't going to find anything.

I added a little more wording to that effect in 2f7a3f9.


bazel_ros2_rules/ros2/tools/dload_shim.cc line 83 at r17 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Why not wrap this in a function, and fail fast if the key returns empty?

I'd be concerned about existing packages declaring environment variable paths that don't actually exist (like a package says it needs LD_LIBRARY_PATH set, but doesn't install any libraries) . I'll try this in a separate PR.


drake_ros_examples/examples/hydroelastic/BUILD.bazel line 10 at r17 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Can you add a comment to clarify motivation for this mechanism? e.g.

# We use `ament_index_share_files` so we can easily write code that can run in both CMake / ament and Bazel.

Added comment in e41e15d

Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 44 of 46 files reviewed, 2 unresolved discussions (waiting on @EricCousineau-TRI)


bazel_ros2_rules/ros2/tools/dload_shim.cc line 83 at r17 (raw file):

Previously, sloretz (Shane Loretz) wrote…

I'd be concerned about existing packages declaring environment variable paths that don't actually exist (like a package says it needs LD_LIBRARY_PATH set, but doesn't install any libraries) . I'll try this in a separate PR.

#265

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r18, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @sloretz)


bazel_ros2_rules/ros2/ament_index.bzl line 69 at r17 (raw file):

Previously, sloretz (Shane Loretz) wrote…

I think two different packages is a valid use case. For example, a target might need an arm model from ur_description and an end-effector model from robotiq_description. Two ament indexes specifying the same package could be problematic.

I tried it out here: sloretz@b373cf4

  • Two different packages works fine, as expected
  • The same package twice surprised me. If they have the same prefix, then the runfiles are merged. I would have thought bazel would notice and error about two different rules creating the same file (the package marker for the foo package).
  • The same package twice with a different prefix is the problematic case. Changing the order of the deps doesn't seem to change which prefix is listed first, meaning I'm not sure we have any control of what prefix is used when "overriding" or "shadowing". This would be a good case to raise an error, though it would require the dload shim to look at the files that are going to be generated in an AmentIndex, and I'm not sure how to do that. For now I left a note and a TODO in the ament_index_share_files comment about avoiding overriding in 44f8be5.

Sorry, I meant "two different instances of the same package in different Bazel packages"

I'm not surprised that Bazel allows the merging when you're in the same folder - in fact, I think that's A-OK, as they should ultimately be the same folder -- unless there's some ament manifest that conflicts -- however, Bazel should fail if you try to combine two runfiles that have the same path for unique instances of said file.

However, if you have the same package name for ament spread across two different Bazel packages, that might cause an issue.
Resolving this, but added a nit to clarify.


bazel_ros2_rules/ros2/ament_index.bzl line 83 at r17 (raw file):

Previously, sloretz (Shane Loretz) wrote…

I think this might apply only to Windows (where it's not always possible to create symlinks as a non-admin user), but I'm not 100% sure that's the only place it applies. When bazel can't create a runfiles directory it creates a *.runfiles_manifest file instead. This file says what runfiles folder structure bazel would create if it could. If bazel were to do that, thenAMENT_PREFIX_PATH we set won't exist, and so ament_index_cpp calls aren't going to find anything.

I added a little more wording to that effect in 2f7a3f9.

OK Sweet, thanks!


bazel_ros2_rules/ros2/ament_index.bzl line 87 at r18 (raw file):

platform like Windows, or with `--nobuild_runfile_links`).

Note that the same package_name should never be used in two different

nit Per above, I recommend qualifying this:

Note that the same `package_name` for ROS 2 packages should not be used
in multiple different rules spread across multiple Bazel packages. ...

@EricCousineau-TRI
Copy link
Collaborator

@sloretz Is this something we could land soon? (tomorrow, if possible?)

Copy link
Collaborator Author

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 44 of 46 files reviewed, all discussions resolved (waiting on @EricCousineau-TRI)


bazel_ros2_rules/ros2/ament_index.bzl line 87 at r18 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Per above, I recommend qualifying this:

Note that the same `package_name` for ROS 2 packages should not be used
in multiple different rules spread across multiple Bazel packages. ...

Done in ae15ae4

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r19, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @sloretz)

@EricCousineau-TRI
Copy link
Collaborator

Thanks!

@sloretz sloretz merged commit 00b0b03 into main Apr 12, 2023
@sloretz sloretz deleted the sloretz__gbalke__drake_ros_viz_hydroelastic branch April 12, 2023 22:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants