Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Iron Irwini #52

Open
5 of 8 tasks
gavanderhoorn opened this issue Jun 6, 2023 · 20 comments
Open
5 of 8 tasks

Support Iron Irwini #52

gavanderhoorn opened this issue Jun 6, 2023 · 20 comments
Labels
enhancement New feature or request help wanted micro-ROS micro-ROS related issue todo Not an issue, just a TODO
Milestone

Comments

@gavanderhoorn
Copy link
Collaborator

gavanderhoorn commented Jun 6, 2023

Tasks:

@gavanderhoorn gavanderhoorn added enhancement New feature or request help wanted micro-ROS micro-ROS related issue todo Not an issue, just a TODO labels Jun 6, 2023
@gavanderhoorn gavanderhoorn added this to the untargeted milestone Jun 6, 2023
@gavanderhoorn

This comment was marked as outdated.

@gavanderhoorn gavanderhoorn modified the milestones: untargeted, 0.2.0 Aug 18, 2023
@gavanderhoorn

This comment was marked as outdated.

@gavanderhoorn
Copy link
Collaborator Author

Something to be aware of: ros2/rmw_dds_common#68.

Almost starts to pay off to implement some sort of CI that checks message compatibility between different ROS 2 distributions.

@ted-miller
Copy link
Collaborator

Doesn't build yet. But I'm working on this:
https://github.com/Yaskawa-Global/motoros2/tree/iron_wip

@ted-miller
Copy link
Collaborator

update VS project with Iron targets for all supported controllers

This kinda sucks to do and the current method does not scale well. I'm assuming the cmake-passthrough mode of M+SDK would make compilation easier. But I'm not sure how intellisense would behave without the stuff that was manually added.

@ted-miller
Copy link
Collaborator

Very quick/preliminary testing. But I wanted to write it down before I leave.

AFAICT, libmicroros seems OK. (More testing needed)

Other things I noticed using the Iron version

  • During setup, I had to set S2C1402=3 and reboot twice. This was probably a mistake on my part, but I really thought I set it right on the first attempt.
  • I got the invalid job alarm. Then I deleted the job, rebooted, and let it regen a new job. I then got the alarm again. Something fishy here.
  • Looking at joint_states, the cross-axis-coupling compensation routine isn't perfect. I'm seeing slight fluctuations in T axis when jogging others. But this is likely due to rounding errors on our part and could probably be fixed with Use mpGetFBPulsePosEx instead of mpGetFBPulsePos #199.
  • When attempting to use any of the motoros2_interfaces messages, I get Segmentation fault (core dumped). As I'm typing this, I realize that I sourced a humble version of these messages and never rebuilt them for iron. So it's probably that.

@ted-miller
Copy link
Collaborator

I got the invalid job alarm. Then I deleted the job, rebooted, and let it regen a new job. I then got the alarm again. Something fishy here

I did it yet again, and the alarm didn't occur. So, it seems that I just failed to delete the job. But I'm confused, because I was 90% sure that I did delete the job. Perhaps I just did a cpu-reset too quickly after the deletion???


When attempting to use any of the motoros2_interfaces messages, I get Segmentation fault (core dumped). As I'm typing this, I realize that I sourced a humble version of these messages and never rebuilt them for iron. So it's probably that.

Yeah, I rebuilt the messages for iron and it worked.


I had one terminal open that was echoing robot_status and forgot to stop the echo. It was just running in the background.

I opened another terminal and started making service calls. The majority of them "worked", but returned an error message.

ros2 service call /read_group_io motoros2_interfaces/srv/ReadGroupIO "{address: 7001}"
2024-03-26 13:14:07.410 [RTPS_TRANSPORT_SHM Error] Failed init_port fastrtps_port7411: open_and_lock_file failed -> Function open_port_internal
requester: making request: motoros2_interfaces.srv.ReadGroupIO_Request(address=7001)

response:
motoros2_interfaces.srv.ReadGroupIO_Response(result_code=0, message='Success', success=True, value=192)

But if I go back to the original terminal, I can make those service calls without any issue.

It seems that fastrtps might not be threadsafe. More testing is needed to submit a ticket to eProsima.

A quick google search said that someone got a similar error with a version mismatch.

My docker container:

ros2 doctor --report | grep fastrtps
2024-03-26 13:17:01.132 [RTPS_TRANSPORT_SHM Error] Failed init_port fastrtps_port7411: open_and_lock_file failed -> Function open_port_internal
rosidl_typesupport_fastrtps_cpp           : latest=3.0.2, local=3.0.2
rmw_fastrtps_shared_cpp                   : latest=7.1.3, local=7.1.3
rosidl_dynamic_typesupport_fastrtps       : latest=0.0.2, local=0.0.2
rosidl_typesupport_fastrtps_c             : latest=3.0.2, local=3.0.2
rmw_fastrtps_cpp                          : latest=7.1.3, local=7.1.3
fastrtps_cmake_module                     : latest=3.0.2, local=3.0.2
middleware name    : rmw_fastrtps_cpp

I'm entirely not sure which of our packages we would want to compare that too. I guess it would be the Agent, since it's doing the translation between fastrtps and microxrcedds.

The Agent container:

ros2 doctor --report | grep fastrtps
/opt/ros/humble/lib/python3.10/site-packages/ros2doctor/api/__init__.py: 154: UserWarning: Fail to call QoSCompatibilityReport class functions.
rosidl_typesupport_fastrtps_cpp           : latest=2.2.2, local=2.2.2
rmw_fastrtps_shared_cpp                   : latest=6.2.6, local=6.2.5
rosidl_typesupport_fastrtps_c             : latest=2.2.2, local=2.2.2
rmw_fastrtps_cpp                          : latest=6.2.6, local=6.2.5
fastrtps_cmake_module                     : latest=2.2.2, local=2.2.2
middleware name    : rmw_fastrtps_cpp

Definitely some differences. Don't know if they're important.


I was surprised that control_msgs wasn't included by default in the Iron docker image. I feel like it was included in Humble.

I had to manually clone and build this.


After running the FJT script, I got this message in my client.

control_msgs.action.FollowJointTrajectory_Result(error_code=-500301, error_string='Final position was outside tolerance. Check robot safety-limits that could be inhibiting motion. [group_1/joint_1: 0.0000 deviation] [group_1/joint_5: 0.0000 deviation]')

This is very confusing.

Looking at the logging script, I think it might be due to the time tolerance rather than position.

[1711461114.98510480] [192.168.1.31:50724]: 2024-03-26 13:51:54.983911 Trajectory complete
[1711461114.98533487] [192.168.1.31:50724]: 2024-03-26 13:51:54.983911 FJT using DEFAULT goal time tolerance: 500000000 ns
[1711461114.98539233] [192.168.1.31:50724]: 2024-03-26 13:51:54.984111 FJT action failed
[1711461114.98543978] [192.168.1.31:50724]: 2024-03-26 13:51:54.984111 Final position was outside tolerance. Check robot safety-limits that could be inhibiting motion. [group_1/joint_1: 0.0000 deviation] [group_1/joint_5: 0.0000 deviation]

I might need to open a new issue for this.

@ted-miller
Copy link
Collaborator

Except for the misc notes above, it seems to be working. I was able to run the topics, services, and both types of motion.

@gavanderhoorn
Copy link
Collaborator Author

I got the invalid job alarm. Then I deleted the job, rebooted, and let it regen a new job. I then got the alarm again. Something fishy here

I did it yet again, and the alarm didn't occur. So, it seems that I just failed to delete the job. But I'm confused, because I was 90% sure that I did delete the job. Perhaps I just did a cpu-reset too quickly after the deletion???

I seem to remember having run into this as well, and then we added this to the Alarm: 8014[1] FAQ:

If the alarm is posted again after restarting the controller, make sure to allow sufficient time for the controller to properly delete the job (flushing changes to the file system may take some time). Allow for at least 20 seconds between deleting the job and restarting the controller.

Could be what you experienced.

When attempting to use any of the motoros2_interfaces messages, I get Segmentation fault (core dumped). As I'm typing this, I realize that I sourced a humble version of these messages and never rebuilt them for iron. So it's probably that.

Yeah, I rebuilt the messages for iron and it worked.

I would've expected that yes.

Key things have changed between Humble and Iron. Especially ros2/rmw_dds_common#68 breaks everything between those versions.

I had one terminal open that was echoing robot_status and forgot to stop the echo. It was just running in the background.

I opened another terminal and started making service calls. The majority of them "worked", but returned an error message.

[..]

A quick google search said that someone got a similar error with a version mismatch.

[..]

Definitely some differences. Don't know if they're important.

The set of packages you've used to built M+ libmicroros is old. It's from around July last year. I would suspect that to be the cause here before anything else.

It's also slightly surprising to see the SHM transport mentioned in the error message. AFAIK, that's explicitly disabled in the Agent Docker image.

I was surprised that control_msgs wasn't included by default in the Iron docker image. I feel like it was included in Humble.

I had to manually clone and build this.

Didn't apt install ros-iron-control-msgs work?

Slightly off-topic, but:

Looking at the logging script, I think it might be due to the time tolerance rather than position.

[1711461114.98510480] [192.168.1.31:50724]: 2024-03-26 13:51:54.983911 Trajectory complete
[..]

is your debug listener out-of-date? That formatting (the date duplication) was fixed a long time ago.

@ted-miller
Copy link
Collaborator

The set of packages you've used to built M+ libmicroros is old. It's from around July last year. I would suspect that to be the cause here before anything else.

That's fair.

I had to manually clone and build this.

Didn't apt install ros-iron-control-msgs work?

I didn't know that had an apt package.

is your debug listener out-of-date?

Extremely. I keep forgetting to update it.

@ted-miller
Copy link
Collaborator

The set of packages you've used to built M+ libmicroros is old. It's from around July last year. I would suspect that to be the cause here before anything else.

Do you have a trick to updating this? I'm just going through the repos list and manually checking for newer versions.
I'm assuming the following repos are applicable:

  • eProsima/Micro-CDR - No changes

  • eProsima/Micro-XRCE-DDS-Client (with private modifications) - Update needed

  • micro-ROS/rmw-microxrcedds.git - Inconsequential changes

  • micro-ROS/micro-ROS-Agent - There is a newer version of this involving "thread". But it really seems inconsequential.

@gavanderhoorn
Copy link
Collaborator Author

gavanderhoorn commented Mar 26, 2024

I have a tool which checks our .repos against upstream. But part of it is manual work, yes.

I'll take a look tomorrow.


Edit:

Inconsequential changes

I always either update everything, or nothing.

Things may look inconsequential, but specific releases have been tested against/with each other. To avoid having to vet things ourselves again at all levels, it's best to stick (or at least start) with whatever upstream has tagged as a release.

@ted-miller
Copy link
Collaborator

I pulled the latest Micro-XRCE-DDS-Client and applied the custom changes for a quick test.

I was not able to reproduce the original issue after updating. But I can't say with any certainty that the problem was fixed since I didn't reproduce the original issue multiple times. (hindsight being 20/20 and all...)

@gavanderhoorn
Copy link
Collaborator Author

gavanderhoorn commented Mar 28, 2024

Combined with the latest test build of Iron M+ libmicroros (shared with you in the other thread), main...gavanderhoorn:iron_no_remap_rules should build.


Edit: this is of course just a work-around for now.

@ted-miller
Copy link
Collaborator

@gavanderhoorn
Is Iron still of any benefit? I've asked @jimmy-mcelwain to start looking at this. But I see that microros now supports Jazzy.

@gavanderhoorn
Copy link
Collaborator Author

Yes. We should support Humble, Iron and Jazzy.

Humble and Jazzy are LTS, so will be around for quite some time.

Iron is not an LTS, but should prepare us for Jazzy.

@gavanderhoorn
Copy link
Collaborator Author

Updates to micro_ros_motoplus: Yaskawa-Global/micro_ros_motoplus#3.

@ted-miller
Copy link
Collaborator

find solution for broken CLI argument parsing (MotoROS2 does not have a CLI) due to upstream micro-ROS / rcl changes

I'm of the opinion that we just preprocessor it out for the time being. I think this issue has been sitting here too long.

update documentation:
Extracting the files

I'm not sure what this means


Jimmy and I have both tested the current state. Everything is working well (with remap rules not supported).

@jimmy-mcelwain
Copy link
Collaborator

jimmy-mcelwain commented Oct 30, 2024

I tried doing a full build (RCL_MICROROS=OFF). When I do so and load it onto the controller, it is missing the definition for the symbol strtoll. If I add a definition, then I am able to properly load it onto the controller and boot. However, as soon as I try to connect with my desktop client, it errors out at this line. The rcl_ret_t here is just RCL_RET_ERROR, nothing descriptive. I don't know what the problem is. I did notice that this is the same as #292, which. It seems to often fail on this step if there is some sort of communication error. I am certain, though, that I am not running into the same network problem as the poster there. I can keep the code exactly the same on iron wip with the exception of the definition of RCL_MICROROS_COMPLETE_IMPL, and it will succeed with the libmicroros build with RCL_MICROROS=ON and fail with the build with RCL_MICROROS=OFF. Since it only errors out upon connection with the client, I thought it could be an rmw problem, but I am having trouble figuring it out. I'm guessing that something is incomplete/wrong in my libmicroros.

@ted-miller suggested that I manually perform the remapping through the config file (change the string that is passed in when the publisher etc. is initialized), so I will probably do that sometime since I haven't been able to figure out how to remap using the intended mechanism.

@gavanderhoorn
Copy link
Collaborator Author

I'm only one voice here, but:

@ted-miller suggested that I manually perform the remapping through the config file (change the string that is passed in when the publisher etc. is initialized), so I will probably do that sometime since I haven't been able to figure out how to remap using the intended mechanism.

please don't do this.

Using parameters for topic/service/action names is a bit of an anti-pattern.

The real solution would be to address this one: ros2/rcl#998, or figure out why a full build doesn't boot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted micro-ROS micro-ROS related issue todo Not an issue, just a TODO
Projects
None yet
Development

No branches or pull requests

3 participants