Static Allocator - Aligned allocations #31

Merged
merged 5 commits into from
May 15, 2019

Conversation

emersonknapp
Contributor

@emersonknapp emersonknapp commented May 13, 2019

  • Remove unwanted main symbol
  • Centralize copy-pasted code for getting static allocator
  • Align the memory pool, and the memory returned from the static allocator, to std::max_align_t so that both are safe for optimized builds
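
To illustrate the alignment change described in the last bullet, here is a minimal sketch of a fixed-size bump allocator whose pool and returned pointers are both aligned to std::max_align_t. The names `AlignedStaticPool` and `is_max_aligned` are hypothetical and chosen for illustration only; this is not the code from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch, not the PR's implementation: a fixed-size bump
// allocator that only ever hands out max_align_t-aligned addresses.
template<std::size_t PoolSize>
class AlignedStaticPool
{
public:
  // Returns an aligned pointer, or nullptr when the pool cannot satisfy the request.
  void * allocate(std::size_t size)
  {
    constexpr std::size_t kAlign = alignof(std::max_align_t);
    if (size > PoolSize) {
      return nullptr;  // also keeps the rounding below from overflowing
    }
    // Round the request up to a multiple of kAlign so the next offset stays aligned.
    const std::size_t rounded = (size + kAlign - 1) / kAlign * kAlign;
    if (rounded > PoolSize - offset_) {
      return nullptr;
    }
    void * result = pool_ + offset_;
    offset_ += rounded;
    return result;
  }

private:
  // The pool itself must start on an aligned boundary; a plain char array
  // carries no such guarantee.
  alignas(std::max_align_t) unsigned char pool_[PoolSize];
  std::size_t offset_ = 0;
};

// True when p is a multiple of alignof(std::max_align_t).
inline bool is_max_aligned(const void * p)
{
  return reinterpret_cast<std::uintptr_t>(p) % alignof(std::max_align_t) == 0;
}
```

Both halves matter: aligning only the pool start still lets a 1-byte allocation push every later pointer off the boundary, which is the kind of address that aligned vector loads in optimized code can fault on.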

Fixes #30
Unblocks ros2/rmw_fastrtps#272

Signed-off-by: Emerson Knapp <eknapp@amazon.com>

Signed-off-by: Emerson Knapp <eknapp@amazon.com>
@emersonknapp emersonknapp changed the title Aligned memory pool allocations Static Allocator - Aligned allocations May 13, 2019
@emersonknapp
Contributor Author

emersonknapp commented May 13, 2019

@thomas-moulard Can you please run CI for this PR? Unfortunately we cannot restrict the test or build packages very much.

Gist: https://gist.githubusercontent.com/emersonknapp/c3646b3d48b6ad07b1c9144eac98dff7/raw/fdc4bae8a7a945f689c6748cf6fc66187b61df5c/ros2.repos
Additional BUILD args: --packages-up-to test_rclcpp test_communication test_cli test_cli_remapping rclpy
Additional TEST args: --packages-up-to test_rclcpp test_communication test_cli test_cli_remapping rclpy
CI Job: ci_launcher

Signed-off-by: Emerson Knapp <eknapp@amazon.com>

@thomas-moulard thomas-moulard left a comment


Code LGTM, we could clean up a few things but it's fairly minor. This PR lacks unit tests.

@thomas-moulard

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@emersonknapp
Contributor Author

Any thoughts on how to unit test this? I am not able to reproduce the observed problem in a toy example, only in the context of a specific system dynamically linking against OpenSSL, using an optimized build of glibc6.

@emersonknapp
Contributor Author

@thomas-moulard The build failed because my arguments were wrong - the test and build args should have been:

--packages-up-to test_rclcpp test_communication test_cli test_cli_remapping rclpy

Signed-off-by: Emerson Knapp <eknapp@amazon.com>
@thomas-moulard

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status
    Triggering a new build of ci_linux

@thomas-moulard

What about instantiating a static allocator in the unit test and checking the addresses are actually aligned?
Obviously a repro-case would be best, but it will be hard to write. I think we can find a trade-off there between repro-case and no tests.
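
A rough sketch of what such an alignment check could look like, written without a test framework so it stands alone. `ToyBumpAllocator` is a hypothetical stand-in for the real static allocator (whose interface is not shown in this thread), and `all_allocations_aligned` is an illustrative helper name; the `round_up` flag exists only to show that the check fails for an allocator that does not pad its allocations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the static allocator under test.
struct ToyBumpAllocator
{
  explicit ToyBumpAllocator(bool round_up)
  : round_up_(round_up) {}

  void * allocate(std::size_t size)
  {
    constexpr std::size_t kAlign = alignof(std::max_align_t);
    if (size > sizeof(pool_)) {
      return nullptr;
    }
    // With round_up_ == false, allocations are packed back to back (misaligned).
    const std::size_t step =
      round_up_ ? (size + kAlign - 1) / kAlign * kAlign : size;
    if (step > sizeof(pool_) - offset_) {
      return nullptr;
    }
    void * p = pool_ + offset_;
    offset_ += step;
    return p;
  }

  alignas(std::max_align_t) unsigned char pool_[4096];
  std::size_t offset_ = 0;
  bool round_up_;
};

// Allocates every size from 1 to max_size and reports whether each returned
// address is aligned to std::max_align_t. Exhaustion counts as a failure.
template<typename Allocator>
bool all_allocations_aligned(Allocator & allocator, std::size_t max_size)
{
  for (std::size_t size = 1; size <= max_size; ++size) {
    void * p = allocator.allocate(size);
    if (p == nullptr) {
      return false;
    }
    if (reinterpret_cast<std::uintptr_t>(p) % alignof(std::max_align_t) != 0) {
      return false;
    }
  }
  return true;
}
```

Sweeping odd sizes is the important part: a single allocation from a fresh pool is aligned even in the unpadded case, so only a sequence of allocations exposes the problem.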

@wjwwood
Member

wjwwood commented May 14, 2019

In general, looks good to me. It might fix aarch64 too, since we've had alignment issues on ARM in the past. Thanks for looking into it, I know the code is hairy :x

Signed-off-by: Emerson Knapp <eknapp@amazon.com>

@thomas-moulard thomas-moulard left a comment


Thanks for considering adding unit tests. I don't think that I need to take another pass on this so I'll LGTM it now 👍

@emersonknapp
Contributor Author

@wjwwood I've addressed the formatting concerns and added in the statics. I'm looking at test_memory_tools.cpp and trying to figure out how to add a unit test for static_allocator.hpp, which is buried in the source directory. Any ideas on that without having to restructure the project?

Other than that, it looks like the test failures on CI were test_cli.test_params_yaml timeouts, which I am getting locally with or without this change (btw, do you know if there is an open issue for that? If not, should I open it on system_tests? I think it's one of those things we keep seeing).

Signed-off-by: Emerson Knapp <eknapp@amazon.com>
@emersonknapp
Contributor Author

I've just pushed a commit that attempts to expose the src dir to the tests without restructuring things. lmk if this is ok or if it's a kludge; I wouldn't consider myself enough of a CMake expert to determine.

@jacobperron
Contributor

Other than that, it looks like the test failures on CI were test_cli.test_params_yaml timeouts, which I am getting locally with or without this change (btw, do you know if there is an open issue for that? If not, should I open it on system_tests? I think it's one of those things we keep seeing).

Here's the issue for the known flake: ros2/build_farmer#166

@dirk-thomas
Contributor

Testing rclcpp and test_rclcpp:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@emersonknapp
Contributor Author

emersonknapp commented May 15, 2019

Regarding the time source test failures above: I am able to reproduce those locally on master ros2.repos without this patch.

As for this patch, I believe the only packages it really affects are rcl and rcutils, because from my understanding those are the only packages whose tests LD_PRELOAD the memory_tools_interpose.so library. (Those are the only packages that mention the LIBRARY_PRELOAD_ENVIRONMENT_VARIABLE property exported from here.)

Locally, if I run the rcl tests without this patch on master ros2.repos, I get segfaults in the following tests:

The following tests FAILED:
          8 - test_time__rmw_fastrtps_cpp (Failed)
         10 - test_context__rmw_fastrtps_cpp (Failed)
         17 - test_init__rmw_fastrtps_cpp (Failed)
         18 - test_node__rmw_fastrtps_cpp (Failed)
         22 - test_guard_condition__rmw_fastrtps_cpp (Failed)
         32 - test_time__rmw_fastrtps_dynamic_cpp (Failed)
         34 - test_context__rmw_fastrtps_dynamic_cpp (Failed)
         41 - test_init__rmw_fastrtps_dynamic_cpp (Failed)
         42 - test_node__rmw_fastrtps_dynamic_cpp (Failed)
         46 - test_guard_condition__rmw_fastrtps_dynamic_cpp (Failed)
Errors while running CTest
--- stderr: rcl
Errors while running CTest

But with this patch, 100% of rcl tests pass.

I don't know if the buildfarm is witnessing the same segfaults; it is possible that they depend on a very subtle combination of variables. It would be nice to see the current state of master compared to this build for rcl and rcutils.

Contributor

@jacobperron jacobperron left a comment


The changes look okay to me. I'll trigger a set of builds for comparison.

@jacobperron
Contributor

jacobperron commented May 15, 2019

In addition to the current state of master, I've also triggered a build with Fast-RTPS 1.7 to see if the time source test failures are coming from something introduced in Fast-RTPS 1.8.

Linux CI

  • master: Build Status
  • this PR: Build Status
  • this PR + Fast-RTPS 1.7 Build Status

@jacobperron
Contributor

jacobperron commented May 15, 2019

I guess after ros2/rmw_fastrtps#272, we can't expect Fast-RTPS 1.7 to build.

Edit: I've re-triggered with rmw_fastrtps HEAD~1

@dirk-thomas
Contributor

The CI results are a strict improvement over the current master. Merging...

@dirk-thomas dirk-thomas merged commit afe870c into osrf:master May 15, 2019
Successfully merging this pull request may close these issues.

Segfaults in Openssl static initialization with memory interpose library
6 participants