Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure no redundant rcl_logging initialization and finalization (alternative) #573

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 20 additions & 0 deletions rcl/include/rcl/arguments.h
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,9 @@ typedef struct rcl_arguments_t
#define RCL_SHORT_REMAP_FLAG "-r"
#define RCL_LOG_LEVEL_FLAG "--log-level"
#define RCL_EXTERNAL_LOG_CONFIG_FLAG "--log-config-file"
// Prefixes used for some options
#define RCL_ENABLE_FLAG_PREFIX "--enable-"
#define RCL_DISABLE_FLAG_PREFIX "--disable-"
// To be prefixed with --enable- or --disable-
#define RCL_LOG_STDOUT_FLAG_SUFFIX "stdout-logs"
#define RCL_LOG_ROSOUT_FLAG_SUFFIX "rosout-logs"
Expand Down Expand Up @@ -395,6 +398,23 @@ rcl_ret_t
rcl_arguments_fini(
rcl_arguments_t * args);

/// Return `true` if argv contains logging arguments.
/**
* Attribute | Adherence
* ------------------ | -------------
* Allocates Memory | No
* Thread-Safe | Yes
* Uses Atomics | No
* Lock-Free | Yes
*
* \param argc
* \param argv can be `NULL`. In that case, the return value is `false`.
*/
RCL_PUBLIC
RCL_WARN_UNUSED
bool
rcl_contains_logging_arguments(int argc, char const * const * argv);

#ifdef __cplusplus
}
#endif
Expand Down
3 changes: 0 additions & 3 deletions rcl/src/rcl/arguments.c
Original file line number Diff line number Diff line change
Expand Up @@ -245,9 +245,6 @@ _rcl_parse_param_file_rule(
rcl_params_t * params,
char ** param_file);

#define RCL_ENABLE_FLAG_PREFIX "--enable-"
#define RCL_DISABLE_FLAG_PREFIX "--disable-"

/// Parse a bool argument that may or may not be for the provided key rule.
/**
* \param[in] arg the argument to parse
Expand Down
76 changes: 51 additions & 25 deletions rcl/src/rcl/init.c
Original file line number Diff line number Diff line change
Expand Up @@ -19,19 +19,24 @@ extern "C"

#include "rcl/init.h"

#include "./arguments_impl.h"
#include "./common.h"
#include "./context_impl.h"
#include "./init_options_impl.h"
#include "rcl/arguments.h"
#include "rcl/error_handling.h"
#include "rcl/logging.h"
#include "rcutils/logging_macros.h"
#include "rcutils/stdatomic_helper.h"

#include "rmw/error_handling.h"

#include "tracetools/tracetools.h"

static atomic_uint_least64_t __rcl_next_unique_id = ATOMIC_VAR_INIT(1);
#include "rcl/arguments.h"
#include "rcl/error_handling.h"
#include "rcl/logging.h"

#include "./arguments_impl.h"
#include "./common.h"
#include "./context_impl.h"
#include "./init_options_impl.h"

static atomic_uint_least64_t g_rcl_next_unique_id = ATOMIC_VAR_INIT(1);
static atomic_uint_least64_t g_logging_ref_count = ATOMIC_VAR_INIT(0);

rcl_ret_t
rcl_init(
Expand Down Expand Up @@ -122,23 +127,35 @@ rcl_init(
goto fail;
}

ret = rcl_logging_configure(&context->global_arguments, &allocator);
if (RCL_RET_OK != ret) {
fail_ret = ret;
RCUTILS_LOG_ERROR_NAMED(
ROS_PACKAGE_NAME,
"Failed to configure logging: %s",
rcutils_get_error_string().str);
goto fail;
if (0u == rcutils_atomic_fetch_add_uint64_t(&g_logging_ref_count, 1)) {
ret = rcl_logging_configure(&context->global_arguments, &allocator);
Copy link
Member

@wjwwood wjwwood Feb 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚡️ context switch here, then:

  • in thread 2, start rcl_shutdown()
    • my theory is that rcl_shutdown() will fail because though g_logging_ref_count is > 0, trying to do rcl_logging_fini() will fail as it has not yet been configured

My conclusion is that rcl_init() is not thread-safe with rcl_shutdown().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm, I see what you meant now.

So, I always considered that the upper layer will ensure that each context calls rcl_init and rcl_shutdown in a thread safe manner (rcl_init and rcl_shutdown are still no thread-safe).
That's the case now in rclcpp (it's not the case in rclpy, but that's definitely wrong).

Under that situation, using an atomic count adds protection between rcl_init and rcl_shutdown calls on different objects, and avoids the need of a "global" mutex.[*]

My main reason for pushing this alternative, it's that it solves the problem in a more "minimalistic" way (only one PR, without adding functions like rcl_logging_increase_ref_count that we might want to deprecate later). It's easily to backport too.


[*] If calls in each context between rcl_init and rcl_shutdown are protected by an upper layer, the following situations are possible:

  • Only one context: It's thread safe because the upper layer ensures that.
  • Two contexts, init1-init2-shutdown1-shutdown2: The atomic ensures thread safety between init1 and init 2. The upper layer ensures thread safety between init1 and shutdown1.
  • Two contexts, init1-shutdown2-shutdown1: Calling shutdown in a non-inited context is handled correctly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what about this case:

init1-shutdown1-init2-...

Where shutdown1 is interrupted after adjusting the atomic count, but before it does log fini, and context switches to init2 which then increments it, finding that according to the count logging was not initialized, and tries to initialize logging when it already has been. Either the "initialize while already initialized" case will fail, or if that silently passes, when shutdown1 continues it will shutdown logging and leave init2 initialized but without logging.

I remain unconvinced that this is thread-safe even when using separate context objects.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh 🤦‍♂️, I completed miss that case. Thanks William!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok, this stuff is tricky. I've only learned to be very suspicious of creating thread-safety with just atomics after messing it up many, many times. :)

Maybe it's possible to salvage this approach with enough iteration, but honestly I think a global mutex provided by rclpy/rclcpp is the safest thing to do to protect the init/shutdown and therefore logging init.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's possible to salvage this approach with enough iteration, but honestly I think a global mutex provided by rclpy/rclcpp is the safest thing to do to protect the init/shutdown and therefore logging init.

Yes, I agree.
@fujitatomoya do you want to iterate and open PRs in rclpy and rclcpp? If not, I can take it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivanpauno

you can go ahead to take them! thanks for your effort!

if (RCL_RET_OK != ret) {
fail_ret = ret;
RCUTILS_LOG_ERROR_NAMED(
ROS_PACKAGE_NAME,
"Failed to configure logging: %s",
rcutils_get_error_string().str);
goto fail;
}
} else {
// Check if logging configurations were specified again or not.
// It's not possible to check that in the already parsed `context->global_arguments`
// as many of the configurations have default values matching a possible argument.
if (rcl_contains_logging_arguments(argc, argv)) {
RCUTILS_LOG_WARN_NAMED(
ROS_PACKAGE_NAME,
"rcl_init was called more than once with logging arguments,"
"ignoring the last logging configuration");
}
}

// Set the instance id.
uint64_t next_instance_id = rcutils_atomic_fetch_add_uint64_t(&__rcl_next_unique_id, 1);
uint64_t next_instance_id = rcutils_atomic_fetch_add_uint64_t(&g_rcl_next_unique_id, 1);
if (0 == next_instance_id) {
// Roll over occurred, this is an extremely unlikely occurrence.
RCL_SET_ERROR_MSG("unique rcl instance ids exhausted");
// Roll back to try to avoid the next call succeeding, but there's a data race here.
rcutils_atomic_store(&__rcl_next_unique_id, -1);
rcutils_atomic_store(&g_rcl_next_unique_id, -1);
goto fail;
}
rcutils_atomic_store((atomic_uint_least64_t *)(&context->instance_id_storage), next_instance_id);
Expand Down Expand Up @@ -184,13 +201,22 @@ rcl_shutdown(rcl_context_t * context)
RCL_SET_ERROR_MSG(rmw_get_error_string().str);
return rcl_convert_rmw_ret_to_rcl_ret(rmw_ret);
}

rcl_ret_t rcl_ret = rcl_logging_fini();
RCUTILS_LOG_ERROR_EXPRESSION_NAMED(
RCL_RET_OK != rcl_ret, ROS_PACKAGE_NAME,
"Failed to fini logging, rcl_ret_t: %i, rcl_error_str: %s", rcl_ret,
rcl_get_error_string().str);
rcl_reset_error();
size_t previous_logging_ref_count =
rcutils_atomic_fetch_add_uint64_t(&g_logging_ref_count, -1);

if (1u == previous_logging_ref_count) {
ivanpauno marked this conversation as resolved.
Show resolved Hide resolved
rcl_ret_t rcl_ret = rcl_logging_fini();
RCUTILS_LOG_ERROR_EXPRESSION_NAMED(
RCL_RET_OK != rcl_ret, ROS_PACKAGE_NAME,
"Failed to fini logging, rcl_ret_t: %i, rcl_error_str: %s", rcl_ret,
rcl_get_error_string().str);
rcl_reset_error();
} else if (0u >= previous_logging_ref_count) {
rcutils_atomic_fetch_add_uint64_t(&g_logging_ref_count, 1);
RCUTILS_LOG_ERROR_NAMED(
ROS_PACKAGE_NAME,
"logging was already finished");
}

return RCL_RET_OK;
}
Expand Down