Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add reset() support to the Transaction object #199

Open
svwilliams opened this issue Jun 24, 2020 · 2 comments
Open

Add reset() support to the Transaction object #199

svwilliams opened this issue Jun 24, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

@svwilliams
Copy link
Contributor

From @efernandez

I've been thinking of some changes for the ignition/reset logic. Below I try to explain my proposed changes in brief:

  1. As a building piece, let's add the ability to remove all the variables or constraints in the graph by specifying that in the Transaction and changing the Graph::update logic, which is currently implemented as follows, clearly not supporting that:

void Graph::update(const Transaction& transaction)
{
// Update the graph with a new transaction. In order to keep the graph consistent, variables are added first,
// followed by the constraints which might use the newly added variables. Then constraints are removed so that
// the variable usage is updated. Finally, variables are removed.
// Insert the new variables into the graph
for (const auto& variable : transaction.addedVariables())
{
addVariable(variable.clone());
}
// Insert the new constraints into the graph
for (const auto& constraint : transaction.addedConstraints())
{
addConstraint(constraint.clone());
}
// Delete constraints from the graph
for (const auto& constraint_uuid : transaction.removedConstraints())
{
removeConstraint(constraint_uuid);
}
// Delete variables from the graph
for (const auto& variable_uuid : transaction.removedVariables())
{
removeVariable(variable_uuid);
}
}

  1. Since we currently serialize the Transaction class, we don't want to break any previously serialized transaction recorded, so we can use fuse_core::uuid::NIL as a special value to indicate ALL variables or constraints, and use that in the removedConstraints() or removedVariables() to indicate we want all of them removed from the graph. This means any sensor model can request the graph to be cleared at any time by sending a transaction like the following to the optimizer:
fuse_core::Transaction transaction;
transaction.removeVariable(fuse_core::uuid::NIL);
transaction.removeConstraint(fuse_core::uuid::NIL);

The actual way we request all variables or constraints to be removed from the graphs could be done hiding the detail of the special UUID value used to indicate that. That means we could have:

transaction.removeAllVaraibles();
transaction.removeAllConstraints();

or probably only allow to remove all variables if also all constraints are removed, in which case we could simply expose a method to do transaction.removeAll(); or transaction.clear();, or something similar.

  1. The Graph::update logic should handle this new case. This could be done with the following naive implementation:
void Graph::update(const Transaction& transaction)
{
  // Remove all constraints from the graph
  for (const auto& constraint_uuid : transaction.removedConstraints())
  {
    if (constraint_uuid == fuse_core::uuid::NIL)
    {
      clearConstraints();
      break;
    }
  }
  // Remove all variables from the graph
  for (const auto& variable_uuid : transaction.removedVariables())
  {
    if (variable_uuid == fuse_core::uuid::NIL)
    {
      clearVariables();
      break;
    }
  }
  // Insert the new variables into the graph
  for (const auto& variable : transaction.addedVariables())
  {
    addVariable(variable.clone());
  }
  // Insert the new constraints into the graph
  for (const auto& constraint : transaction.addedConstraints())
  {
    addConstraint(constraint.clone());
  }
  // Delete constraints from the graph
  for (const auto& constraint_uuid : transaction.removedConstraints())
  {
    if (constraint_uuid != fuse_core::uuid::NIL)
    {
      removeConstraint(constraint_uuid);
    }
  }
  // Delete variables from the graph
  for (const auto& variable_uuid : transaction.removedVariables())
  {
    if (variable_uuid != fuse_core::uuid::NIL)
    {
      removeVariable(variable_uuid);
    }
  }
}

where clearConstraints() and clearVariables() could be new pure virtual methods:

virtual void clearConstraints() = 0;
virtual void clearVariables() = 0;

that could be efficiently implemented as follows for the fuse_graphs::HashGraph:

void clearConstraints()
{
  constraints_.clear();
  constraints_by_variable_uuid_.clear();
}

void clearVariables()
{
  variables_.clear();
  variables_on_hold_.clear();
}

I believe Transaction::merge() should work as is:

void Transaction::merge(const Transaction& other, bool overwrite)
{
stamp_ = std::max(stamp_, other.stamp_);
involved_stamps_.insert(other.involved_stamps_.begin(), other.involved_stamps_.end());
for (const auto& added_constraint : other.added_constraints_)
{
addConstraint(added_constraint, overwrite);
}
for (const auto& removed_constraint : other.removed_constraints_)
{
removeConstraint(removed_constraint);
}
for (const auto& added_variable : other.added_variables_)
{
addVariable(added_variable, overwrite);
}
for (const auto& removed_variable : other.removed_variables_)
{
removeVariable(removed_variable);
}
}

It shouldn't be a problem to have more than one fuse_core::uuid::NIL removed variable or constraint, although it'd be handled as a single one. Similarly, with the implementation proposed above for the Graph::update() method, the removal of all variables or constraints in the graph would happen before any additions. This allows an ignition sensor to create a new prior constraint at the same time it request the current graph to be cleared. For instance, the fuse_models::Unicycle2DIgnition that sends this transaction:

auto transaction = fuse_core::Transaction::make_shared();
transaction->stamp(stamp);
transaction->addInvolvedStamp(stamp);
transaction->addVariable(position);
transaction->addVariable(orientation);
transaction->addVariable(linear_velocity);
transaction->addVariable(angular_velocity);
transaction->addVariable(linear_acceleration);
transaction->addConstraint(position_constraint);
transaction->addConstraint(orientation_constraint);
transaction->addConstraint(linear_velocity_constraint);
transaction->addConstraint(angular_velocity_constraint);
transaction->addConstraint(linear_acceleration_constraint);
// Send the transaction to the optimizer.
sendTransaction(transaction);

Would simply have to add this to that transaction before calling sendTransaction:

transaction->removeVariable(fuse_core::uuid::NIL);
transaction->removeConstraint(fuse_core::uuid::NIL);

It shouldn't be a problem if this transactions get merged with others, that would add more variables and constraints, because the graph would be cleared before adding any variables or constraints. This will still rely on the optimizer to purge any transaction before the ignition one. If we want, I think it'd be possible to detect a transaction has fuse_core::uuid::NIL, and consider that as an ignition transaction, or a non-mergeable transaction. Technically, it can get merged with future transaction, but not with past ones. But if we never merge it, it'd be fine, and equivalent to what we do now.

I believe this would allow us to get rid of the reset logic in the ignition sensor models:

if (!params_.reset_service.empty())
{
// Wait for the reset service
while (!reset_client_.waitForExistence(ros::Duration(10.0)) && ros::ok())
{
ROS_WARN_STREAM("Waiting for '" << reset_client_.getService() << "' service to become avaiable.");
}
auto srv = std_srvs::Empty();
if (!reset_client_.call(srv))
{
// The reset() service failed. Propagate that failure to the caller of this service.
throw std::runtime_error("Failed to call the '" + reset_client_.getService() + "' service.");
}
}

This would make them easier to implement. It'd also make things faster to update and we wouldn't lose any transactions that we currently lose when we call the reset service. That happens because it takes time to reset, and it stops all the plugins, so they stop receiving measurements. The approach proposed here is equivalent to not using the reset service, which can actually be used now. So certainly, this functionality already exists. The only difference is that with this change we can indicate we want the graph to be cleared at the transaction level, which I believe we can't do now.

I wonder if ignition sensor really have to support calling the reset service. After all, if the graph gets cleared, all plugins will get notified about that. Maybe it's too late and some sensor models could have already generated constraints that reference variables from the graph before clearing it. 🤔

@svwilliams
Copy link
Contributor Author

I am definitely not opposed to streamlining the reset operation. Adding a reset option to the transaction is an interesting idea, but I'll need to spend some time to fully understand all of the ramifications.

But a few immediate thoughts:

  • I dislike abusing the UUID and giving NIL a special meaning. I'd rather extend the transaction to directly support the reset feature. The Boost Serialization system has support for deserializing different versions of the same class, so it should be possible to extend the Transaction and still allow older versions to be deserialized. To be clear, I've never used that feature, but in theory it is possible.
    https://www.boost.org/doc/libs/1_65_0/libs/serialization/doc/tutorial.html#versioning
  • One of the reasons for having the sensors do an unsubscribe - subscribe cycle during reset is to ensure the ROS callback queues get cleared of old data and we start fresh after the reset. It also gives the sensors and motion models a hook to clear out any internal state. While most of the sensor models have rather trivial reset logic, the internal state clearing is important to the motion model state tracking. In any alternative reset implementation, we need to make sure the sensors and motions have that reset hook, and that the reset gets triggered at the correct time.
    https://github.com/locusrobotics/fuse/blob/devel/fuse_models/src/unicycle_2d.cpp#L225
  • If the goal is to make the reset operation "faster", there may be better sensor reset logic. Maybe we clear the callback queue instead of doing an unsubscribe-subscribe cycle? Maybe we transmit the reset time to the sensors/motion models and let them filter out unwanted messages?
  • Before the addition of reset logic, the stop()/start() methods were added to support the robot_localization "enable service" feature. I think that is an important feature to keep, and having the stop() methods actually unsubscribe from the topics seems like the proper operation. The reset implementation then piggybacked off of those functions and did a stop()-start() cycle. We could make a special reset() method in addition to the stop() and start() methods, with a default implementation that does a stop()-start() cycle. That would allow special "fast" reset logic to be implemented.

@efernandez
Copy link
Collaborator

  • I dislike abusing the UUID and giving NIL a special meaning.

If Boost Serialization versioning works well, yes, I totally agree. 👍

  • In any alternative reset implementation, we need to make sure the sensors and motions have that reset hook, and that the reset gets triggered at the correct time.

I admit I didn't think about that this time, but I actually gave that a thought some time ago and I just forgot to check what I wrote down and share it. This is what I came with, which I believe it'd address the need for a reset hook:

Currently, the fixed lag smoother relies on a reset service to reset the optimizer state: https://github.com/clearpathrobotics/fuse/blob/e9366b2a15bbaccb84f060f17d45fd0972c52e79/fuse_optimizers/src/fixed_lag_smoother.cpp#L395

This means the ignition sensor models must call that reset service, which looks a bit sketchy: https://github.com/clearpathrobotics/fuse/blob/e9366b2a15bbaccb84f060f17d45fd0972c52e79/fuse_models/src/unicycle_2d_ignition.cpp#L236

Instead of that, we could have a reset callback in the sensor models, so the reset service doesn't have to be called. We can still have the reset service, but the ignition sensor models wouldn't have to use it. Instead, they can use a cleaner and more efficient mechanism.

This should be relatively easy to implement:

  1. Take an optional (defaults to nullptr) reset_callback in fuse_core::SensorModel::initialize: https://github.com/clearpathrobotics/fuse/blob/e9366b2a15bbaccb84f060f17d45fd0972c52e79/fuse_core/include/fuse_core/sensor_model.h#L103
  2. The fuse_core::AsyncSensorModel::initialize implementation only has to store the callback doing reset_callback_ = reset_callback. Something similar to https://github.com/clearpathrobotics/fuse/blob/e9366b2a15bbaccb84f060f17d45fd0972c52e79/fuse_core/src/async_sensor_model.cpp#L72
  3. A reset method could be added, which would be equivalent to sendTransaction: https://github.com/clearpathrobotics/fuse/blob/e9366b2a15bbaccb84f060f17d45fd0972c52e79/fuse_core/src/async_sensor_model.cpp#L81
  4. The reset service call in https://github.com/clearpathrobotics/fuse/blob/e9366b2a15bbaccb84f060f17d45fd0972c52e79/fuse_models/src/unicycle_2d_ignition.cpp#L236 can be replace to a call to reset(), and all the service client stuff can be cleaned up, as well as the related code, like error handling and ROS parameters.
  5. The fuse_optimizers::Optimizer would have a resetCallback, that would have a default no-op implementation, since only the fixed lag smoother needs it. It'd be similar to the transactionCallback: https://github.com/clearpathrobotics/fuse/blob/e9366b2a15bbaccb84f060f17d45fd0972c52e79/fuse_optimizers/include/fuse_optimizers/optimizer.h#L154
  6. The optimizer would initialize the sensor models injecting the resetCallback in a similar way to the transactionCallback: https://github.com/clearpathrobotics/fuse/blob/e9366b2a15bbaccb84f060f17d45fd0972c52e79/fuse_optimizers/src/optimizer.cpp#L253, https://github.com/clearpathrobotics/fuse/blob/e9366b2a15bbaccb84f060f17d45fd0972c52e79/fuse_optimizers/src/optimizer.cpp#L422
  7. The fixed lag smoother would implement the virtual resetCallback method with the code currently used for the reset service callback: https://github.com/clearpathrobotics/fuse/blob/e9366b2a15bbaccb84f060f17d45fd0972c52e79/fuse_optimizers/src/fixed_lag_smoother.cpp#L395
  8. And the resetServiceCallback would simply call the resetCallback

In principle, this should also allow to re-implement the fuse_models::Unicycle2DIgnition sensor models using the onStart and onStop methods, instead of the start and stop ones. But I haven't look at that part yet.

  • Maybe we clear the callback queue instead of doing an unsubscribe-subscribe cycle? Maybe we transmit the reset time to the sensors/motion models and let them filter out unwanted messages?

First of all, the motivation is to make the reset logic faster, but actually in the sense that we don't miss measurements while resetting, which happens with the existing fuse_models because the reset logic stops them, meaning they shutdown their subscribers. When they're restarted, we lose some measurements, which in general is fine, but not always. Certainly, the need for this is a bit marginal. I only needed this for playing back recorded transactions after a recorded graph. But I guess there could be other use cases.

We need to be careful if we end up clearing the callback queue, since IIRC that could leave things in a bad state, probably only if done from inside a callback. Either way, I believe any transaction generated by the sensor models would be filtered out in the optimizer, since it'd be older than the reset transaction, which would have a logic similar to the current one for the ignition sensor models. I think we still need that part.

Transmitting the reset time to the sensor/motion models might be an option. But TBH I don't want to push for a change that would make things more complex.

  • We could make a special reset() method in addition to the stop() and start() method

Makes sense and this sounds similar to what I'm proposing above, with the reset_callback.

Thanks for creating an issue to discuss this. What are your thoughts about the reset_callback? I believe that's something quite trivial and independent of the reset support in the transactions, so maybe it could be done first.

@efernandez efernandez added the enhancement New feature or request label Jul 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

No branches or pull requests

2 participants