
Add reset to batch opt #360

Merged

Conversation

Contributor

@jakemclaughlin6 jakemclaughlin6 commented Mar 15, 2024

Adding the same reset service from the fixed-lag smoother to the batch optimizer. This also adds a mutex around the optimization (which performs fine in my experiments) to ensure that, on reset, we have sole access to the graph.

@jakemclaughlin6
Contributor Author

@svwilliams This should be a relatively non-invasive addition; let me know if otherwise. If so, for my use case I can just extend the existing batch optimizer with a reset function.

Contributor

@svwilliams svwilliams left a comment

I believe there is a locking error here. See in-line comments for details.

Comment on lines 260 to 271
// DANGER: The optimizationLoop() function obtains the optimization_mutex_ lock and the
// pending_transactions_mutex_ lock at the same time. We perform a parallel locking scheme here to
// prevent the possibility of deadlocks.
{
  std::lock_guard<std::mutex> lock(optimization_mutex_);
  // Clear all pending transactions
  {
    std::lock_guard<std::mutex> lock(pending_transactions_mutex_);
    pending_transactions_.clear();
  }
  // Clear the graph and marginal tracking states
  graph_->clear();

Unlike the fixed lag smoother, the batch optimizer does not lock the pending_transactions_mutex_ during the optimization loop. So you should not need to worry about the locking order as much.

However, the optimization loop does consist of two steps:

  1. Make a local copy of the combined_transaction_
  2. Update the graph from the local transaction copy

It is possible for the reset() operation to interrupt the optimization loop in between Step 1 and Step 2. If that happens, then the local transaction copy would contain variables and constraints but the graph would have been cleared by the reset logic. And that will cause problems. We need to ensure that the state of the graph does not change after the local transaction copy has been created.

Something like:

In BatchOptimizer::optimizationLoop()

  {
    std::lock_guard<std::mutex> lock(optimization_mutex_);
    // Copy the combined transaction so it can be shared with all the plugins
    fuse_core::Transaction::ConstSharedPtr const_transaction;
    {
      std::lock_guard<std::mutex> lock(combined_transaction_mutex_);
      const_transaction = std::move(combined_transaction_);
      combined_transaction_ = fuse_core::Transaction::make_shared();
    }
    // Update the graph
    graph_->update(*const_transaction);
    // Optimize the entire graph
    graph_->optimize(params_.solver_options);
    // Make a copy of the graph to share
    fuse_core::Graph::ConstSharedPtr const_graph = graph_->clone();
    // Optimization is complete. Notify all the things about the graph changes.
    notify(const_transaction, const_graph);
    // Clear the request flag now that this optimization cycle is complete
    optimization_request_ = false;
  }

Now optimization_mutex_ and combined_transaction_mutex_ are locked at the same time, so we are in a similar situation as the fixed-lag smoother.

In BatchOptimizer::resetServiceCallback()

  // DANGER: The optimizationLoop() function obtains the optimization_mutex_ lock and the
  //         combined_transaction_mutex_ lock at the same time. We perform a parallel locking scheme here to
  //         prevent the possibility of deadlocks.
  {
    std::lock_guard<std::mutex> lock(optimization_mutex_);
    // Reset the combined transaction
    {
      std::lock_guard<std::mutex> lock(combined_transaction_mutex_);
      combined_transaction_ = fuse_core::Transaction::make_shared();
    }
    // Clear the graph and marginal tracking states
    graph_->clear();
  }
  // Clear all pending transactions
  // The transaction callback and the optimization timer callback are the only other locations where
  // the pending_transactions_ variable is modified. As long as the BatchOptimizer node handle is
  // single-threaded, then pending_transactions_ variable cannot be modified while the reset callback
  // is running. Therefore, there are no timing or sequence issues with exactly where inside the reset
  // service callback the pending_transactions_ are cleared.
  {
    std::lock_guard<std::mutex> lock(pending_transactions_mutex_);
    pending_transactions_.clear();
  }

@DavidLocus DavidLocus left a comment


LGTM

@jakemclaughlin6
Contributor Author

If this is good to merge, then someone else will have to merge it, as I don't have access. @svwilliams @DavidLocus

@svwilliams
Contributor

Yep, I'm working on a ROS 2 port right now. I'll get this merged in later today.

@svwilliams svwilliams merged commit 415971a into locusrobotics:devel Mar 21, 2024
1 check passed