When a joint goes in HF, errors like `positionMoveRaw: skipping command` flood the logger #768

S-Dafarra · 2021-10-18T15:45:52Z

Bug description

When writing simple applications to control the robot, we are used to sending commands to joints without checking the current control mode. See for example https://github.com/robotology/icub-tutorials/blob/23ac25487d5a872030b82b38a28eb44b2cb3bda3/src/motorControlBasic/tutorial_arm.cpp#L129

This is because we usually set the desired control mode only once, during the initialization phase; during the control loop we simply read some measurements and set the joint references.

It may happen, though, that some joint goes into HF (hardware fault) after startup. This happens quite easily with the hands. After that, the logger quickly fills up with messages like:

<ERROR> velocityMoveRaw: skipping command because  BOARD left_arm-eb26-j12_15 (IP 10.0.1.26)   joint  2  is not in VOCAB_CM_VELOCITY mode

Since the top-level application usually does not check the control mode at each control loop iteration, it keeps sending references to the joint, producing errors like the one above. Since the control loop usually runs at 100 Hz, this saturates the logger very quickly, making it hard to spot other errors.

cc @pattacini @marcoaccame

Steps to reproduce

It should be enough to run the tutorial https://github.com/robotology/icub-tutorials/tree/23ac25487d5a872030b82b38a28eb44b2cb3bda3/src/motorControlBasic on the robot and change the control mode of the controlled joints to something different from Position and Idle.

Expected behavior

The errors of

icub-main/src/libraries/icubmod/embObjMotionControl/embObjMotionControl.cpp, line 2224 in 601af52:

    yError() << "positionMoveRaw: skipping command because " << getBoardInfo() << " joint " << j << " is not in VOCAB_CM_POSITION mode";

icub-main/src/libraries/icubmod/embObjMotionControl/embObjMotionControl.cpp, line 1897 in 601af52:

    yError() << "velocityMoveRaw: skipping command because " << getBoardInfo() << " joint " << j << " is not in VOCAB_CM_VELOCITY mode";

and icub-main/src/libraries/icubmod/embObjMotionControl/embObjMotionControl.cpp, line 4106 in 601af52:

    yError() << "setReferenceRaw: skipping command because" << getBoardInfo() << " joint " << j << " is not in VOCAB_CM_POSITION_DIRECT mode";

should be limited in time, eventually sending the error only once, and sending it again when the control mode changes.
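
To make the "send it once, then again when the control mode changes" idea concrete, here is a minimal sketch. It is not code from icub-main: `ModeMismatchReporter`, `kNoMode` and the use of plain `int` mode values are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: -1 is never used as a real control mode value.
constexpr int kNoMode = -1;

// Hypothetical helper (not part of icub-main): decides whether a
// "skipping command" error should be printed for a given joint.
class ModeMismatchReporter {
public:
    explicit ModeMismatchReporter(std::size_t numJoints)
        : lastReported_(numJoints, kNoMode) {}

    // Call when a command for `joint` is skipped because the joint is in
    // `currentMode`. Returns true only if this mode has not been reported
    // yet for this joint, i.e. the first time or after a mode change.
    bool shouldReport(std::size_t joint, int currentMode) {
        assert(joint < lastReported_.size());
        if (lastReported_[joint] == currentMode) {
            return false;  // already reported, stay silent
        }
        lastReported_[joint] = currentMode;
        return true;
    }

    // Call when a command for `joint` is accepted, so that a later
    // mismatch is reported again.
    void clear(std::size_t joint) {
        assert(joint < lastReported_.size());
        lastReported_[joint] = kNoMode;
    }

private:
    std::vector<int> lastReported_;  // last mode reported, per joint
};
```

With such a helper, the `yError()` call would be guarded by `shouldReport(j, mode)`, and `clear(j)` would be called whenever a command goes through. The cost is one integer per joint.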

Example repository

No response

Additional context

No response

pattacini · 2021-10-18T21:20:40Z

This is not a bug but rather a feature request, I'd say.

In my opinion, the high-level SW should somehow be responsible for checking the status of the boards. We do this, for example, in the Cartesian and Gaze controllers.
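
As a rough illustration of such a check on the application side, the sketch below sends a reference only to joints that are in the expected mode. `ControlMode`, `JointDriver`, `FakeDriver` and `sendPositionReferences` are hypothetical stand-ins introduced for this example; a real application would query the mode through YARP (e.g. `IControlMode::getControlMode`) and send references through `IPositionControl::positionMove`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the control modes and the driver interface.
enum class ControlMode { Idle, Position, Velocity, HardwareFault };

struct JointDriver {
    virtual ~JointDriver() = default;
    virtual ControlMode getControlMode(std::size_t joint) = 0;
    virtual void positionMove(std::size_t joint, double ref) = 0;
};

// Sends a position reference only to joints that are in Position mode.
// Returns the number of joints that were skipped.
std::size_t sendPositionReferences(JointDriver& drv,
                                   const std::vector<double>& refs) {
    std::size_t skipped = 0;
    for (std::size_t j = 0; j < refs.size(); ++j) {
        if (drv.getControlMode(j) == ControlMode::Position) {
            drv.positionMove(j, refs[j]);
        } else {
            ++skipped;  // e.g. the joint went into hardware fault
        }
    }
    return skipped;
}

// In-memory fake used only to exercise the helper above.
struct FakeDriver : JointDriver {
    std::vector<ControlMode> modes;
    std::vector<double> lastRef;

    explicit FakeDriver(std::vector<ControlMode> m)
        : modes(std::move(m)), lastRef(modes.size(), 0.0) {}

    ControlMode getControlMode(std::size_t joint) override {
        assert(joint < modes.size());
        return modes[joint];
    }
    void positionMove(std::size_t joint, double ref) override {
        assert(joint < lastRef.size());
        lastRef[joint] = ref;
    }
};
```

The trade-off is one extra mode query per joint and per cycle, which is why a driver-side mitigation is still worth considering.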

However, due to some other limitations (see robotology/community#558) that may turn out to be severe in certain conditions, I reckon we could come up with a kind of workaround.

Regarding the workaround strategy below:

> should be limited in time, eventually sending the error only once, and sending it again when the control mode changes.

I wouldn't yet rule out the option of continuing to trigger the error, but at a much lower frequency. We'd need to ponder all the possibilities.

Stay tuned.

pattacini · 2021-10-19T18:28:41Z

To illustrate what I have in mind, I've fast-prototyped the handling logic in Stateflow.

[Figure: The model with the input and the outputs]

[Figure: A closer look at the FSM chart composed of 2 parallel subcharts]

Essentially, the FSM receives as input a boolean that is 1 when an error message is being triggered and 0 otherwise. It yields as output a boolean with the same meaning, which however undergoes a sort of smart down-sampling, plus a counter that tells us how many original errors have been raised in a given temporal window.

The handler is composed of two parallel charts:

- EvalOccurence is devoted to evaluating the number of errors triggered in a given temporal window (default = 1 sec).
- ErrorHandler is the actual handler implementing the logic below:
  - If the occurrence of the errors is above a threshold, then it triggers the output only once per second (i.e., same lapse as above).
  - Otherwise, output = input.

We have only 2 params:

- The temporal window, used to evaluate the frequency of the input errors and to carry out down-sampling at the output stage.
- The threshold on the number of errors detected in the window, above which down-sampling is triggered (default = 5).

The output of the handler is twofold:

- A boolean that tells when to print the message.
- An integer that accounts for the number of errors that occurred since the last print (this info can be used to populate the message).
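
The logic described above can be approximated in a few lines of C++. This is only a sketch of one possible reading of the description, not the Stateflow model nor code generated from it: the class name `ErrorDownsampler`, the tick-based window (`windowTicks` steps instead of seconds) and the member names are all assumptions.

```cpp
#include <cassert>

// Hypothetical tick-based error down-sampler.
class ErrorDownsampler {
public:
    struct Output {
        bool print;  // true when a message should be printed
        int count;   // errors accumulated since the last print
    };

    // windowTicks: length of the temporal window, in calls to step().
    // threshold:   errors per window above which down-sampling starts.
    ErrorDownsampler(int windowTicks, int threshold)
        : windowTicks_(windowTicks), threshold_(threshold) {
        assert(windowTicks > 0 && threshold >= 0);
    }

    // Advance by one tick; `error` tells whether an error was raised.
    Output step(bool error) {
        if (error) {
            ++inWindow_;
            ++sincePrint_;
        }

        Output out{false, 0};
        if (!downsample_ && error) {
            // Pass-through: output = input.
            out = {true, sincePrint_};
            sincePrint_ = 0;
        }

        if (++tick_ == windowTicks_) {
            // Window closed: while down-sampling, emit a single message
            // that summarizes the errors collected in the window.
            if (downsample_ && sincePrint_ > 0) {
                out = {true, sincePrint_};
                sincePrint_ = 0;
            }
            // Re-evaluate the occurrence for the next window.
            downsample_ = inWindow_ > threshold_;
            tick_ = 0;
            inWindow_ = 0;
        }
        return out;
    }

    bool downsampling() const { return downsample_; }

private:
    int windowTicks_;
    int threshold_;
    int tick_ = 0;
    int inWindow_ = 0;
    int sincePrint_ = 0;
    bool downsample_ = false;
};
```

At 100 Hz, the defaults mentioned above would correspond to `ErrorDownsampler(100, 5)`: the first second of errors is printed as is, then at most one message per second is emitted, carrying the number of errors it stands for.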

Here's a typical outcome: the output follows the input only within the initial window that serves to evaluate the occurrence of the errors. After that, the output gets triggered only in single instances.

The FSM can obviously be used to generate code.

You can play with the model: error_downsampler.zip.

S-Dafarra · 2021-10-20T07:15:17Z

That's pretty cool! At the moment, a different error is thrown by each joint. We had cases in which a MAIS board shut down, and then all 9 joints of a hand went into HF. In that case, those lines were producing 10000 errors in about 10 seconds. I guess the mechanism you described considers each joint separately, right?

pattacini · 2021-10-20T07:25:27Z

At the moment the FSM is agnostic wrt other info like the joint number and the initial control mode as it receives only a boolean accounting for the occurrence of the input errors.

Along this line, we may then apply this algorithm as a function to each individual type of error (per joint and per mode). The reduction will still be very significant, although we may require too much memory just for that.

Alternatively, we may let the printouts show up initially with their info (i.e., joint number and control modes) and then print only a cumulative agnostic error message. In this case, we need only one instance of the handler, I guess.

pattacini · 2021-10-20T09:30:42Z

🟢 Just pushed the model to `event-downsampler`.

This way, we may use the internal state DOWNSAMPLE as in the following meta code:

IN = false;

if (error_1) {
  IN = true;
  if (!FSM.DOWNSAMPLE) {
    yError() << "positionMoveRaw: skipping command because " << getBoardInfo() << " joint " << j << " is not in VOCAB_CM_POSITION mode";
  }
}

if (error_2) {
  IN = true;
  if (!FSM.DOWNSAMPLE) {
    yError() << "velocityMoveRaw: skipping command because " << getBoardInfo() << " joint " << j << " is not in VOCAB_CM_VELOCITY mode";
  }
}

// ...

FSM_step(IN);

if (FSM.DOWNSAMPLE && FSM.OUT) {
  yError() << "Skipping the requested command as the board is not in the correct control mode. Detected" << FSM.CNT << "errors on aggregate since the last message";
}

pattacini · 2021-10-25T12:14:06Z

I had a deeper look at the code and found out that we'd need to have a timer running at a reasonable rate (usually 10 ms) while collecting possible asynchronous input events. Before, the assumption was that we could run at the fastest rate, which no longer holds.

Therefore, I've refactored the model as per https://github.com/icub-tech-iit/matlab-tools/tree/master/event-downsampler, where now we have essentially a counter as input in place of a boolean.
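
A possible shape for this counter-based variant is sketched below: error sites only bump an atomic counter, while a periodic timer drains it and runs the down-sampling logic. This is an illustrative guess, not the model in the linked repository; `EventCounterDownsampler`, `notify()` and the tick-based window are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical counter-driven down-sampler, stepped by a periodic timer.
class EventCounterDownsampler {
public:
    struct Output {
        bool print;  // true when a message should be printed
        int count;   // events accumulated since the last print
    };

    // windowTicks: window length in timer periods (e.g. 100 for 1 s at 10 ms).
    // threshold:   events per window above which down-sampling starts.
    EventCounterDownsampler(int windowTicks, int threshold)
        : windowTicks_(windowTicks), threshold_(threshold) {
        assert(windowTicks > 0 && threshold >= 0);
    }

    // May be called asynchronously, from any thread, on each error.
    void notify() { pending_.fetch_add(1, std::memory_order_relaxed); }

    // Called by the periodic timer only.
    Output step() {
        // Drain the events collected since the previous timer tick.
        const int events = pending_.exchange(0, std::memory_order_relaxed);
        inWindow_ += events;
        sincePrint_ += events;

        Output out{false, 0};
        if (!downsample_ && events > 0) {
            // Below threshold: report right away.
            out = {true, sincePrint_};
            sincePrint_ = 0;
        }

        if (++tick_ == windowTicks_) {
            // Window closed: emit one summary while down-sampling.
            if (downsample_ && sincePrint_ > 0) {
                out = {true, sincePrint_};
                sincePrint_ = 0;
            }
            downsample_ = inWindow_ > threshold_;
            tick_ = 0;
            inWindow_ = 0;
        }
        return out;
    }

private:
    std::atomic<int> pending_{0};
    int windowTicks_;
    int threshold_;
    int tick_ = 0;
    int inWindow_ = 0;
    int sincePrint_ = 0;
    bool downsample_ = false;
};
```

Keeping `notify()` down to a single atomic increment means the error sites never block, and the timer thread alone owns the rest of the state.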

pattacini · 2021-11-10T13:22:28Z

@mfussi66 and @davidetome are working on this in https://github.com/mfussi66/icub-main/tree/devel.

pattacini · 2021-11-16T13:57:25Z

Done in #770.

S-Dafarra added the Type: Bug label Oct 18, 2021

pattacini changed the title ~~When a joint goes in HF, errors like positionMoveRaw: skipping command flood the logger~~ When a joint goes in HF, errors like `positionMoveRaw: skipping command` flood the logger Oct 18, 2021

pattacini added Type: Enhancement and removed Type: Bug labels Oct 18, 2021

pattacini self-assigned this Oct 19, 2021

pattacini assigned mfussi66 and davidetome Nov 10, 2021

mfussi66 mentioned this issue Nov 16, 2021

Add message downsampler to avoid logger flooding when changing control mode #770

Merged

pattacini linked a pull request Nov 16, 2021 that will close this issue

Add message downsampler to avoid logger flooding when changing control mode #770

Merged

pattacini closed this as completed Nov 16, 2021
