Skip to content
This repository has been archived by the owner on Nov 11, 2024. It is now read-only.

gnss: remove device from global list when no poweroff #147

Merged
merged 1 commit into from
Oct 15, 2023

Conversation

alonbl
Copy link
Contributor

@alonbl alonbl commented Oct 15, 2023

The following code should be valid to recover from a situation in which the GNSS is disconnected/connected or power up.

  while((uret = uDeviceOpen(&gDeviceCfg,
          &device_handle)) != U_ERROR_COMMON_SUCCESS) {
  	uPortTaskBlock(5000);
  }

Current implementation fails with U_ERROR_COMMON_INVALID_PARAMETER with no recovery option, this loop is infinite connecting GNSS will have no affect.

The reason is that the removeDevice() is called with powerOff=false, this result in not removing the device from the global device list.

When the device is opened again it is already found within the list at by pGetGnssInstanceTransportHandle which is called by uGnssAdd(), uGnssAdd() returns U_ERROR_COMMON_INVALID_PARAMETER.

The software should release any resources attached to a device regardless of poweron/poweroff, the resources are logical and may be created when required.

This patch removes the device from the list in case of poweroff was not requested.

The following code should be valid to recover from a situation in which the
GNSS is disconnected/connected or power up.

  while((uret = uDeviceOpen(&gDeviceCfg,
          &device_handle)) != U_ERROR_COMMON_SUCCESS) {
  	uPortTaskBlock(5000);
  }

Current implementation fails with U_ERROR_COMMON_INVALID_PARAMETER with no
recovery option, this loop is infinite connecting GNSS will have no affect.

The reason is that the removeDevice() is called with powerOff=false, this
result in not removing the device from the global device list.

When the device is opened again it is already found within the list at
by pGetGnssInstanceTransportHandle which is called by uGnssAdd(), uGnssAdd()
returns U_ERROR_COMMON_INVALID_PARAMETER.

The software should release any resources attached to a device regardless of
poweron/poweroff, the resources are logical and may be created when required.

This patch removes the device from the list in case of poweroff was not
requested.

Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com>
@RobMeades
Copy link
Contributor

RobMeades commented Oct 15, 2023

Trying to recall if there was a reason why I did it that way and also pushing to test system in case it objects...

@RobMeades
Copy link
Contributor

Hmm, no, a look at the code suggests that the cellular equivalent does a proper remove, looks like this was just a mistake for the GNSS case. Will merge if test system says "yo".

Copy link
Contributor

@RobMeades RobMeades left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, the equivalent device code for cellular code does a proper removal, this appears to be an oversight for the GNSS case. The test system shows now issue with this change, hence looks good both to me and the machine.

Merging...

@RobMeades RobMeades merged commit b3aa942 into u-blox:master Oct 15, 2023
@alonbl
Copy link
Contributor Author

alonbl commented Oct 17, 2023

Hello @RobMeades,

Thinking about this a bit more... leaving the device instance open when poweoff failure will result in a leak as the opendevice will fail and not allow user to release the resources or reattempt as the next open will fail.

I suggest to remove the poweron/poweroff from the device open so the user will be able to control the close sequence, or to release the device in any case if open failed regardless of power state.

Alon

@RobMeades
Copy link
Contributor

Hmmm, yes, so maybe it should be:

static int32_t removeDevice(uDeviceHandle_t devHandle, bool powerOff)
{
    int32_t errorCode = (int32_t) U_ERROR_COMMON_INVALID_PARAMETER;
    uDeviceGnssInstance_t *pContext = (uDeviceGnssInstance_t *) U_DEVICE_INSTANCE(devHandle)->pContext;
    if (pContext != NULL) {
        errorCode = (int32_t) U_ERROR_COMMON_SUCCESS;
        if (powerOff) {
            //  Whatever happens if we got this far we are releasing the device, error code ignored
            uGnssPwrOff(devHandle);
        }
        uGnssRemove(devHandle);
        uPortFree(pContext);
    }
}

@alonbl
Copy link
Contributor Author

alonbl commented Oct 17, 2023

@RobMeades library should notify the user that we could not power off the device in addition to the original error. as maybe some other action should be done. this is why playing with power should probably be separated from device behaviors.

@RobMeades
Copy link
Contributor

RobMeades commented Oct 18, 2023

Apologies for the delay in replying: needed to complete a mutexed operation before I could free-up processing time :-).

There are two cases:

  • uDeviceClose() is called with the power-off flag set to true. In the GNSS case this calls uDevicePrivateGnssRemove(), which calls removeDevice(), which returns to the user the errorCode of the power-off process failure and, with the code as it stands, does not in any way remove the device, it is still there and powered on, which is what the caller is meant to interpret from the negative error code. If uDeviceClose() is called with the power-off flag set to false then the code, as it stands, will remove the device and leave it powered on, no error can occur in this case; the user has achieved their aim, so if the user had got "stuck" in not being able to uDeviceClose() a device that was, for some reason, never responding to commands to power-off, this is the way out: don't power if off 'cos it's not actually possible to power it off it would seem, just release the resources it was occupying.

  • uDeviceOpen(), the purpose of which is to put a device into a state where it can be communicated-with (hence it must be powered-on), is called and fails for some reason. In the GNSS case this calls uDevicePrivateGnssAdd(), and the fail cases of that which can end up calling removeDevice() are (a) if GNSS is using I2C and we have no room left for another device on that I2C bus (see pFindDeviceI2c()) or (b) if addDevice() fails to power-on the GNSS device.

In case (b) the error reported to the caller is the errorCode from trying to power the device on, which seems clear enough, and removeDevice() is called with false because there's nothing to power off, the device never powered on.

In case (a) we've managed to successfully power-on the GNSS device and then found there is no room in the list so we call removeDevice() with true to switch it back off again and if powering it off again fails we don't report that error. This is somewhat of a corner case: to hit it you'd have to have more than U_DEVICE_PRIVATE_DEVICE_I2C_MAX_NUM (10) GNSS devices on the same I2C bus.

So I don't think I'm going to worry about (b) and I think the uDeviceClose() is OK with the code as it stands...?

@alonbl
Copy link
Contributor Author

alonbl commented Oct 18, 2023

Hello @RobMeades ,

Per your description, I am afraid that the change we merged 7df8ac9 broke as you do expect device to be available to the user in case of open failure this is required to properly close the device with different power attributes. This is very confusing and should be clearly documented. And there is a bug there in any case as we should return a valid device handle only if it is indeed valid per the power off issue and not in other cases.

I suggest a more consistent design so that open and poweron will be separated, the open is local operation and does not communicate with the device while poweron does. The open should fail only if local operating system cannot allocate the resources. I guess I would have also separated between poweroff and close for the same reason.

Regards,
Alon

@RobMeades
Copy link
Contributor

Not sure I follow: if uDeviceOpen() fails then in all cases except (b) (which is very unlikely to occur) the device is both not powered-up and not occupying any resources, there is no device and therefore no device handle. The user sees the negative error code and tries a different tack...?

@alonbl
Copy link
Contributor Author

alonbl commented Oct 18, 2023

As far as I can follow, if the device open is success and some interaction with it fails and if poweron/off is enabled then the open will fail, the handle will be null and there will be no way to recover.

@RobMeades
Copy link
Contributor

OK, I've taken the time to use @philwareublox as a sounding box and, spelling it out in laborious detail, we believe you are describing this scenario:

  1. Device is opened and running, all is good.
  2. Something goes wrong with the device, the ubxlib functions are returning error, it's gone titsup.
  3. As a recovery mechanism, the application calls uDeviceClose() with the powerOff flag set to true.
  4. With the code as it stands, this will return an error because the device is misbehaving, it cannot be powered off.

If we were to change the code as I indicated a few comments above, to remove the device anyway in this scenario, while that would resolve the problem, as you have pointed out it also hides information; the user no longer knows whether their misbehaving device is powered on or not. Well, they might guess that we cannot possibly know 'cos the goldarned thing is misbehaving, but anyway, let's say we take the return values of the code at face value.

The correct recovery, then, is to call uDeviceClose() again, this time with the powerOff flag set to false. This logic always works because the only reason for an error to be returned by uDeviceClose(), assuming you've not passed in a wonky device handle, is if the device is misbehaving and fails to power off.

I don't want to change the APIs for device open/close, it would be a breaking change to a fundamental API, something we very much try not to do, so instead I will update the description for uDeviceClose() as follows:

/** Close an open device instance, optionally powering it down.
 *
 * IMPORTANT: if you are calling this function because you have
 * a misbehaving device and you are attempting recovery, it is
 * best to call this function with powerOff set to false.  This is
 * because this function will return an error, and NOT CLOSE
 * the device, if the process of powering off the device fails.
 * Alternatively, you may adopt the following logic:
 *
 * ```
 * if (uDeviceClose(devHandle, true) != 0) {
 *     // Device has not responded to power off request, just release resources
 *     uDeviceClose(devHandle, false);
 * }
 * ```
 * 
 * Note: when a device is closed not all memory associated with it
 * is immediately reclaimed; if you wish to reclaim memory before
 * uPortDeinit() you may do so by calling uPortEventQueueCleanUp().
 *
 * @param devHandle handle to a previously opened device.
 * @param powerOff  if true then also power the device off; leave
 *                  this as false to simply logically disconnect
 *                  the device, in which case the device will be
 *                  able to return to a useful state on
 *                  uDeviceOpen() very quickly.  Note that Short
 *                  Range devices do not support powering off;
 *                  setting this parameter to true will result in
 *                  an error.  Also note that if this flag is set to true
 *                  and the device does not respond correctly to
 *                  the request to power off then this function will
 *                  return a negative error code and NOT CLOSE
 *                  the device.
 * @return          zero on success else a negative error code.
 */

I will also update all of the places in the examples and the test code where we call uDeviceClose() with powerOff set to true to adopt the same logic, so that it is clear wherever someone looks for guidance.

@alonbl
Copy link
Contributor Author

alonbl commented Oct 19, 2023

Great, thank you for sorting this out!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants