
Fixing design concerns with Sensor Subsystem (sensing) #58925

Closed
3 tasks
yperess opened this issue Jun 7, 2023 · 29 comments
Labels
area: Sensor Subsystem RFC Request For Comments: want input from the community

Comments

@yperess (Collaborator)

yperess commented Jun 7, 2023

Introduction

Note: This RFC documents the concerns with #55389 and is written in the context of #36963. It was agreed that the PR would be allowed to progress and that these concerns would be addressed in follow-up PRs.

Several changes are needed to the first PR for the sensor subsystem which I believe are required for a successful sensor framework. My definitions of success (since Chromebooks will be the first clients) are:

  1. Able to fit in a small-ish memory footprint (ideally under 15 kB of flash, which is the current Chromium sensor framework's footprint).
  2. Able to pass Android's CTS tests (performance tests).

Problem description

My current concerns surround the design of the system and are summarized here as a quick TL;DR:

  • Sensing contains dual code paths for the same purpose
  • Sensing's devicetree bindings are too restrictive
  • Sensor types are not 1:1 compatible with CHRE

Proposed change

  1. Provide an API to publish data into the subsystem
    When used, and a new producer is registered with the subsystem, a node is provided along with metadata about the producer. This includes information such as the sensor type and capabilities. Each open connection is used for arbitration. The metadata for the connection contains the requested attributes along with the callbacks. When a connection is closed, we can then look at the remaining open connections to the producer and re-arbitrate (the same is done when the configuration changes).
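A minimal C sketch of what this could look like. Every name below (the structs, sensing_open(), sensing_publish(), count_cb) is hypothetical and illustrative only; none of this is an existing Zephyr API:

```c
/* Hypothetical sketch of the proposed producer/connection model.
 * All names here are illustrative, not existing Zephyr APIs. */
#include <stddef.h>
#include <stdint.h>

struct sensing_producer_meta {
	uint32_t sensor_type;  /* e.g. accel, gyro */
	uint32_t max_rate_hz;  /* capability advertised at registration */
};

struct sensing_conn {
	const struct sensing_producer_meta *producer;
	uint32_t requested_rate_hz;           /* kept for re-arbitration */
	void (*on_data)(const void *sample);  /* client callback */
	int open;
};

#define MAX_CONN 4
static struct sensing_conn pool[MAX_CONN];

/* Example producer and a counting callback, for demonstration only. */
static const struct sensing_producer_meta accel_meta = { 1, 200 };
static int samples_seen;
static void count_cb(const void *sample) { (void)sample; samples_seen++; }

/* Open a connection to a producer; returns the slot index or -1. */
static int sensing_open(const struct sensing_producer_meta *prod,
			uint32_t rate_hz, void (*cb)(const void *))
{
	for (int i = 0; i < MAX_CONN; i++) {
		if (!pool[i].open) {
			pool[i] = (struct sensing_conn){prod, rate_hz, cb, 1};
			return i;
		}
	}
	return -1;
}

/* Producers push data in; the subsystem fans it out to open connections. */
static void sensing_publish(const struct sensing_producer_meta *prod,
			    const void *sample)
{
	for (int i = 0; i < MAX_CONN; i++) {
		if (pool[i].open && pool[i].producer == prod) {
			pool[i].on_data(sample);
		}
	}
}
```

A client would call sensing_open(&accel_meta, 100, count_cb), and every sample the producer publishes via sensing_publish() would then reach the callback; closing a connection frees its slot for re-arbitration.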

Class Diagram
(UML class diagram image)

  2. Have nodes in devicetree be metadata about the physical sensors OR be virtual sensors
    These physical-sensor nodes would be much easier to manage and arbitrate if a single node is used and registers itself as a producer of two types (accel/gyro). This means that the rotation matrix and other augmenting properties can be provided in one place, since they affect the data produced by the same sensor.

  3. Sensors should use the existing sensor API. It already provides attribute get/set capabilities as well as getting samples (including streaming soon). There's no reason to add a new API.
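As a sketch of what such a combined node might look like, following the thread's existing devicetree style (the compatible string and every property name below are hypothetical, not an existing binding):

```dts
imu0: imu-metadata {
        /* Hypothetical: one node describes the physical IMU and registers
         * as a producer of both accel and gyro, so the rotation matrix and
         * other augmenting properties live in exactly one place. */
        compatible = "zephyr,sensing-imu-metadata";
        underlying-device = <&bmi160_spi>;
        sensor-types = <SENSOR_TYPE_ACCEL SENSOR_TYPE_GYRO>;
        rotation-matrix = <0 1 0 (-1) 0 0 0 0 1>;
};
```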

Detailed RFC

The following highlights my concerns with the current design.

Dual code paths

The sensing subsystem contains a separate code path for virtual sensors instantiated via devicetree vs. clients of the subsystem, even though both are treated the same if we look at the subsystem as a black box. For example:

If a client wanted to implement an unofficial lid angle, they would:

  1. get the sensing nodes for the lid and base accelerometers
  2. call sensing_sensor_open() to start a connection to those nodes
  3. configure both nodes to the desired data rate and other settings (subsystem will arbitrate)
  4. In the callback provided in the sensing_sensor_open() they would perform the angle calculation

This is an issue for both memory footprint and passing CTS, as it overcomplicates what should be a simple system of arbitration and pub/sub. The current design of the subsystem means that if the same developer wants to convert this algorithm into an official virtual sensor, they must use the sensing-internal sensor API instead of the client API. They need to register reporters in devicetree, and the connections to the reporters will automatically be opened by the subsystem. Next, if a virtual sensor wants to dynamically get more information from another sensor, it has to go through the "normal" client setup of opening a connection, because the previously mentioned flow is only supported for statically registered dependencies. This means that the sensor now needs to support both workflows AND the subsystem needs to include both code paths. To be clear, this means that:

  1. The sensor subsystem needs to arbitrate all STATIC and DYNAMIC connections together (neither gets a priority)
  2. The sensor subsystem needs to deliver data and events to both connection types

I don't see a good reason for Zephyr's sensor subsystem to support 2 separate code paths that do EXACTLY the same thing. Virtual sensors need to be clients of the sensor subsystem just like any other client. The only "special" treatment they should get is that they need to register as data producers with the subsystem and be able to push data into the subsystem and have the subsystem open connections to the virtual sensor.

Devicetree bindings

The current layout of the subsystem is such that every child node of the subsystem is a sensor. This restriction is not necessary and makes it difficult to extend the subsystem. If we ever want to include nodes which are not sensors, it will require a large refactor.

With the above class diagram, I believe a binding can be created that will automatically populate a sensing_sensor_producer struct. Each property should have an explicit need. For example, the current PR introduces minimal-interval, which appears unnecessary and is not really documented. I don't see a need for it, and it should be removed until we have a strong reason for adding it.

Prior versions of the PR included a rotation matrix in the child sensor nodes. I want to make sure this doesn't come back (it was removed in recent patches). The rotation matrix is a property of the physical device; it's error prone to have it in the subsystem's sensor node, because these nodes are per type (one node for accel and one node for gyro, even if they both point to the same physical device). I'm not even convinced that having these separate nodes will work well, since configuring one may mean also configuring the other, so I need to see how arbitration will work (for example, if we need to update the GPIO behavior of an accel/gyro chip, having one node for accel and one node for gyro might mess up the arbitration). I would like to see a design proposal for how this will be done.

This leads to the final obvious point: the virtual sensors don't need to be children of the subsystem if they use an iterable section to create the nodes in C. If we guarantee that there's always at most one instance of the subsystem, this is already handled by the compatible string for those nodes, which will know to create the structs via a call to SENSING_SENSOR_DT_DECLARE() or something similar.

CHRE sensor types

This is a simple matter of where things are used. CHRE is an official module of Zephyr and will be running on the same compute core as the sensor subsystem. HID is only used when sending data off chip. This means that our options are:

  1. Convert the subsystem's HID type constants to CHRE types every time we pass data between the two
  2. Convert the subsystem's CHRE type constants to HID types every time we pass data to the application processor

This seems like a no-brainer to me. I'll also point out that, regarding CTS, there exists a client (I can't say which) that said they'll switch to Zephyr from FreeRTOS for their Android phones when we get this all working with CHRE. This further makes me think that the extra translation is a waste of cycles and/or a lookup table for every single event.
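To make the cost being argued about concrete, option 1 amounts to a per-event table lookup along these lines. This is an illustrative sketch, not subsystem code; the usage IDs and CHRE constants shown should be checked against the HID Usage Tables and the CHRE headers rather than taken from here:

```c
/* Illustrative per-event HID-to-CHRE type translation.
 * Constants are examples; verify against the real specs. */
#include <stddef.h>
#include <stdint.h>

struct type_map {
	uint16_t hid_usage; /* HID sensor-page usage ID */
	uint8_t chre_type;  /* CHRE sensor type constant */
};

static const struct type_map hid_to_chre[] = {
	{ 0x0073, 1 }, /* Accelerometer 3D -> CHRE accelerometer */
	{ 0x0076, 6 }, /* Gyrometer 3D -> CHRE gyroscope */
};

/* The scan (or an equivalent lookup table) is paid on every event
 * if the subsystem keys its data on HID types internally. */
static uint8_t chre_type_from_hid(uint16_t usage)
{
	for (size_t i = 0; i < sizeof(hid_to_chre) / sizeof(hid_to_chre[0]); i++) {
		if (hid_to_chre[i].hid_usage == usage) {
			return hid_to_chre[i].chre_type;
		}
	}
	return 0; /* unknown type */
}
```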

Proposed change (Detailed)

The key to this proposal is leveraging the existing constructs. Virtual sensors are built like normal sensors (using the same APIs). Virtual sensors that consume information from other sensors are treated the same as the application.

Demo 1: webm

In this demo:

  1. List the 2 sensors.
  2. List the current connections (none).
  3. Open a connection and configure it to 100 Hz.
  4. Open another connection to the same device.
  5. Show both connections.
  6. Configure the second connection to 200 Hz and show arbitration (sensor is set to 200 Hz).
  7. Close the second connection and show arbitration (sensor is set back to 100 Hz).
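The arbitration shown in steps 6 and 7 can be sketched as a max over the open connections' requested rates, re-evaluated whenever a connection opens, closes, or is reconfigured. All names below are illustrative, not subsystem code:

```c
/* Minimal sketch of max-rate arbitration across open connections. */
#include <stdint.h>

#define MAX_CONN 4

static struct { uint32_t rate_hz; int open; } conns[MAX_CONN];

/* Open a connection requesting rate_hz; returns the slot index or -1. */
static int conn_open(uint32_t rate_hz)
{
	for (int i = 0; i < MAX_CONN; i++) {
		if (!conns[i].open) {
			conns[i].rate_hz = rate_hz;
			conns[i].open = 1;
			return i;
		}
	}
	return -1;
}

static void conn_close(int idx)
{
	conns[idx].open = 0;
}

/* Arbitrated device rate: max over open connections (0 when idle). */
static uint32_t arbitrate(void)
{
	uint32_t rate = 0;

	for (int i = 0; i < MAX_CONN; i++) {
		if (conns[i].open && conns[i].rate_hz > rate) {
			rate = conns[i].rate_hz;
		}
	}
	return rate;
}
```

Opening a 100 Hz and then a 200 Hz connection drives the device to 200 Hz; closing the second drops it back to 100 Hz, exactly the behavior in the demo.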

Demo 2: webm

In this demo:

  1. List the 4 sensors that are on the subsystem.
  2. Open 2 connections to the same accelerometer and request a single sample from the first connection. Both clients are notified of the data.
  3. Close the second connection.
  4. List the currently open connections.
  5. Open a new connection to the other accelerometer (an emulated one). This shows recycling the connection that was closed in step 3.
  6. Open a connection to the angle sensor.
  7. Request a sample from the angle sensor; notice that the connections for both accelerometers also get the data.

DISCLAIMER: the data for the angle sensor is actually just the dot product. I need to add more features to the DSP library for the angle calculation to fully work.

Proposed structure

Below are diagrams of the structure that's being proposed (used in the demo videos)

Sensor subsystem - Data structures

  • Uses an iterable section for the sensing_sensor_info
  • Uses the existing struct sensor_driver_api for virtual sensors
  • Uses the existing struct sensor_info for the common information about sensors
  • Treats all clients the same (virtual sensors open connections just like the application)
  • Each connection holds a pointer to the sensing_sensor_info struct it's connected to
  • Each connection holds the currently requested configuration (allows for re-arbitration when we disconnect or change a configuration)
  • The connection pool is created via an iterable section that combines statically allocated connections owned by virtual sensors (for example, the angle calculator allocates 2 connections for itself) plus a Kconfig-sized array of dynamic connections available for the app and the sensing_shell.

Sensor subsystem - Processing loop

  • The processing loop is blocking on the RTIO CQ
  • If no batching is needed for the data, the data is broadcast to the open connections
  • If batching is needed, add it to the buffer and check if the batch is complete (full or expired)
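The batching decision in the last bullet can be sketched as a single predicate over the accumulated batch. The struct layout, capacity, and timestamp source are illustrative assumptions, not the actual implementation:

```c
/* Sketch of the "full or expired" batching check in the processing loop. */
#include <stdint.h>

#define BATCH_CAPACITY 8 /* illustrative; would come from the client config */

struct batch {
	uint32_t count;       /* samples accumulated so far */
	uint64_t deadline_ns; /* absolute expiry time for this batch */
};

/* Returns nonzero when the accumulated batch should be flushed
 * to the open connections. */
static int batch_complete(const struct batch *b, uint64_t now_ns)
{
	return b->count >= BATCH_CAPACITY || now_ns >= b->deadline_ns;
}
```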

Alternatives

Do nothing?

@yperess yperess added the RFC Request For Comments: want input from the community label Jun 7, 2023
@henrikbrixandersen (Member)

Thank you for this detailed summary.

@huhebo (Collaborator)

huhebo commented Jun 27, 2023

@yperess, thank you for the detailed comments and suggestions, very good!

Before responding to the concerns in this RFC, and to help others understand the context, I want to summarize and highlight the key concepts of the current Sensing Subsystem design in #55389.

We defined two entities (only these two) in the current design:

  • Application Client
    • Configures and consumes data from Sensors only
    • Needs to explicitly and dynamically call sensing_open_sensor() to open a Sensor (creating a connection and returning a handle of type sensing_sensor_handle_t)
    • Uses the Sensing Subsystem API
  • Sensor
    • Virtual Sensor
      • Configures and consumes data from other Sensor(s), either Physical or Virtual
    • Physical Sensor
      • Collects other input data (a kernel-mode sensor device, or data from another device like WiFi, BLE, etc.)
    • Produces and reports new sensor data to the subsystem
    • Automatically instantiated from devicetree; connections are auto-created between reporter Sensors and client Sensors (a sensing_sensor_handle_t is auto-opened for the client Sensor)
    • Uses some of the Sensing Subsystem API (config, getting information about reporter Sensors, etc.) and implements the Sensing Sensor API

This keeps things simpler and clearer:

  • If a user just wants to use existing Sensor(s), they can focus on the Application Client side
  • If a user wants to override or create new Sensor(s), they can focus on the Sensor side

On the consumer side, Application Clients and Sensors have very similar behavior, for example configuring and receiving data from reporter Sensor(s). Since we want to keep the internal implementation unified, with a single data/config path, we used the unified handle type sensing_sensor_handle_t to represent both connections between Sensors and connections between a Sensor and an Application Client. Therefore, every API that takes a sensing_sensor_handle_t as an input parameter can be used by both Sensors and Application Clients, like sensing_set_config(), sensing_get_config(), and sensing_get_sensor_info().

The major difference is that connections (of type sensing_sensor_handle_t) between Sensors are auto-created by the subsystem from devicetree (STATIC), while an Application Client needs to explicitly call an API to open one (DYNAMIC).

For more details, please see https://docs.zephyrproject.org/latest/services/sensing/index.html#sensor-instance-handler
and https://docs.zephyrproject.org/latest/services/sensing/index.html#major-flows.

Why we have this design will be explained along with the concerns in this RFC:

Dual code paths

The sensing subsystem contains a separate code path for virtual sensors instantiated via devicetree vs clients of the subsystem.

As described above, we defined different entities (Application Client and Sensor) and two sets of APIs (the Sensing Subsystem API and the Sensing Sensor API).

But internally we reuse a single config path. For example, a Virtual Sensor and an Application Client can use the same API, sensing_set_config(), to configure their reporter Sensor. The arbitration is no different for a Virtual Sensor client than for an Application Client.

For the data path, internally most parts are the same; there is only a small difference in the final data-receive callback APIs:

  • Application Client

     typedef void (*sensing_data_event_t)(
     	sensing_sensor_handle_t handle,
     	const void *buf);
  • Sensor

    typedef int (*sensing_sensor_process_t)(
     	const struct device *dev,
     	sensing_sensor_handle_t reporter,
     	void *buf, int size);

The difference is that for a Sensor we need to pass the device instance itself (struct device *) to the callback, to support multiple instances of the same Sensor implementation code.

But for an Application Client, we don't need to identify different Application Client instances in the data event callback, and it would be strange to define an Application Client instance with a struct device *.

So, in this case, we can't use the same data event callback API for both Sensor and Application Client.

This is an issue for both memory footprint and passing CTS, as it overcomplicates what should be a simple system of arbitration and pub/sub.

As described above, we reuse a single config/data path internally, and use devicetree to statically configure Sensors and statically allocate memory for their connections (with metadata). Compared with dynamically created connections between Sensors (which need a pre-allocated memory pool and may waste memory), this uses memory more efficiently, with no waste.

Sensors and Application Clients call the same sensing_set_config() API to configure reporter Sensor(s); the subsystem does arbitration uniformly in sensing_set_config(), with no special processing for Sensor clients or Application Clients. I don't see any overcomplication here.

The current design of the subsystem means that if the same developer wants to convert this algorithm into an official virtual sensor, they must use the sensing-internal sensor API instead of the client API. They need to register reporters in devicetree, and the connections to the reporters will automatically be opened by the subsystem.

The Sensing Sensor API is also a public API; application developers can also use it to override or create new Sensor(s).

devicetree is good for configuring different sensor connections without code changes, which suits different board configurations. I will say more in the Devicetree bindings section.

Next, if a virtual sensor wants to dynamically get more information from another sensor, they will have to go through the "normal" client setup of opening a connection, because the previously mentioned flow is only supported for statically registered dependencies.

No. A Virtual Sensor can just call sensing_get_sensor_info() to get more information about its reporter Sensor(s), like an Application Client. The only difference is that a Virtual Sensor gets its reporters' handles directly in its init callback, without opening them itself (the subsystem opened them for it based on devicetree):

typedef int (*sensing_sensor_init_t)(
		const struct device *dev, const struct sensing_sensor_info *info,
		const sensing_sensor_handle_t *reporter_handles, int reporters_count);

Then almost every API which takes a sensing_sensor_handle_t as input can be called directly, like get/set config, getting more information, getting status, etc.

This means that the sensor now needs to support both workflows AND the subsystem needs to include both code paths. To be clear, this means that:

  1. The sensor subsystem needs to arbitrate all STATIC and DYNAMIC connections together (neither gets a priority)
  2. The sensor subsystem needs to deliver data and events to both connection types

To be honest, although we use a unified sensing_sensor_handle_t to represent the two connection kinds, internally they are indeed not exactly the same: they have different metadata (different data event callbacks, etc.). As explained above, for the config path with the sensing_set_config() API, there is no special internal treatment for arbitration. The arbitration only takes into account how many connections (with configurations) exist on the Sensor instance at that time; it does not care whether a connection is from a Sensor client or an Application Client, nor whether it was statically or dynamically created (it is not aware of this, and has no need to be).

Only in the data push path, at the last stage, does calling a client's data-receive callback need to check the connection type. As explained above, because the same Sensor code needs to support multiple instances, a Sensor client's data-receive callback (sensing_sensor_process_t) needs to be passed the instance device (struct device *), so we can't use the same data callback for both Sensor clients and Application Clients. The special treatment is just calling a different callback (stored in the connection's metadata). This difference is very small, and it also does not depend on whether the connection was statically or dynamically created.

I don't agree that this means there are two separate code paths.

I don't see a good reason for Zephyr's sensor subsystem to support 2 separate code paths that do EXACTLY the same thing. Virtual sensors need to be clients of the sensor subsystem just like any other client. The only "special" treatment they should get is that they need to register as data producers with the subsystem and be able to push data into the subsystem and have the subsystem open connections to the virtual sensor.

The idea is to make virtual sensors like any other client (an application, etc.), and let virtual sensors open reporter sensors by themselves, like an application client, as in your example:

If a client wanted to implement an unofficial lid angle, they would:

  1. get the sensing nodes for the lid and base accelerometers
  2. call sensing_sensor_open() to start a connection to those nodes
  3. configure both nodes to the desired data rate and other settings (subsystem will arbitrate)
  4. In the callback provided in the sensing_sensor_open() they would perform the angle calculation

It sounds good, but after more thought, I see the following problems:

  • A. Memory issue
    Since every connection would be dynamically created, we would need to either pre-allocate a memory pool (possibly wasting memory) or use malloc (not recommended, and strictly controlled in Zephyr).

  • B. Dependency order issue (critical)
    Each sensor needs an init stage to open and configure its reporter sensors, right? A sensor can register its init callback when it registers with the subsystem.
    The problem is that the whole sensing topology with virtual sensors is a DAG (Directed Acyclic Graph); two sensors are not allowed to depend on each other (forming a cycle in the graph). For example, if X -> Y -> Z -> X, then when X configures its reporter Z, Z will configure Y, Y will configure X, then X configures Z again, entering an endless loop.
    In fact, to make everything work correctly, we always need to make sure reporter sensors init before their clients; we need a sorting process, such as topological sorting, to produce the correct init order.
    If sensors all open and configure their reporters dynamically, I don't know how the subsystem can make sure to init them in the right dependency order.

  • C. Board config issue
    It loses some board configurability: the sensor code needs to explicitly open specific sensor nodes (effectively hard-coded), so if we want to change the reporter nodes (same sensor type but a different instance) of a client sensor on a board, we must change code rather than just configure it in the board's devicetree.

But with current Sensing Subsystem design:

  • Issue A is greatly improved: all connections between Sensors are statically created from devicetree, with no need to pre-allocate a memory pool or malloc, and no memory waste. Only the connections between Sensors and Application Clients need to be created dynamically and have this issue (it's hard to know how many application clients there will be and which sensors each will open).

  • Issue B is easily resolved: all sensors, and the connections between them, are statically created from devicetree, and thanks to devicetree phandles (devicetree automatically orders nodes by phandle dependencies), we can easily establish the correct init order for all sensors and check whether there is a dependency cycle. And since Application Clients are pure consumers and nothing depends on them, they are always in the top layer of the whole dependency topology, so this doesn't matter for them.

This is why we need to distinguish Sensor clients from Application Clients; it is not just to make things easy for users to understand and use.

  • For issue C, since we define each sensor instance and its reporter nodes in devicetree, there is no explicit node-opening code in the sensor, so we can easily configure it in the board devicetree. This gives the most configurability.

Devicetree bindings

The current layout of the subsystem is such that every child node of the subsystem is a sensor. This restriction is not necessary and makes it difficult to extend the subsystem. If we ever want to include nodes which are not sensors, it will require a large refactor.

It's true that everything inside the subsystem is a sensor. A sensor is a powerful abstraction, and we can build many amazing systems on top of it. But we do have Physical Sensors to collect from and wrap the outside world, and in devicetree we have the underlying-device property:

lid_accel: lid-accel {
        compatible = "zephyr,sensing-phy-3d-sensor";
        ...
        underlying-device = <&bmi160_spi>;
};

We can use it to point to any device node, for example a camera device, or a WiFi or BLE device, to create a new wrapper Physical Sensor type.

For example, to make a vision sensor with a camera device, just create the new vision Physical Sensor type and define the sensor nodes in devicetree:

vision {
        compatible = "zephyr,sensing-vision-sensor";
        ...
        underlying-device = <&camera_0>;
};

Then, in the vision Physical Sensor code, we can get the camera device via the underlying-device node, get image data from the camera device, do some vision processing, produce vision sensor data, and report it to the Sensing Subsystem to join other sensors in fusion.

All of this is by design. It's very flexible; I don't see the need for a large refactor.

With the above class diagram, I believe a binding can be created that will automatically populate a sensing_sensor_producer struct. Each property should have an explicit need. For example, the current PR introduces minimal-interval, which appears unnecessary and is not really documented. I don't see a need for it, and it should be removed until we have a strong reason for adding it.

Prior versions of the PR included a rotation matrix in the child sensor nodes. I want to make sure this doesn't come back (it was removed in recent patches). The rotation matrix is a property of the physical device; it's error prone to have it in the subsystem's sensor node, because these nodes are per type (one node for accel and one node for gyro, even if they both point to the same physical device). I'm not even convinced that having these separate nodes will work well, since configuring one may mean also configuring the other, so I need to see how arbitration will work (for example, if we need to update the GPIO behavior of an accel/gyro chip, having one node for accel and one node for gyro might mess up the arbitration). I would like to see a design proposal for how this will be done.

minimal-interval and the rotation matrix are minor details that don't impact the overall design; we can discuss them more.

I agree with you: we should avoid unnecessary things. For the rotation matrix, I agree it's a property of the physical device, so I think it's better placed in the kernel-mode sensor device node. The Sensing Subsystem is optional for some low-end IoT devices which just want simple physical sensors with no fusion; in that case they also need the rotation matrix to calibrate the axes, so putting the rotation matrix in the kernel-mode sensor device node makes more sense. And, as in your proposal, we can define metadata nodes with a rotation matrix for such combo sensor devices (accel and gyro in the same chip).

This leads to the final obvious point: the virtual sensors don't need to be children of the subsystem if they use an iterable section to create the nodes in C. If we guarantee that there's always at most one instance of the subsystem, this is already handled by the compatible string for those nodes, which will know to create the structs via a call to SENSING_SENSOR_DT_DECLARE() or something similar.

As explained in the Dual code paths section, defining each sensor, including virtual sensors, and their reporter phandles is key to resolving issues A, B, and C.
Another good thing is that with all sensors in devicetree, we get an editable snapshot of all sensors and their reporting relationships on the board, which is very clear and helpful for debugging.

CHRE sensor types

This is a simple matter of where things are used. CHRE is an official module of Zephyr and will be running on the same compute core as the sensor subsystem. HID is only used when sending data off chip. This means that our options are:

  1. Convert subsystem constants HID types to CHRE every time we pass data between the two
  2. Convert subsystem constants CHRE types to HID every time we pass data to the application processor

This seems like a no brainer to me. I'll also point out that as per CTS, there exists a client (I can't say which) that said they'll switch to Zephyr from FreeRTOS for their Android phones when we get this all working with CHRE. This further makes me think that the extra translation is a waste of cycles and/or a lookup table for every single event.

The goal of the Sensing Subsystem is to suit a variety of sensing usages with Zephyr, so it needs to stay neutral and decoupled from any specific protocol. And I agree the first product goal is Chromebooks.

I think we need to standardize sensor types. HID is the only existing industry standard with most sensor types defined (plus custom types for extension), and it is widely used in Windows and Linux as far as I know; it's better than the CHRE sensor types, at least in terms of standardization. That's why I prefer HID. Please note, we just use HID's sensor type definitions (how many data fields a type has, etc.); no HID protocol or data format is involved.

I'm happy if we can and want to define Zephyr's own standard sensor types to cover CHRE, Chrome, Linux, and Windows instead of HID; at least I don't think the CHRE sensor types can cover Windows's sensor types.

And in fact, as you can see, the current sensor data format is more like the data format defined in CHRE, because we think it is a good design (we always study good things from different designs). So we actually need more data conversion to HID than to CHRE. :)

/* Sensing Subsystem */
struct sensing_sensor_value_3d_q31 {
	struct sensing_sensor_value_header header;
	int8_t shift;
	struct {
		uint32_t timestamp_delta;
		union {
			q31_t v[3];
			struct {
				q31_t x;
				q31_t y;
				q31_t z;
			};
		};
	} readings[1];
};
/* CHRE */
struct chreSensorThreeAxisData {
    /**
     * @see chreSensorDataHeader
     */
    struct chreSensorDataHeader header;
    struct chreSensorThreeAxisSampleData {
        /**
         * @see chreSensorDataHeader
         */
        uint32_t timestampDelta;
        union {
            float values[3];
            float v[3];
            struct {
                float x;
                float y;
                float z;
            };
            float bias[3];
            struct {
                float x_bias;
                float y_bias;
                float z_bias;
            };
        };
    } readings[1];
};

Since the Sensing Subsystem switched to zDSP with Q-format but CHRE still uses float, a conversion from Q-format to float is needed.
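That conversion is cheap, though: with the shared shift in sensing_sensor_value_3d_q31, each component maps to float as (q / 2^31) * 2^shift. A small sketch, with a plain int32_t typedef standing in for the zDSP q31_t:

```c
/* Q31-with-shared-shift to float conversion sketch. */
#include <math.h>
#include <stdint.h>

typedef int32_t q31_t; /* stand-in for the zDSP q31_t */

/* value = (q / 2^31) * 2^shift: Q31 mantissa plus a shared exponent. */
static float q31_shift_to_float(q31_t q, int8_t shift)
{
	return (float)ldexp((double)q / 2147483648.0, shift);
}
```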

For the CTS tests, we're working on them with the Sensing Subsystem on Chrome OS. We don't see big problems at the moment.

Proposed change

  1. Provide an API to publish data into the subsystem

We already have this API (sensing_sensor_post_data()) in the Sensing Sensor API.

This API is only for Sensors; an Application Client can't use it. As explained in the Dual code paths section, we need to keep the Application Client a pure consumer.

  2. Have nodes in devicetree be metadata about the physical sensors OR be virtual sensors
    These physical-sensor nodes would be much easier to manage and arbitrate if a single node is used and registers itself as a producer of two types (accel/gyro). This means that the rotation matrix and other augmenting properties can be provided in one place, since they affect the data produced by the same sensor.

That's a good idea; we can think about it and discuss more.

  3. Sensors should use the existing sensor API. It already provides attribute get/set capabilities as well as getting samples (including streaming soon). There's no reason to add a new API.

I'm afraid we can't use the existing sensor API, because it is based on discrete sensor data channels, while the Sensing Subsystem is designed around sensor types (which include multi-axis data).

@huhebo (Collaborator)

huhebo commented Jun 29, 2023

And if the purpose of having Virtual Sensors dynamically open reporters is to support special usages that want to change a Virtual Sensor's behavior and inputs at runtime, we need to know the concrete use cases and think about and evaluate them carefully. Supporting a dynamic DAG computing model (changing nodes and edges) can lead to many problems, such as the memory and dependency-order issues described above.

@KeHIntel (Collaborator)

Hi Yuval,

Thank you very much for the detailed suggestions.

I would like to share my thoughts on the first issue you raised.

  • Sensing contains dual code paths for the same purpose

I think we need to first align on the design concept of the sensing subsystem.
In our opinion, and as we agreed before, the sensing subsystem is:

  • A standalone Zephyr subsystem that manages all types of sensors
  • And provides a unified high-level sensing API for Zephyr application clients

In this way, we will inevitably have two sets of APIs serving different purposes:

  • Sensor object APIs for sensor driver or algorithm developers to create sensor objects
    • APIs are defined in include/zephyr/sensing/sensing_sensor.h
    • All sensor objects are statically registered with the sensing subsystem via devicetree
    • The sensing subsystem manages all sensor objects and knows their relationships (sensor topology)
    • The sensing subsystem handles the data flow, config flow, and scheduling of all sensor objects
    • The sensing subsystem handles functions common to all sensor objects, like arbitration, batching, etc.
  • Sensor client APIs for application developers to enumerate, open, close, and configure sensors
    • APIs are defined in include/zephyr/sensing/sensing.h
    • Unlike sensor objects, sensor client applications are not required to be configured via devicetree
    • The sensing subsystem does not manage sensor client applications (they are outside the sensing subsystem)
    • The sensing subsystem does not manage relationships among multiple applications
    • The sensing subsystem does not handle data/config flow among multiple applications
    • The sensing subsystem does not handle the scheduling of any client applications

The figure below illustrates the concept of sensor objects (ellipse boxes) and sensor applications (rectangle boxes); I hope it helps clarify the design.

[image: diagram of sensor objects (ellipses) and sensor applications (rectangles), with numbered data paths 1–4]

Now back to the issue. I think the dual path refers to “path 1” and “path 2” in the figure above:

  • phy_sensor_0 has two paths: path 1 is to application 1, path 2 is to virt_sensor_0
  • from the view of phy_sensor_0, it is indeed a dual path with the same purpose: to serve its clients
  • however, from the view of the sensing subsystem, they are totally different: path 1 is external while path 2 is internal
  • if we converge paths 1 and 2 into one, then virt_sensor_0 will either be moved out to the application level or become something new (something between a sensor object and an application)
  • in that case, the sensing subsystem API no longer handles paths 3 or 4 (the paths from sensor to client), and virt_sensor_0 needs to expose its own APIs for other applications/clients to use

So in summary, for the dual-path issue, by design:

  • we want to treat virtual sensor objects the same as physical sensor objects in the sensing subsystem – I believe this is also what Google suggested before
  • however, by nature a virtual sensor is a client of other sensors, while a physical sensor is a raw data supplier – this is the difference we need to handle
  • so we need to treat a virtual sensor as both a sensor object (inside the sensing subsystem) and a sensor client (of another sensor) – this is different from a sensor application, which is just a sensor client
  • hence paths 1 and 2 are handled in slightly different ways in the figure above

Best Regards,
Ke Han

@huhebo
Copy link
Collaborator

huhebo commented Jun 30, 2023

Thanks @KeHIntel for more explanation and thoughts about the Why of current design.

And for easy tracking, here is a list of the problems with the proposal for the dual-code-paths concern in this RFC (described in #58925 (comment)):

  • A. Memory issue

Since every connection is dynamically created, we need to either pre-allocate a memory pool (which may waste memory) or use malloc (not recommended and strictly controlled in Zephyr).

  • B. Dependency order issue (critical)

Each sensor needs an init stage to open and configure its reporter sensors, right? A sensor can register its init callback when registering with the subsystem.
The problem is that the whole sensing topology with virtual sensors is a DAG (Directed Acyclic Graph); two sensors are not allowed to depend on each other (forming a cycle in the graph). For example, if X -> Y -> Z -> X, then when X configures its reporter Z, Z will configure Y, Y will configure X, then X will configure Z again ... entering a dead loop.

In fact, to make everything work correctly, we always need to make sure reporter sensors are initialized before their clients, which requires a sorting process like topological sorting to produce the correct init order.

If sensors all open and configure their reporters dynamically, I don't see how the subsystem can make sure they are initialized in the right dependency order.

  • C. Board config issue

It loses some board configurability, since the sensor code needs to explicitly open specific sensor nodes (like a hard binding). So if you want to change the reporter nodes (same sensor type but a different instance) of a client sensor on a board, you need to change code rather than just configure it in the board's devicetree.

@yperess
Copy link
Collaborator Author

yperess commented Jul 5, 2023

@KeHIntel thank you for the details. I think you hit the nail on the head with the image, which I'm going to re-use in order to keep the conversation focused.

[image: sensor topology diagram re-used from the previous comment]

In this diagram I would argue that not only can we converge paths 1 & 2, but we should do exactly that. Here, virt_sensor_0 is a client of phy_sensor_0. It can send configuration requests to phy_sensor_0 just like an application client, and those requests will be arbitrated. What you're saying is that paths 3 & 4 will become impossible, but from what I can tell it's just a feedback loop. Effectively, each node inside the sensor subsystem is a sensor (in my design, using the struct sensor_driver_api that's provided in Zephyr). The only difference is that these sensors also use a sensing API which allows them to post data into the subsystem. The data can be thought of as a pair (sensor_info, data). Once that data is posted to the subsystem, the subsystem can check:

Is anyone listening to the sensor_info node that just posted data?

  • If no, we might be able to discard the data.
  • If yes, check for batching configs and possibly cache it or broadcast it to all the open connections that are listening. In this case that means that when phy_sensor_0 publishes data to the subsystem, that data will be sent to client 1, client 2, and virt_sensor_0. Once that's done, the memory is recycled.

Notice that the above doesn't need to know about connections 3 & 4. When virt_sensor_0 publishes data associated with its sensor_info node, the subsystem repeats the process: who are the clients for this data? Notify them. In this case, that means forwarding the data to client 1 and virt_sensor_1.

As a separate example, let's assume the existing APIs and configurations. What happens when/if virt_sensor_0 opens a dynamic connection to phy_sensor_2? Now we have one virtual sensor that uses 2 static connections and 1 dynamic connection, and it has to pipe data through two very different code paths.

I don't see why this is needed. I'm not saying it can't work or that it doesn't. I am saying that from what I can see, this is more complicated than it needs to be and should be very strongly justified, ideally with a solid example or unit test that would fail without this design.

@KeHIntel
Copy link
Collaborator

KeHIntel commented Jul 6, 2023

@yperess, now I understand your idea better, and here is my comment:

Effectively, each node inside the sensor subsystem is a sensor (in my design using the struct sensor_driver_api that's provided in Zephyr

Comments: do you mean that you want to implement each virtual sensor as a Zephyr sensor kernel device driver? Or is this just referring to physical sensors? If you are talking about virtual sensors as Zephyr sensor kernel device drivers, I really doubt the effectiveness: given that the Zephyr sensing subsystem can run in user space, why bother moving virtual sensors back into the kernel?

The only difference is that these sensors also use a sensing API which allows them to post data into the subsystem. The data can be thought of as a pair (sensor_info, data).

Comments: this looks feasible at a high level, but could result in a lot of issues if we dive into the details:

  1. Memory management – does the sensing subsystem need to dynamically allocate memory for (sensor_info, data) for each virtual sensor (given that the sensing subsystem cannot tell whether it is a virtual sensor or a user application)? – In our proposed design, there is no such problem, as memory is statically allocated. @huhebo also mentioned this in the comment above.
  2. Multi-client data post – if two sensor clients run similar algorithms and both post the same (sensor_info, data) to the sensing subsystem for virtual_sensor_0, what will happen? There could be conflicts, like jitter in virtual_sensor_0's output. – In our proposed design, there is no such problem: each virtual sensor is statically configured in devicetree, so only one client can post data to a given sensor.
  3. Virtual sensor config – you mentioned the policy on data post: "If no, we might be able to discard the data". Well, this would be a very inefficient design. If no one wants the data, the sensor should not run at all; otherwise it is a waste of power. – In our proposed design, there is also no such problem: if no client has opened a given sensor, that sensor will not be scheduled to run by the sensing subsystem.
  4. Deadlock loop – you mentioned that “the above doesn't need to know about connections 3&4”. Well, as @huhebo mentioned in the comment above, this risks a deadlock loop among sensors and applications. For example, app1 streams data from sensor1 (rotation matrix) and posts data to sensor2 (quaternion), and app2 streams data from sensor2 (quaternion) to sensor1 (rotation matrix). As we can see, a dead loop is created. – In our proposed design, since we enforce a check on dependency order, there is no such problem.
  5. Sensor enumeration – I believe the application developer's habit is to enumerate all sensors only once after app start, then use them afterward. If app1 starts earlier than app2 (which posts data to virtual_sensor_0), then during sensor enumeration app1 won't be able to see virtual_sensor_0 or any of its clients (like virtual_sensor_1). This will make app development very complicated, with app initialization depending on other apps. – In our proposed design, there is no such problem.
  6. Virtual sensor arbitration – the sensing subsystem is expected to arbitrate configurations for virtual sensors the same as for physical sensors. If the sensor driver or sensor app just posts data to the sensing subsystem, I don't understand how virtual sensor arbitration can be done. If the sensor driver or sensor app implements all the virtual sensor callbacks, I don't understand how it differs from the current virtual sensor design.
  7. Multiple instances – in the current sensing subsystem design, as @huhebo mentioned above, we can support the same virtual sensor object duplicated into multiple instances (the same virtual sensor code running with different inputs and outputs). I don't understand how this can be done with dynamically created virtual sensors.

I hope the comments above help clarify our design philosophy. We understand your preference for more flexibility in virtual sensor implementations and configurations; however, as you can see, a lot of new problems would come up.

We believe our current proposal is already a good balance among configuration flexibility, ease of sensor development, simplicity of sensor management, decoupling of sensor applications, and system efficiency. So our recommendation is still to keep the high-level design unchanged.

@yperess
Copy link
Collaborator Author

yperess commented Jul 6, 2023

@yperess, now I understand your idea better, and here is my comment:

Effectively, each node inside the sensor subsystem is a sensor (in my design using the struct sensor_driver_api that's provided in Zephyr

Comments: do you mean that you want to implement each virtual sensor as a Zephyr sensor kernel device driver? Or this is just referring to physical sensor? If you are talking about virtual sensors as Zephyr sensor kernel device driver, I really doubt the effectiveness, given Zephyr sensing subsystem can run into user space, why bother move virtual sensors back to kernel?

I'm saying you can use the API without having to make them kernel level objects. The key here is that implementing a virtual sensor should feel like any other sensor driver.

The only difference is that these sensors also use a sensing API which allows them to post data into the subsystem. The data can be thought of as a pair (sensor_info, data).

Comments: this looks like feasible in high level, but could result in a lot of issues if we deep dive into details:

  1. Memory management – does sensing subsystem need to dynamically allocate memory for (sensor_info, data) for each virtual sensor (given sensing subsystem cannot tell it is a virtual sensor or user application)? – In our proposed design, there is no such problem, as memory is statically allocated. @huhebo also mentioned this in above comment.

In your current design, do you allocate a buffer for every single connection? Generally speaking a pool is much more efficient since some sensors may never need to buffer the data (buffering really is only ever needed when you configure a latency/batching). If you look at the updated APIs for sensors they use the RTIO subsystem + a memory pool which gives a lot of flexibility. It was designed explicitly to work for the sensor subsystem.

  2. Multi-client data post – if two sensor clients run similar algorithms and both post the same (sensor_info, data) to the sensing subsystem for virtual_sensor_0, what will happen? There could be conflicts, like jitter in virtual_sensor_0's output. – In our proposed design, there is no such problem: each virtual sensor is statically configured in devicetree, so only one client can post data to a given sensor.

Not quite, if you go back to the RTIO model, the sensor subsystem controls the consuming thread for the CQE. When data comes back for the sensor it forwards the information and thus maintains the current callback's context as the current context. As such, the API doesn't actually need to include the sensor_info since it's the current context and will be already known by the subsystem.

  3. Virtual sensor config – you mentioned the policy on data post: "If no, we might be able to discard the data". Well, this would be a very inefficient design. If no one wants the data, the sensor should not run at all; otherwise it is a waste of power. – In our proposed design, there is also no such problem: if no client has opened a given sensor, that sensor will not be scheduled to run by the sensing subsystem.

You cannot guarantee this. A sensor could start an async bus operation and then the clients close the connection. By the time the sensor has produced data, it might not have any clients. You will always need this check.

  4. Deadlock loop – you mentioned that “the above doesn't need to know about connections 3&4”. Well, as @huhebo mentioned in the comment above, this risks a deadlock loop among sensors and applications. For example, app1 streams data from sensor1 (rotation matrix) and posts data to sensor2 (quaternion), and app2 streams data from sensor2 (quaternion) to sensor1 (rotation matrix). As we can see, a dead loop is created. – In our proposed design, since we enforce a check on dependency order, there is no such problem.

I think we need to allow a circular dependency.

  1. It's on the developer to make sure the application behaves as they want.
  2. As long as the virtual sensors don't produce data 1:1, there's no harm in a circular dependency. For example, let's say we have:
    • A virtual fusion sensor that produces 6dof (position and orientation) using an accel/gyro
    • A stillness virtual sensor (detects when the device is still) that listens to the 6dof virtual sensor
    • The 6dof sensor wants to know when the stillness detector fires in order to reset the origin (reducing the cumulative error)
      In the above, the 6dof sensor depends on the accel, gyro, and stillness sensors, and the stillness sensor depends on the 6dof sensor. But since the stillness sensor only produces, let's say, 1 sample per 100 6dof samples, this is OK and works well.
  5. Sensor enumeration – I believe the application developer's habit is to enumerate all sensors only once after app start, then use them afterward. If app1 starts earlier than app2 (which posts data to virtual_sensor_0), then during sensor enumeration app1 won't be able to see virtual_sensor_0 or any of its clients (like virtual_sensor_1). This will make app development very complicated, with app initialization depending on other apps. – In our proposed design, there is no such problem.

Not quite, in my WIP PR #56963 you can see that you can create a struct sensing_sensor_info anywhere. It doesn't even have to come from devicetree. Since it uses a linker section, the subsystem will find it and integrate with it. This means that the sensor info nodes are always statically allocated and can be discovered by the app as soon as you boot.

  6. Virtual sensor arbitration – the sensing subsystem is expected to arbitrate configurations for virtual sensors the same as for physical sensors. If the sensor driver or sensor app just posts data to the sensing subsystem, I don't understand how virtual sensor arbitration can be done. If the sensor driver or sensor app implements all the virtual sensor callbacks, I don't understand how it differs from the current virtual sensor design.

I'm not sure I follow; arbitration really has more to do with configuration than with distributing data. Data distribution is more of a pub/sub problem, which only has one kink: the batching configuration. If batching is enabled we might want to buffer the data for longer. Posting the data is effectively like saying "the current sensor_info node produced data X", and the sensor subsystem can iterate through the connections associated with the current sensor_info node and call the various callbacks. Generally speaking, I would say the buffering should follow the smallest configured batching latency.

The remaining arbitration is done regardless of data generation. When a connection requests a new config, the subsystem iterates through all the current connections for the same sensor_info node and finds the new "common" config, then passes that new config to the sensor (using the existing sensor_driver_api).

  7. Multiple instances – in the current sensing subsystem design, as @huhebo mentioned above, we can support the same virtual sensor object duplicated into multiple instances (the same virtual sensor code running with different inputs and outputs). I don't understand how this can be done with dynamically created virtual sensors.

I'm sorry if there was any confusion; see my comment on bullet 5. The sensing_sensor_info objects are created statically and are put together into an iterable section. You can easily create multiple instances just like existing kernel-level drivers, but in userspace.

Hope above comments can help clarifying our design philosophies better. We understand your preference on more flexibility in virtual sensor implementations and configurations, however, as you can see, there will be a lot of new problems coming up.

We believe our current proposal is already a good balance among configuration flexibility, sensor development easiness, sensor management simplification, sensor application decoupling, and system running efficiency. So our recommendation is still to keep it in high level unchanged.

@KeHIntel
Copy link
Collaborator

KeHIntel commented Jul 6, 2023

@yperess, now I understand your idea better, and here is my comment:

Effectively, each node inside the sensor subsystem is a sensor (in my design using the struct sensor_driver_api that's provided in Zephyr

Comments: do you mean that you want to implement each virtual sensor as a Zephyr sensor kernel device driver? Or this is just referring to physical sensor? If you are talking about virtual sensors as Zephyr sensor kernel device driver, I really doubt the effectiveness, given Zephyr sensing subsystem can run into user space, why bother move virtual sensors back to kernel?

I'm saying you can use the API without having to make them kernel level objects. The key here is that implementing a virtual sensor should feel like any other sensor driver.

By definition in Zephyr, device drivers are kernel drivers. If we want to implement a virtual sensor like other Zephyr sensor device drivers but keep it in user space, I am not sure how this can be done. If we need to introduce something new in Zephyr for this, will it become another "2nd path"?

The only difference is that these sensors also use a sensing API which allows them to post data into the subsystem. The data can be thought of as a pair (sensor_info, data).

Comments: this looks like feasible in high level, but could result in a lot of issues if we deep dive into details:

  1. Memory management – does sensing subsystem need to dynamically allocate memory for (sensor_info, data) for each virtual sensor (given sensing subsystem cannot tell it is a virtual sensor or user application)? – In our proposed design, there is no such problem, as memory is statically allocated. @huhebo also mentioned this in above comment.

In your current design, do you allocate a buffer for every single connection? Generally speaking a pool is much more efficient since some sensors may never need to buffer the data (buffering really is only ever needed when you configure a latency/batching). If you look at the updated APIs for sensors they use the RTIO subsystem + a memory pool which gives a lot of flexibility. It was designed explicitly to work for the sensor subsystem.

We do not allocate a buffer for every single connection; we allocate a buffer for every sensor object. The buffer is just one data struct in size and always holds the last data reported by that sensor. This buffer is a must-have for each sensor in the sensing subsystem. As for a memory pool, I think it does not work. For example, a new client opens a hall sensor and wants to read the last data, generated an hour ago when the lid was opened. In a pool-based solution, that data will already have been washed away.

  2. Multi-client data post – if two sensor clients run similar algorithms and both post the same (sensor_info, data) to the sensing subsystem for virtual_sensor_0, what will happen? There could be conflicts, like jitter in virtual_sensor_0's output. – In our proposed design, there is no such problem: each virtual sensor is statically configured in devicetree, so only one client can post data to a given sensor.

Not quite, if you go back to the RTIO model, the sensor subsystem controls the consuming thread for the CQE. When data comes back for the sensor it forwards the information and thus maintains the current callback's context as the current context. As such, the API doesn't actually need to include the sensor_info since it's the current context and will be already known by the subsystem.

The multi-client post problem is about two clients posting the same data to one sensor object. For example:

  • client1 reads A, G, M with an EKF algorithm for quaternion and posts data to the quaternion sensor at 100 Hz
  • client2 reads A, G, M with a UKF algorithm for quaternion and posts data to the quaternion sensor at 100 Hz

In that case, the quaternion sensor will output data at 200 Hz, with one sample from the EKF algorithm and the next from the UKF algorithm.

  3. Virtual sensor config – you mentioned the policy on data post: "If no, we might be able to discard the data". Well, this would be a very inefficient design. If no one wants the data, the sensor should not run at all; otherwise it is a waste of power. – In our proposed design, there is also no such problem: if no client has opened a given sensor, that sensor will not be scheduled to run by the sensing subsystem.

You cannot guarantee this. A sensor could start an async bus operation and then the clients close the connection. By the time the sensor has produced data, it might not have any clients. You will always need this check.

In the sensing subsystem, all sensors (physical and virtual) are scheduled by the sensing subsystem. If a virtual sensor has no client, the sensing subsystem knows this at runtime, and

  • will not call its 'sensing_sensor_process_t' callback; in this way, that virtual sensor will not run
  • will also check whether this virtual sensor is the only client of its underlying sensors; if so, it will do the same and stop those sensors running too
  4. Deadlock loop – you mentioned that “the above doesn't need to know about connections 3&4”. Well, as @huhebo mentioned in the comment above, this risks a deadlock loop among sensors and applications. For example, app1 streams data from sensor1 (rotation matrix) and posts data to sensor2 (quaternion), and app2 streams data from sensor2 (quaternion) to sensor1 (rotation matrix). As we can see, a dead loop is created. – In our proposed design, since we enforce a check on dependency order, there is no such problem.

I think we need to allow a circular dependency.

  1. It's on the developer to make sure the application behaves as they want.

If we can handle it at the framework level, why push it onto developers? After introducing the Zephyr sensing subsystem, one of our top priorities is to grow the ecosystem, and in order to grow it as quickly as we can, we have to make developers' lives as easy as possible.

  2. As long as the virtual sensors don't produce data 1:1, there's no harm in a circular dependency. For example, let's say we have:

    • A virtual fusion sensor that produces 6dof (position and orientation) using an accel/gyro
    • A stillness virtual sensor (detects when the device is still) that listens to the 6dof virtual sensor
    • The 6dof sensor wants to know when the stillness detector fires in order to reset the origin (reducing the cumulative error)
      In the above, the 6dof sensor depends on the accel, gyro, and stillness sensors, and the stillness sensor depends on the 6dof sensor. But since the stillness sensor only produces, let's say, 1 sample per 100 6dof samples, this is OK and works well.

If the stillness sensor only detects stillness, your loop works. But in a real case, the stillness sensor also needs to detect motion and tell the 6dof sensor to wake up and work; in that case, you have a dead loop: the stillness sensor wants to detect motion, but it is based on the 6dof sensor, which has already reset its output to the origin because the stillness sensor told it the device is still.

The common approach would be: a motion-detect sensor based on accel, and 6dof based on accel, gyro, and motion detect. No loop, no deadlock, and lower power (if you only need motion detect, there is no need to run 6dof).

  5. Sensor enumeration – I believe the application developer's habit is to enumerate all sensors only once after app start, then use them afterward. If app1 starts earlier than app2 (which posts data to virtual_sensor_0), then during sensor enumeration app1 won't be able to see virtual_sensor_0 or any of its clients (like virtual_sensor_1). This will make app development very complicated, with app initialization depending on other apps. – In our proposed design, there is no such problem.

Not quite, in my WIP PR #56963 you can see that you can create a struct sensing_sensor_info anywhere. It doesn't even have to come from devicetree. Since it uses a linker section, the subsystem will find it and integrate with it. This means that the sensor info nodes are always statically allocated and can be discovered by the app as soon as you boot.

I understand there are ways other than devicetree to configure apps. My point is the unnecessary complexity. From a developer's point of view, isn't it better to have a single place (devicetree) to configure all sensors for a given platform, instead of doing it in multiple places with multiple methods?

  6. Virtual sensor arbitration – the sensing subsystem is expected to arbitrate configurations for virtual sensors the same as for physical sensors. If the sensor driver or sensor app just posts data to the sensing subsystem, I don't understand how virtual sensor arbitration can be done. If the sensor driver or sensor app implements all the virtual sensor callbacks, I don't understand how it differs from the current virtual sensor design.

I'm not sure I follow, arbitration really has more to do with configuration than with distributing data. The data distribution is more of a pub/sub problem which only has one kink and that's the batching configuration. If batching is enabled we might want to buffer the data for longer. Since posting the data is effectively like saying "the current sensor_info node produced data X". The sensor subsystem can iterate through the connections associated with the current sensor_info node and call the various callbacks. Generally speaking I would say that the smallest configured batching latency.

The remaining arbitration is done regardless of data generation. When a connection requests a new config, the subsystem iterates through all the current connections for the same sensor_info node and finds the new "common" config. Then passes that new config to the sensor (using the existing sensor_driver_api)

Arbitration mainly refers to configuration. My point is that in order to get the arbitrated configuration from the sensing subsystem, an app has to implement callbacks from sensing_sensor.h. That app then becomes a special app (one that uses both 'sensor.h' and 'sensing_sensor.h') as opposed to regular sensor apps (which only use 'sensor.h'). In that case, isn't it still a dual path for sensor clients?

  7. Multiple instances – in the current sensing subsystem design, as @huhebo mentioned above, we can support the same virtual sensor object duplicated into multiple instances (the same virtual sensor code running with different inputs and outputs). I don't understand how this can be done with dynamically created virtual sensors.

I'm sorry if there was any confusion; see my comment on bullet 5. The sensing_sensor_info objects are created statically and are put together into an iterable section. You can easily create multiple instances just like existing kernel-level drivers, but in userspace.

Let's discuss it in bullet 5.

Hope above comments can help clarifying our design philosophies better. We understand your preference on more flexibility in virtual sensor implementations and configurations, however, as you can see, there will be a lot of new problems coming up.
We believe our current proposal is already a good balance among configuration flexibility, sensor development easiness, sensor management simplification, sensor application decoupling, and system running efficiency. So our recommendation is still to keep it in high level unchanged.

@yperess
Copy link
Collaborator Author

yperess commented Jul 6, 2023

@KeHIntel I trimmed the content so it's a bit easier to follow

By definition in Zephyr, device drivers are kernel drivers. If we want to implement a virtual sensor like other Zephyr sensor device driver, but stay in user space, I am not sure how this can be done. If we need to introduce something new in Zephyr for this, will it become another "2nd path"?

This works in my branch; what you need to do is create a partition for the associated data so it can be accessed from userspace.

We do not allocate buffer for every single connection, but we allocate buffer for every sensor object. The buffer is just one data struct size, and always holds the last data reported by that sensor. This buffer is must have for each sensor in sensing subsystem. For memory pool, I think it does not work. For example, a new client opens a HALL sensor and want to read the last data that generated 1 hour ago when lid opened. In a pool based solution, the data will be already washed away.

Can you elaborate on this? When you say every sensor object, I assume you mean each type (accel/gyro are separate, right?). So the buffer only holds one sample? What happens when we configure the sensor in batching mode? If we set the ODR to 500 Hz and the latency to 200 ms, we would ideally need a buffer for 100 samples.

The multi-client post problem is about two clients posting the same kind of data to one sensor object. For example:

  • client1 reads A, G, M with an EKF algorithm for quaternion, posts data to the quaternion sensor at 100Hz
  • client2 reads A, G, M with a UKF algorithm for quaternion, posts data to the quaternion sensor at 100Hz

In such a case, the quaternion sensor will output data at 200Hz, with one sample from the EKF algorithm and the next from the UKF algorithm.

If I understood correctly, in this model the client is the virtual sensor. In your example above you have 2 different algorithms (EKF and UKF). These would be 2 separate sensing_sensor_info nodes, and each would be a client of A/G/M and produce data. The application would then become a client of these virtual sensors, each producing 100Hz of data. What you get is:

A / G / M -> EKF -> application client
A / G / M -> UKF -> application client

In the sensing subsystem, all sensors (physical and virtual) are scheduled by the sensing subsystem. If a virtual sensor has no client, the sensing subsystem knows it at runtime, and

  • will not call its 'sensing_sensor_process_t' callback; in this way, that virtual sensor will not run
  • will also check whether this virtual sensor is the only client of its underlying sensors; if so, it will do the same and stop those sensors from running too

I don't think you need this extra scheduler. For 99% of cases you can just ignore it. But for the remaining 1%, you will not know, at the time the data was requested, that the connection is going away. For example:

  1. Client connects
  2. Client requests data
  3. Sensing subsystem sends the request to the underlying physical sensor which starts a bus transaction (i2c, spi, etc)
  4. Zephyr performs a context switch
  5. Client disconnects
  6. Bus transaction completes but now has no open connections (drop the data).

If we can handle it at the framework level, why bother pushing it to developers? After introducing the Zephyr sensing subsystem, one top priority for us is to grow the ecosystem, and in order to grow it as quickly as we can, we have to make developers' lives as easy as possible.

Because you shouldn't handle it in the subsystem. It's not fair to assume that loops are invalid, because your clients can always work around you by opening a dynamic connection and using it to generate data. I think it's fine to put it on the developers not to create 1:1 loops of data-producing virtual sensors, as that feels like an extreme edge case.

If the stillness sensor only detects stillness, your loop works, but in the real case the stillness sensor also needs to detect motion and tell the 6dof sensor to wake up and work. In that case you have a deadlock: the stillness sensor wants to detect motion but depends on the 6dof sensor, which has already set its output to the origin because the stillness sensor told it the device is still.

The common approach would be: a motion detect sensor based on accel, and 6dof based on accel, gyro, and motion detect. No loop, no deadlock, lower power (if only motion detection is needed, there is no need to run 6dof).

Right, this was just an oversimplified example to show a loop that is valid (not necessarily optimized for anything). I'm not arguing that it's not possible to do this without a loop, but that there's no reason to prevent these loops since you can't predict what developers will want to do.

I understand there are ways other than devicetree to configure apps. My point is the unnecessary complexity. From the developer's point of view, isn't it better to have a single place (devicetree) to configure all sensors for a given platform, instead of doing it in multiple places with multiple methods?

Yes, but if you look at everything in Zephyr, devicetree is just a tool, not a requirement. You could create a sensor driver using the DEVICE_DEFINE() macro without any devicetree. I'm not suggesting that's a nice way to do things, but we shouldn't assume that things will always be in devicetree. The current implementation actually assumes that every child node of the sensing subsystem is a sensor node. From what I can tell, this means we're requiring developers to always use devicetree, and that every node under the sensing subsystem parent node must be a sensor. It's limiting for no real reason.

Arbitration is mainly referring to configuration. My point is that in order to get the arbitrated configuration from the sensing subsystem in an app, the app has to implement callbacks from sensing_sensor.h. Then this app becomes a special app (one that uses both 'sensor.h' and 'sensing_sensor.h') rather than a regular sensor app (one that only uses 'sensor.h'). In that case, isn't it still a dual path for sensor clients?

I'm sorry, you lost me there with the use of the word app. My point is that if you open a connection, you'll get arbitrated. So if you have N open connections to the same sensing_sensor_info node, they'll get arbitrated every time any of them changes or disconnects. So on any change you'd do something like:

// Assume we got 'handle' as an argument identifying the sensing_sensor_info node

// Create an empty config to represent the arbitrated final config
struct config union_config = {0};

// Loop through all open connections and arbitrate the ones that
// point at the same sensing_sensor_info node
for (size_t i = 0; i < num_connections; i++) {
  if (connections[i].info != handle) {
    continue;
  }
  arbitrate(&union_config, &connections[i].config);
}

@teburd
Copy link
Collaborator

teburd commented Jul 6, 2023

@yperess it was stated all along the subsystem would provide not just arbitration and pub/sub like functionality but scheduling order over the nodes as well.

Referencing the above diagram, in the current design the data from physical sensor 1 is given to virtual sensor 0 before virtual sensor 1 by design. With arbitrary pub/sub this is lost.

  1. The consistent ordering makes diagnosing and debugging easier.
  2. State can be understood to always be up to date, e.g. deciding which filter in a filter bank to run and forward along. E.g. mux like functionality is possible in the DAG because ordering is known.

Let's assume topological ordering and consider the following sort of diagram

In this scenario the bank selector runs before the banks themselves are given the sensor data, as guaranteed by the scheduling order as I understand the subsystem today.

This means that filters in the filter bank that are not selected can skip processing entirely saving time. Data does not need to be kept in filter banks not currently used saving memory.

With generalized pub/sub no guarantee is provided that the bank selector will be updated before the sensor data is given to the filters.

Suppose the bank selector updates at 1Hz while the sensor updates at 100Hz. This still works. No timestamp comparisons are needed or anything like that, no quirky logic involved. Each bank looks at its last selector update as its sole guidance on whether or not to do anything on each sensor update.

Now let's think about doing this same sort of thing with arbitrary pub/sub where the ordering is not guaranteed.

To me at least, it looks like the logic involved would be more complex to deal with this sort of setup without topological ordering.

@yperess
Copy link
Collaborator Author

yperess commented Jul 6, 2023

@yperess it was stated all along the subsystem would provide not just arbitration and pub/sub like functionality but scheduling order over the nodes as well.

Referencing the above diagram, in the current design the data from physical sensor 1 is given to virtual sensor 0 before virtual sensor 1 by design. With arbitrary pub/sub this is lost.

  1. The consistent ordering makes diagnosing and debugging easier.
  2. State can be understood to always be up to date, e.g. deciding which filter in a filter bank to run and forward along. E.g. mux like functionality is possible in the DAG because ordering is known.

I think you may have hit on an interesting point here and we might need to take a step back to define arbitration. In CHRE and from my perspective arbitration means that you set some threshold but not an exact requirement. For example:

  • client 1 requests 100Hz
  • client 2 requests 200Hz
  • both clients will get 200Hz

In that sense the bank selector and filter banks will get the same sample rates from the sensor. If we don't arbitrate this way, we risk over-complicating the system and also making a lot of assumptions on behalf of the consumer. In the above, do we report every other sample to client 1? What if they configured a latency of 500ms? Do we buffer 100 samples then copy every other one to the final data? Or maybe just the tail 50? The general idea from other sensor frameworks is to over-deliver: client 1 will get 200Hz of data and can choose what to use.

Let's assume topological ordering and consider the following sort of diagram

In this scenario the bank selector will be done before the banks themselves are given the sensor data as guaranteed by the scheduling order as I understand the subsystem today.

This means that filters in the filter bank that are not selected can skip processing entirely saving time. Data does not need to be kept in filter banks not currently used saving memory.

This can still be done with a pub/sub. When the bank selector produces a selection event, the bank filters that aren't selected can reduce their sample rate to 0, which means they will no longer get data.

With generalized pub/sub no guarantee is provided that the bank selector will be updated before the sensor data is given to the filters.

Suppose the bank selector updates at 1Hz while the sensor updates at 100Hz. This still works. No timestamp comparisons are needed or anything like that, no quirky logic involved. Each bank looks at its last selector update as its sole guidance on whether or not to do anything on each sensor update.

Now lets think about doing this same sort of thing with arbitrary pub sub where the ordering is not guaranteed.

To me at least, it looks like the logic involved would be more complex to deal with this sort of setup without topological ordering.

You're right, ordering is not guaranteed, but it does give you more flexibility and reduces the complexity. Going back to the DAG you provided: if bank 0 needs samples at 100Hz while bank 1 needs 50Hz and the selector 1Hz, what does the scheduling look like? It's much simpler to say everyone gets 100Hz, and each consumer can filter and take whatever samples it wants from that. This is even more true when batching is involved.

Let's assume that we agree that everyone above gets 100Hz; the selector gets the sample first and only outputs data 1/100th of the time. When the selector produces data, I'm assuming we'll publish that data before continuing to propagate the sensor data (so it's depth-first, where the newest data is sent before the existing event finishes propagating). Such a depth-first pub is more expensive since memory has to be kept for longer. I would argue that if the events come out of order in the simple pub/sub model, where the filter banks get the sensor data before the selector switches, then you have 1/100 samples that are bad. If this is time critical, you could easily work around it by having the data come through the selector, which would re-publish the selection and the data. I know this is getting into the weeds a bit, but I believe that keeping this simple and not adding a scheduler to the subsystem will make it much more maintainable as well as accessible with lower requirements.

@teburd
Copy link
Collaborator

teburd commented Jul 6, 2023

You're right, ordering is not guaranteed, but it does give you more flexibility and reduces the complexity. Going back to the DAG you provided: if bank 0 needs samples at 100Hz while bank 1 needs 50Hz and the selector 1Hz, what does the scheduling look like? It's much simpler to say everyone gets 100Hz, and each consumer can filter and take whatever samples it wants from that. This is even more true when batching is involved.

Let's assume that we agree that everyone above gets 100Hz; the selector gets the sample first and only outputs data 1/100th of the time. When the selector produces data, I'm assuming we'll publish that data before continuing to propagate the sensor data (so it's depth-first, where the newest data is sent before the existing event finishes propagating). Such a depth-first pub is more expensive since memory has to be kept for longer. I would argue that if the events come out of order in the simple pub/sub model, where the filter banks get the sensor data before the selector switches, then you have 1/100 samples that are bad. If this is time critical, you could easily work around it by having the data come through the selector, which would re-publish the selection and the data. I know this is getting into the weeds a bit, but I believe that keeping this simple and not adding a scheduler to the subsystem will make it much more maintainable as well as accessible with lower requirements.

I think you might have overdone it a bit here on the point I was trying to hit home on.

The bank selector's data rate can be independent of the sensor's. We cannot assume the bank selector will provide an update every time there is a sensor update. Without the topological sort ordering, we don't know which will come first, or whether we need to wait on one or the other to make a decision.

So we cannot simply use timestamps to infer what the latest bank-select state is without added logic. With topological sort and ordering, we don't have to worry about that at all: the bank select will always be updated, if needed, before the next sample is given to the filter bank nodes, and no additional logic or work is required to get this behavior. Without the topological sort there will absolutely need to be additional logic to deal with this, potentially wasting resources: not just hardware resources, but also human resources in debugging and diagnosing ordering problems.

@KeHIntel
Copy link
Collaborator

KeHIntel commented Jul 7, 2023

@yperess the discussion has gotten pretty long, so let me trim and summarize a little bit.

  0. Device driver in user space

This works in my branch, what you need to do is create a partition for the associated data so it can be accessed from userspace.

I do not doubt that you can port sensor_driver_api to user space. My point is that once it is moved to user space, it becomes something new rather than a standard Zephyr device driver. If it is neither a standard Zephyr app nor a standard Zephyr device driver, I doubt the value of introducing it to Zephyr.

  1. Memory management

Can you elaborate on this? When you say every sensor object, I assume you mean each type (accel/gyro are separate, right?). So the buffer only holds 1 sample? What happens when we configure the sensor to batching mode? If we set the ODR to 500Hz and set the latency to 200ms, we would need a buffer for 100 samples (ideally).

Each sensor instance, not each sensor type. And yes, it holds the last sample of the sensor, to handle the "HALL sensor data read 1 hour later" case I mentioned above. When virtual sensors can be dynamically created, I don't think we can use a pool to solve this problem. Batching is another topic; for batching, we will need a dedicated FIFO memory pool, and that part is expected to be similar to RTIO.

  2. Multi-client data post

If I understood correctly, in the model the client is the virtual sensor. In your above example you have 2 different algorithms (EKF and UKF). These would be 2 separate sensing_sensor_info nodes and each would be a client of A/G/M and produce data. The application would then become a client of these virtual sensors, each producing 100Hz of data. What you get is:

Your understanding is correct, and this can be resolved if the EKF and UKF algorithms post data to 2 separate sensing_sensor_info nodes. However, it can also result in unexpected bugs if the EKF and UKF algorithms post data to the same sensing_sensor_info node. My point is that this kind of risk is well mitigated by the current sensing subsystem, while the newly proposed solution requires sensor developers to carefully avoid it.

  3. Virtual sensor config

I don't think you need this extra scheduler. For 99% of cases you can just ignore it. But for the 1% you will not know at the time the data was requested. For example:

No extra scheduler is needed; the sensing subsystem is single-threaded. It just scans the sensors according to the dependency order from the sensor tree and selectively calls the sensors' callbacks to manage their execution. In this way, sensors with no clients do not run, unnecessary processing is avoided, and power is better optimized.

  4. Dead lock loop

Right, this was just an oversimplified example to show a loop that is valid (not necessarily optimized for anything). I'm not arguing that it's not possible to do this without a loop, but that there's no reason to prevent these loops since you can't predict what developers will want to do.

The stillness sensor example actually proved my point that allowing loops in sensor dependencies can easily lead to deadlock. If we upgrade the stillness sensor to a motion detect sensor, a deadlock happens.

  5. Sensor enumeration

Yes, but if you look at everything in Zephyr, devicetree is just a tool and is not a requirement. You could create a sensor driver using the DEVICE_DEFINE() macro without any devicetree. I'm not suggesting it's a nice way to do things, but we shouldn't assume that things will always be in devicetree. The current implementation actually assumes that every child node of the sensing subsystem is a sensor node. From what I can tell, this means that we're requiring the developers to always use devicetree. Also, it means that every node under the sensing subsystem parent node is required to be a sensor. It's limiting for no real reason.

I am confused: do you mean that you actually prefer dual methods for sensor configuration, while you hate having a dual path for sensor data streaming?

  6. Virtual sensor arbitration

On arbitration, I agree with what Tom commented above:

Without the topological sort there will absolutely need to be additional logic to deal with this potentially wasting resources. Not just hardware resources but also human resources in debugging and diagnosing problems with ordering.

Summary

  • By nature, a virtual sensor and an application are different (a virtual sensor is a data producer, an application is not), so whether it is a virtual sensor inside the sensing subsystem (current design) or a sensor device driver in user space (new design), it is not something that exists in Zephyr today; a dual path cannot be avoided.
  • The current design of the sensing subsystem can avoid a lot of potential issues (arbitration, scheduling, deadlock, etc.) at the framework level, which makes developers' lives much easier, saves them time, and lets them focus on sensor development.
  • In order to get the ecosystem quickly involved, the sensing subsystem has to be "grab-and-go", and devicetree fits this requirement perfectly. Adding another mechanism that needs additional code changes, allowing runtime links, and co-existing with devicetree makes things unnecessarily complicated and is not friendly to the ecosystem.

@huhebo
Copy link
Collaborator

huhebo commented Jul 7, 2023

I think @KeHIntel gives a very good and clear summary in #58925 (comment)

And about 0. Device driver in user space and why we did not use the existing Zephyr sensor_driver_api,
I can give a bit more explanation. To make the core design ideas and the position of each layer and API easier to understand, please see the diagram below:

[diagram: sensing_subsystem_api]

The key points of the design are:

  • Sensing Subsystem
    • Core (blue box in the diagram)
      • Includes the Sensing Subsystem API (A in the diagram), the Sensing Sensor API (B in the diagram), and the core runtime. We should keep it minimal and stable, and not allow application and sensor developers to change it.
    • Sensors
      • Extendable: application and sensor developers can follow the Sensing Sensor API to create new sensors or override existing ones.
      • Everything in the Sensing Subsystem world is a sensor; physical sensors bridge to the outside world.
    • Sensor-type based, with a 1:1 mapped value data structure (we have always kept aligned on this, right?)
      • Inside the Sensing Subsystem world, no data format conversions (except for some special algorithm cases) are needed between sensors and their clients; everything is based on the sensor type and its value data structure definition.
      • Only physical sensors need to do conversions, such as sensor type mapping and data format conversion.

This design keeps a very clear boundary and interfaces between the different worlds (the Sensing Subsystem world, the Device Driver world, and other subsystem worlds), and keeps a minimal and stable subsystem core with APIs plus extendable sensors. Everyone can focus on the worlds and parts they are interested in, which is very helpful for building sensing ecosystems and co-working with the other worlds.

Just as @KeHIntel mentioned above, sensor_driver_api (D in the diagram) is defined and designed for device drivers. Although we call them all sensors, whether in the Sensing Subsystem or as device drivers in kernel space, and they do have similar properties, they also have essential differences, and mixing them will cause confusing and critical problems. Using sensor_driver_api for Sensing Subsystem sensors (D replacing B in the diagram) is technically possible, but I don't see any valuable benefits, and it will cause many problems:

  • Sensor channels vs. sensor types
    sensor_driver_api has no sensor type concept; everything is based on discrete sensor data channels. But the Sensing Subsystem is sensor-type based, and we need to expose sensor types to the application client, so who will do the mapping between sensor types and sensor channels? In this case only the Sensing Subsystem core can do it, so if an application developer wants to create a new sensor type, they need to change Sensing Subsystem core code to add the new mapping, which easily introduces bugs. Then we can't keep the Sensing Subsystem core minimal and stable as described above.

  • Different sensor data value formats
    sensor_driver_api uses struct sensor_value per channel (see the sensor_channel_get_t API), but the Sensing Subsystem is sensor-type based with 1:1 mapped value data structure definitions like struct sensing_sensor_value_3d_q31. As we agreed before, the Sensing Subsystem uses Q-format to leverage zDSP, so a virtual sensor would need to convert struct sensor_value -> Q-format after receiving data from its reporting sensor and, after processing, convert back to struct sensor_value to return to the subsystem. That is very strange and not efficient. And since the subsystem would need to do sensor-type-to-sensor-channel mapping and data value conversion when publishing data to an application client, consider a sensor publishing data to an application client and another sensor client at the same time: this is a real dual path (publishing to the application is sensor-type based, but publishing to a sensor is sensor-channel based, and they have different data formats).

  • Very confusing

    As @KeHIntel mentioned, sensor_driver_api is really designed for device drivers, and many of its APIs are not suitable for the Sensing Subsystem. For example, get_decoder and submit are really async APIs based on RTIO, which would break the current Sensing Subsystem's scheduling design. We can let a physical sensor wrap the underlying sensor device driver with async APIs, but we don't want all sensors in the Sensing Subsystem itself to work as async sensors.

    __subsystem struct sensor_driver_api {
     sensor_attr_set_t attr_set;
     sensor_attr_get_t attr_get;
     sensor_trigger_set_t trigger_set;
     sensor_sample_fetch_t sample_fetch;
     sensor_channel_get_t channel_get;
     sensor_get_decoder_t get_decoder;
     sensor_submit_t submit;
    };

    You may say these async APIs can be ignored and not implemented, but that is very strange and confusing for developers; they won't understand when they need to implement them and when they don't.

    sensor_driver_api uses trigger APIs for interrupt events and calls the user's/client's trigger callback (set with trigger_set) to notify them; the user/client then needs to call sample_fetch and channel_get to get the sensor data. This may work for physical sensors, but it is strange for virtual sensors.

    Reusing sensor_driver_api would make the boundary between the Sensing Subsystem and sensor device drivers very blurry: sensors in the Sensing Subsystem and in the device driver layer would look very much the same but run in totally different spaces and contexts, and many special treatments would be needed for the Sensing Subsystem as described above, which would confuse developers and cause bugs. And given the different requirements and usages, how would we maintain this single API?

And for the arbitration:

I think you may have hit on an interesting point here and we might need to take a step back to define arbitration. In CHRE and from my perspective arbitration means that you set some threshold but not an exact requirement. For example:

  • client 1 requests 100Hz
  • client 2 requests 200Hz
  • both clients will get 200Hz

I know this definition is used in some applications (Windows applications, etc.), but I prefer the definition below, especially for embedded systems:

  • client 1 requests 100Hz
  • client 2 requests 200Hz
  • sensor works at 200Hz
  • client 1 gets 100Hz
  • client 2 gets 200Hz

The Sensing Subsystem can publish data according to each client's actual rate requirement: for example, let the sensor work at the highest frequency arbitrated across all active clients, then publish data according to each client's actual rate requirement, down-sampling (dropping data, etc.) for the slower clients.

Since this is common logic, it doesn't make sense for each sensor to do it for itself. And just as @teburd mentioned above, letting each sensor do this would waste processing resources and be inefficient. It's also hard for sensors to do by themselves, since they don't know the actual data report rate delivered to them. Compare sample timestamps? That doesn't seem good.

And I also agree with @teburd:

So we cannot simply use timestamps to infer what the latest is from the bank select without added logic. With topological sort and ordering we don't have to worry about that at all. The bank select will always be updated if needed before the next sample to the filter bank nodes is given and no additional logic or work is required to get this behavior. Without the topological sort there will absolutely need to be additional logic to deal with this potentially wasting resources. Not just hardware resources but also human resources in debugging and diagnosing problems with ordering.

At last, I strongly agree with @KeHIntel in #58925 (comment):

Summary

  • By nature, a virtual sensor and an application are different (a virtual sensor is a data producer, an application is not), so whether it is a virtual sensor inside the sensing subsystem (current design) or a sensor device driver in user space (new design), it is not something that exists in Zephyr today; a dual path cannot be avoided.
  • The current design of the sensing subsystem can avoid a lot of potential issues (arbitration, scheduling, deadlock, etc.) at the framework level, which makes developers' lives much easier, saves them time, and lets them focus on sensor development.
  • In order to get the ecosystem quickly involved, the sensing subsystem has to be "grab-and-go", and devicetree fits this requirement perfectly. Adding another mechanism that needs additional code changes, allowing runtime links, and co-existing with devicetree makes things unnecessarily complicated and is not friendly to the ecosystem.

@yperess
Copy link
Collaborator Author

yperess commented Jul 11, 2023

CC @XenuIsWatching

@yperess
Copy link
Collaborator Author

yperess commented Jul 17, 2023

@teburd sorry for the late reply on this.

Context:
At a Google/Intel sync, Tom asked how the current sensor API would work with the subsystem.

Answer
I'll be the first to admit that the decoder API needs a bit of work (which is fine, since the current one is the first iteration and is still experimental). But the overall idea is:

  1. A request comes into the subsystem to read sensor type ACCEL
  2. The pipe driver (a driver implementation in userspace that wraps a real driver) will hold on to this sqe and fire off a read for the physical sensor using its own rtio context.
  3. When the physical sensor is read, the pipe driver will be notified and
    a. request a data block to hold the subsystem's struct sensing_sensor_three_axis_data
    b. consume the physical driver's data + decoder and write the data into the struct allocated in (a)
    c. publish the results to the original sqe
  4. Currently, the subsystem will consume the cqe produced in (3.c), which will contain the userdata as the struct sensing_sensor_info pointer, and find all connections that need to be notified.

Number (4) above is where it gets a bit fuzzy, because the driver implementation is currently expected to have the sensor data in the right format. My first iteration of the low-level sensor drivers actually mirrored what we wanted for the subsystem, which I'm thinking might still be a good idea and is worth exploring. Basically, the decoder wouldn't decode one q31_t node at a time, but would instead decode one sensor type at a time. So, for example, to decode an accelerometer reading you would pass in an allocated struct sensor_three_axis_data and a frame count. I don't want to get too deep into this because it's more implementation detail than the concern of this issue (which is really just "should we be reusing existing structures and APIs as much as possible?"). This would look something like (please don't hold me to this, it's just a working idea currently):

struct sensor_data_header {
  uint64_t timestamp_ns;
  uint16_t num_readings;
};
struct sensor_three_axis_data {
  struct sensor_data_header header;
  int8_t shift;
  struct {
    union {
      q31_t values[3];
      struct {
        q31_t x;
        q31_t y;
        q31_t z;
      };
    };
  } readings[1];
};

struct sensor_decoder_api {
  int (*get_reading_count)(
      const void *data,
      enum sensor_channel channel,
      uint16_t *num_readings
  );
  /* Cast 'out' to the right struct based on the channel being decoded.
   * For example, if the channel is SENSOR_CHAN_ACCEL_XYZ then 'out' should
   * be 'struct sensor_three_axis_data'.
   *
   * This function will read the 'out' value's header to see how much space is allocated,
   * then it'll start at the frame offset 'fit' and start decoding all the data of the appropriate
   * type. If the data is FIFO data interleaved ACCEL/GYRO, the channel is ACCEL_XYZ, and
   * the 'num_readings' is 3, the decoder is expected to decode the next 3 accelerometer
   * readings from 'data' and move 'fit' accordingly. This does mean that a separate frame
   * iterator is needed for gyro decoding in this model.
   */
  int (*decode)(
      const void *data,
      sensor_decoder_fit *fit, 
      enum sensor_channel channel,
      void *out
  );
};

@lixuzha
Collaborator

lixuzha commented Jul 18, 2023

Hi @yperess

I have created a PR #60331 to elaborate how the current sensor API would work with the subsystem, based on the sensing sensor API.
Conceptually, it is no different from the pipe driver.

  1. The pipe driver (a driver implementation in userspace that wraps a real driver) will hold on to this sqe and fire off a read for the physical sensor using its own rtio context.

@yperess
Collaborator Author

yperess commented Jul 18, 2023

Hi @yperess

I have created a PR #60331 to elaborate how the current sensor API would work with the subsystem, based on the sensing sensor API.
Conceptually, it is no different from the pipe driver.

  1. The pipe driver (a driver implementation in userspace that wraps a real driver) will hold on to this sqe and fire off a read for the physical sensor using its own rtio context.

See my comments in the PR, the way you set up the pipe is unlikely to work, especially when streaming data from a hardware FIFO.

@teburd
Collaborator

teburd commented Jul 19, 2023

@yperess I think it would be really beneficial to see something definitive in the form of an RFC PR. I believe you have quite a lot done on an ongoing branch, but nothing to actually try, tinker with, and evaluate in a way that would be helpful for the ongoing discussion.

Secondly, I still strongly feel that the topological ordering provided by the sensing subsystem as it stands provides a lot of value. I believe I've stated my sample scenario clearly enough. To reiterate my stance: relying on timestamp comparisons and interpreting sample rates and last-seen events seems like a complex way to deal with such ordering issues, in my opinion.

Lastly, I'm not quite sure what is meant by sensor types not being 1:1 CHRE-aligned, or what that has to do with a Zephyr sensing subsystem. Can you please elaborate on what exactly is meant by this, and why it's a blocking issue for a Zephyr sensing subsystem?

@yperess
Collaborator Author

yperess commented Jul 20, 2023

@teburd I have #56963 right now which holds all of my WIP for sensors. It runs and you can play around with it on the TDK board.

For the topological order, I had a chat with @keith-zephyr about this, and while I don't personally see strong value in it, I'm 100% open to that feature being behind a Kconfig. That way, when a new connection is opened or closed, we would sort the connection list (or do something else to alter the order of data forwarding). I don't think my proposal precludes it. The same is true for circular dependencies: the check can trigger a runtime failure, gated behind a Kconfig.
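As a sketch of that idea, the opt-in could be a Kconfig symbol like the following (the option name and help text are invented here for illustration, not an existing symbol):

```Kconfig
config SENSING_RUNTIME_TOPO_SORT
	bool "Keep sensing connections topologically sorted"
	help
	  When enabled, the sensing subsystem re-sorts its connection list
	  whenever a connection is opened or closed, so that virtual sensors
	  are notified after the sensors they consume, and a runtime check
	  rejects circular dependencies.
```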

@lixuzha
Collaborator

lixuzha commented Jul 25, 2023

Hello, @yperess , I have answered your comments on #60331.

I am afraid the statement below is inaccurate: #60331 works with the existing BMI160 and LSM6DSO accel/gyro sensor device drivers with no changes today. The remaining one, icm42688, will need a simple upgrade to implement soft data_rdy bits in the driver.

the way you set up the pipe is unlikely to work, especially when streaming data from a hardware FIFO.

On FIFO: given that no existing Zephyr sensor device driver supports FIFO today, the PR does not cover the FIFO case for now.

@yperess
Collaborator Author

yperess commented Jul 25, 2023

Hello, @yperess , I have answered your comments on #60331.

I am afraid the statement below is inaccurate: #60331 works with the existing BMI160 and LSM6DSO accel/gyro sensor device drivers with no changes today. The remaining one, icm42688, will need a simple upgrade to implement soft data_rdy bits in the driver.

the way you set up the pipe is unlikely to work, especially when streaming data from a hardware FIFO.

On FIFO: given that no existing Zephyr sensor device driver supports FIFO today, the PR does not cover the FIFO case for now.

Can you explain? If you have 3 separate devices for accel/gyro/die_temp, they each get a read request, which means they'll each call fetch. You mentioned the bmi160: the fetch implementation waits for data ready at https://github.com/zephyrproject-rtos/zephyr/blob/main/drivers/sensor/bmi160/bmi160.c#L731 so the 3 requests can only happen in succession.

While the FIFO APIs aren't merged yet, they're very close. Regardless, once the accel device in the current subsystem reads the FIFO, the gyro instance will starve. It's much easier to use a single device instance mapped to 3 info structs like I have in my branch.

@lixuzha
Collaborator

lixuzha commented Jul 25, 2023

Can you explain? If you have 3 separate devices for accel/gyro/die_temp, they each get a read request, which means they'll each call fetch. You mentioned the bmi160: the fetch implementation waits for data ready at https://github.com/zephyrproject-rtos/zephyr/blob/main/drivers/sensor/bmi160/bmi160.c#L731 so the 3 requests can only happen in succession.

Sorry, you are right. I checked the BMI160 code again. bmi160_sample_fetch in the current BMI160 driver does not support separate A/G channels, per L728: __ASSERT_NO_MSG(chan == SENSOR_CHAN_ALL);. But it can be enhanced by:

--- a/drivers/sensor/bmi160/bmi160.c
+++ b/drivers/sensor/bmi160/bmi160.c
@@ -736,8 +736,25 @@ static int bmi160_sample_fetch(const struct device *dev,
                }
        }

-       if (bmi160_read(dev, BMI160_SAMPLE_BURST_READ_ADDR, data->sample.raw,
-                       BMI160_BUF_SIZE) < 0) {
+       uint8_t reg, size;
+       switch (chan) {
+       case SENSOR_CHAN_ACCEL_XYZ:
+               reg = BMI160_REG_DATA_ACC_X;
+               size = BMI160_AXES * sizeof(uint16_t);
+               break;
+       case SENSOR_CHAN_GYRO_XYZ:
+               reg = BMI160_REG_DATA_GYR_X;
+               size = BMI160_AXES * sizeof(uint16_t);
+               break;
+       case SENSOR_CHAN_ALL:
+               reg = BMI160_REG_DATA_GYR_X;
+               size = 2 * BMI160_AXES * sizeof(uint16_t);
+               break;
+       default:
+               ret = -EINVAL;
+               goto out;
+       }
+
+       if (bmi160_read(dev, reg, data->sample.raw, size) < 0) {
                ret = -EIO;
                goto out;
        }

About the STATUS register, there are 3 solutions for different cases.

  • Solution 1: For some sensors, like the BMI160, BMI160_REG_STATUS is not cleared by reading the register.
    In the BMI160 datasheet https://www.mouser.com/datasheet/2/783/BST-BMI160-DS000-1509569.pdf
    (page 51, Section 2.11.4 and page 53, Section 2.11.6),
    the drdy_* bit gets reset when one byte of the data register for sensor * is read.
    So we can do it like the above code.

  • Solution 2: use soft data-ready bits for sensors whose REG_STATUS is cleared by reading the register.
    We can do it like this:

struct xxx_data {
    uint8_t status; // soft data ready bits
};

static int xxx_sample_fetch(const struct device *dev,
			       enum sensor_channel chan)
{
       struct xxx_data *data = dev->data;
       uint8_t status, mask;
       int ret = 0;

       switch (chan) {
       case SENSOR_CHAN_ACCEL_XYZ:
               mask = ACC_DRDY_MASK;
               break;
       case SENSOR_CHAN_GYRO_XYZ:
               mask = GYRO_DRDY_MASK;
               break;
       case SENSOR_CHAN_ALL:
               mask = GYRO_DRDY_MASK | ACC_DRDY_MASK;
               break;
       default:
               ret = -EINVAL;
               goto out;
       }
       // note: `!=` binds tighter than `&`, so the mask test needs parentheses
       if ((data->status & mask) != mask) {
              i2c_read(dev, REG_STATUS, &status);
              data->status |= status;
       }
       if ((data->status & mask) != mask) {
              ret = -EIO;
              goto out;
       }
       // fetch acc/gyro data by chan
       ...
       data->status &= ~mask; // clear drdy bits
out:
       return ret;
}

While the FIFO APIs aren't merged yet, they're very close. Regardless, once the accel device in the current subsystem reads the FIFO, the gyro instance will starve. It's much easier to use a single device instance mapped to 3 info structs like I have in my branch.

I'm still learning these patches. After that, I'll check how to enhance #60331 to support FIFO.

@yperess
Collaborator Author

yperess commented Sep 2, 2023

An interesting statistic on the discussion of HID vs CHRE constants: Android (which uses CHRE) has by far the biggest market share.

https://gs.statcounter.com/os-market-share#monthly-202111-202303

@KeHIntel
Collaborator

KeHIntel commented Sep 7, 2023

An interesting statistic on the discussion of HID vs CHRE constants: Android (which uses CHRE) has by far the biggest market share.

https://gs.statcounter.com/os-market-share#monthly-202111-202303

Understood that there are more Android devices than others, no surprise there, but I want to point out that a great portion of Android devices do not have CHRE enabled: Android adoption != CHRE adoption, and CHRE is not an industry standard today.

@nashif
Member

nashif commented Sep 7, 2023

Isn't Apple using HID as well? https://developer.apple.com/documentation/hiddriverkit/3201495-sensors. So in this case Windows + iOS + OSX + Linux = 57% > Android's 39%.

@teburd
Collaborator

teburd commented Jan 12, 2024

I believe these issues have been addressed in #64478 please verify!

@nashif
Member

nashif commented Sep 23, 2024

This has been addressed a long time ago; reopen if anything remains.

@nashif nashif closed this as completed Sep 23, 2024