Decide on data structure for Reporting client #29

flora-hofmann-frequenz · 2024-05-08T10:11:57Z

What's needed?

A data structure that can expose the metrics as well as the states per microgrid and component.

Proposed solution

Option to be discussed:

Full (artificial) mapping from protobuf to Python datatype
Helpers (such as namedtuples) to work

Use cases

Not only metrics, but also states, warnings and errors should be exposed.

Alternatives and workarounds

Any alternative ideas are welcome.

Additional context

As discussed in API usergroup meeting on 6th May 2024:
Currently, the ReportingApiClient only exposes MetricSamples using a namedtuple structure.

To additionally expose States:

At any given timestamp, a component can have multiple states, such as charging and relay_open
Practically, whenever you get a metric you can also get the states information

The example output structure from the protobuf is as follows with the entry point to the GRPC returning those messages.

> // Response containing historical microgrid component metrics in one or multiple microgrids
> //
> // Each microgrid's components are provided as timeseries data structures that encapsulate
> // metrics, bounds, errors and operational state and their associated timestamps for each component
> // within the specified time range.
> //
> // !!! example
> //     Example output structure:
> //     ```
> //     microgrids: [
> //       {
> //         microgrid_id: 1,
> //         components: [
> //           {
> //             component_id: 13,
> //             metric_samples: [
> //               /* list of metrics for multiple timestamps */
> //               { sampled_at: "2023-10-01T00:00:00Z", metric: "DC_VOLTAGE_V", sample: {...}, bounds: {...} },
> //               { sampled_at: "2023-10-01T00:00:00Z", metric: "DC_CURRENT_A", sample: {...}, bounds: {...} }
> //               { sampled_at: "2023-10-01T00:05:00Z", metric: "DC_VOLTAGE_V", sample: {...}, bounds: {...} },
> //               { sampled_at: "2023-10-01T00:05:00Z", metric: "DC_CURRENT_A", sample: {...}, bounds: {...} }
> //             ],
> //             states: [
> //               /* list of states for multiple timestamps */
> //               { sampled_at: "2023-10-01T00:00:13.12Z", states: [...], errors: [...], warnings: [...] },
> //               { sampled_at: "2023-10-01T00:02:22.01Z", states: [...], errors: [...], warnings: [...] },
> //               { sampled_at: "2023-10-01T00:05:02.32Z", states: [...], errors: [...], warnings: [...] },
> //             ]
> //           },
> //           {
> //             component_id: 243,
> //             metric_samples: [ ... ],
> //             states: [ ... ]
> //           },
> //         ]
> //       },
> //       {
> //         microgrid_id: 2,
> //         components: [ ... ]
> //       }
> //     ]
> //     ```

To retrieve MetricSample information, we use the following helper in the ReportingApiClient:

MetricSample = namedtuple(
    "MetricSample", ["timestamp", "microgrid_id", "component_id", "metric", "value"]
)
"""Type for a sample of a time series incl. metric type, microgrid and component ID

A named tuple was chosen to allow safe access to the fields while keeping the
simplicity of a tuple. This data type can be easily used to create a numpy array
or a pandas DataFrame.
"""

The text was updated successfully, but these errors were encountered:

llucax · 2024-05-08T11:28:35Z

One unrelated comment is I recommend using the typing.NamedTuple class, so you can add type hints to it too.

llucax · 2024-05-08T11:35:36Z

For me this is very hard to think about without knowing how do we plan to use this data. As long as we don't have a clear picture, I think the safest bet is to make the client retrieve the data as it comes in the API and then adapt it at another level. Without changing the protocol, we'll still have to do this transformation, and limiting what the user can do with the client artificially doesn't sound like a great idea.

cwasicki · 2024-05-16T21:04:12Z

I think the safest bet is to make the client retrieve the data as it comes in the API and then adapt it at another level.

Agree that there should be a way to retrieve the data as close as possible to the message definition. But I suggest to also keep the current approach as an alternative, since this format is already being used by multiple users and seems to be very useful. @llucax What is your idea of "another level"? A restricted client on top of the slim client? In the same repo?

llucax · 2024-05-17T08:07:27Z

What I mean I wouldn't make the adapted retrieval the only way to retrieve the data, maybe it is more clean in pseudo-code form:

adapted_data = await client.retrieve()  # Not for now, as we don't know if this will fit all use cases

adapted_data = adapt_data(await client.retrieve())  # Yes!

So if somebody needs to retrieve the data from the server in a non-adapted way (or adapt it differently) they can still use the client and don't have to get into the mess of using the raw bindings.

adapt_data() can be in the same repo for now, and I don't think it should be a whole new client for the adapted data if a simple function will suffice. We can see if it makes sense to split later, like we have with dispatch.

flora-hofmann-frequenz · 2024-05-23T08:36:57Z

@llucax: We were thinking of reusing some of the dataclasses in here.
What is your take on that?
Could add a _types.py module and then add/update the ListMicrogridComponentsDataResponse and add the adaptor (as mentioned above).

llucax · 2024-05-23T08:56:12Z

Sounds good in principle, I'm still not very involved in all the protobuf stuff in common that is shared between microgrid and reporting, so I feel I don't have a full picture to have a strong opinion, so please don't hold me accountable if I change my mind in the future when I get more involved 😆 But again I think we are still at an exploration stage so I guess this is all fine, right?

flora-hofmann-frequenz · 2024-05-23T09:22:25Z

Sounds good to me! 👍 Let me keep looking into it and share as I go along for further discussions.

flora-hofmann-frequenz added part:❓ We need to figure out which part is affected priority:❓ We need to figure out how soon this should be addressed type:enhancement New feature or enhancement visitble to users labels May 8, 2024

flora-hofmann-frequenz self-assigned this May 8, 2024

flora-hofmann-frequenz mentioned this issue May 8, 2024

Support request include option for states frequenz-floss/frequenz-client-reporting-python#25

Closed

flora-hofmann-frequenz mentioned this issue May 21, 2024

Include states in data structure frequenz-floss/frequenz-client-reporting-python#59

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Decide on data structure for Reporting client #29

Decide on data structure for Reporting client #29

flora-hofmann-frequenz commented May 8, 2024 •

edited

Loading

llucax commented May 8, 2024

llucax commented May 8, 2024

cwasicki commented May 16, 2024

llucax commented May 17, 2024

flora-hofmann-frequenz commented May 23, 2024

llucax commented May 23, 2024

flora-hofmann-frequenz commented May 23, 2024

Decide on data structure for Reporting client #29

Decide on data structure for Reporting client #29

Comments

flora-hofmann-frequenz commented May 8, 2024 • edited Loading

What's needed?

Proposed solution

Use cases

Alternatives and workarounds

Additional context

llucax commented May 8, 2024

llucax commented May 8, 2024

cwasicki commented May 16, 2024

llucax commented May 17, 2024

flora-hofmann-frequenz commented May 23, 2024

llucax commented May 23, 2024

flora-hofmann-frequenz commented May 23, 2024

flora-hofmann-frequenz commented May 8, 2024 •

edited

Loading