Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DroppedAttributeCount field to Otel/STEF schema #22

Merged
merged 1 commit into from
Jan 29, 2025

Conversation

tigrannajaryan
Copy link
Collaborator

@tigrannajaryan tigrannajaryan commented Jan 24, 2025

Resolves #8

Otel/STEF schema changes:

  • Added "DroppedAttributeCount" field in the "Resource", "Scope", "Span", etc structs.

Also updated testdata files (because the schema is now different) and renamed the incorrect ".tefz" to ".stefz" extension.

@tigrannajaryan tigrannajaryan marked this pull request as ready for review January 24, 2025 15:25
@dmitryax
Copy link
Collaborator

Just to clarify, why does the dropped attributes field need to be embedded in the Attributes?

@tigrannajaryan
Copy link
Collaborator Author

Just to clarify, why does the dropped attributes field need to be embedded in the Attributes?

@dmitryax it does not have to be embedded. We can make it a sibling of Attributes. The reason I embedded it is that we have a repeating pattern of AttributesList and DropedAttributeCount used as a pair in a few place.

I was also a bit hesitant about introducing it, so I don't mind just making the fields siblings if you feel strong about it.

@tigrannajaryan
Copy link
Collaborator Author

@dmitryax BTW, one other somewhat related thing to keep in mind is how STEF handles wire format compatibility.

Unlike ProtoBuf the only way to evolve STEF schema in a way that is backward compatible is by adding new fields at the very end of "struct" or "oneof" types. Adding new fields in the "middle" of a struct or oneof breaks wire compatibility in STEF (but is not a problem in ProtoBuf since the fields are numbered). STEF allows adding such fields at the end of any struct, including nested structs, not just the root struct.

Because of this limitation it is beneficial to design the initial schema in a way that logically related fields are grouped into their own struct, so that if another logically related field needs to be added it can be added to the relevant nested struct, instead of being added as the last struct in a large flat root struct.

I don't think we will want to adding any other fields that are related to Attributes in the future, but if we ever did then having "Attributes" as a separate struct would result in a bit nicer schema. Not a dealbreaker, just a nice to have thing.

@dmitryax
Copy link
Collaborator

Thanks for clarifying. That sounds good to me, assuming that there is no performance overhead of introducing the wrapping struct

@tigrannajaryan
Copy link
Collaborator Author

There is a small performance penalty for adding a nested struct. Hard to benchmark, the difference is very small:

                                │  baseline   │              sibling               │               nested               │
                                │   sec/op    │   sec/op     vs base               │   sec/op     vs base               │
SerializeNative/STEF/none-10      4.378m ± 5%   4.259m ± 4%       ~ (p=0.078 n=12)   4.375m ± 2%       ~ (p=0.799 n=12)
DeserializeNative/STEF/none-10    1.768m ± 0%   1.762m ± 0%       ~ (p=0.319 n=12)   1.793m ± 1%  +1.42% (p=0.000 n=12)
SerializeFromPdata/STEF/none-10   93.61m ± 2%   92.97m ± 1%       ~ (p=0.198 n=12)   92.75m ± 1%       ~ (p=0.242 n=12)
DeserializeToPdata/STEF/none-10   23.25m ± 1%   22.96m ± 1%  -1.25% (p=0.010 n=12)   23.14m ± 1%       ~ (p=0.114 n=12)
geomean                           11.39m        11.25m       -1.25%                  11.39m       -0.02%

                                │  baseline   │              sibling               │               nested               │
                                │  sec/point  │  sec/point   vs base               │  sec/point   vs base               │
SerializeNative/STEF/none-10      65.48n ± 5%   63.70n ± 4%       ~ (p=0.078 n=12)   65.44n ± 2%       ~ (p=0.832 n=12)
DeserializeNative/STEF/none-10    26.45n ± 0%   26.36n ± 0%       ~ (p=0.271 n=12)   26.82n ± 1%  +1.40% (p=0.000 n=12)
SerializeFromPdata/STEF/none-10   1.400µ ± 2%   1.391µ ± 1%       ~ (p=0.173 n=12)   1.387µ ± 1%       ~ (p=0.247 n=12)
DeserializeToPdata/STEF/none-10   347.8n ± 1%   343.4n ± 1%  -1.25% (p=0.010 n=12)   346.1n ± 1%       ~ (p=0.110 n=12)
geomean                           170.4n        168.3n       -1.26%                  170.4n       -0.03%

                                │   baseline   │               sibling               │               nested                │
                                │     B/op     │     B/op      vs base               │     B/op      vs base               │
SerializeNative/STEF/none-10      3.523Mi ± 0%   3.523Mi ± 0%       ~ (p=0.590 n=12)   3.525Mi ± 0%  +0.05% (p=0.039 n=12)
DeserializeNative/STEF/none-10    843.9Ki ± 0%   844.3Ki ± 0%  +0.05% (p=0.000 n=12)   847.0Ki ± 0%  +0.36% (p=0.000 n=12)
SerializeFromPdata/STEF/none-10   124.7Mi ± 0%   124.6Mi ± 0%       ~ (p=0.443 n=12)   127.3Mi ± 0%  +2.09% (p=0.000 n=12)
DeserializeToPdata/STEF/none-10   29.82Mi ± 0%   29.82Mi ± 0%  +0.00% (p=0.000 n=12)   29.82Mi ± 0%  +0.01% (p=0.000 n=12)
geomean                           10.19Mi        10.19Mi       -0.00%                  10.26Mi       +0.63%

                                │  baseline   │              sibling               │               nested               │
                                │  allocs/op  │  allocs/op   vs base               │  allocs/op   vs base               │
SerializeNative/STEF/none-10      2.772k ± 0%   2.775k ± 0%  +0.11% (p=0.000 n=12)   2.784k ± 0%  +0.43% (p=0.000 n=12)
DeserializeNative/STEF/none-10    1.368k ± 0%   1.374k ± 0%  +0.44% (p=0.000 n=12)   1.384k ± 0%  +1.17% (p=0.000 n=12)
SerializeFromPdata/STEF/none-10   256.3k ± 0%   256.3k ± 0%       ~ (p=0.877 n=12)   256.3k ± 0%       ~ (p=0.943 n=12)
DeserializeToPdata/STEF/none-10   623.5k ± 0%   623.5k ± 0%  +0.00% (p=0.000 n=12)   623.5k ± 0%  +0.00% (p=0.000 n=12)
geomean                           27.90k        27.94k       +0.13%                  28.01k       +0.40%

@tigrannajaryan
Copy link
Collaborator Author

@dmitryax I spent a bit more time thinking about this and I think it is best to keep the Otel/STEF schema as close as possible to OTLP Protobufs. I will switch to the sibling schema for DroppedAttributesCount.

Resolves #8

Otel/STEF schema changes:

- Added "DroppedAttributeCount" field in the "Resource", "Scope", "Span", etc structs.

Also updated testdata files (because the schema is now different) and renamed
the incorrect ".tefz" to ".stefz" extension.
@tigrannajaryan tigrannajaryan merged commit 3332a0f into main Jan 29, 2025
2 checks passed
@tigrannajaryan tigrannajaryan deleted the tigran/dropped-attrs branch January 29, 2025 22:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add DroppedAttributeCount field to Otel/STEF schema
2 participants