Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ottl] Conversion of body into object #31132

Closed
djaglowski opened this issue Feb 8, 2024 · 9 comments
Closed

[ottl] Conversion of body into object #31132

djaglowski opened this issue Feb 8, 2024 · 9 comments
Labels

Comments

@djaglowski
Copy link
Member

djaglowski commented Feb 8, 2024

Component(s)

pkg/ottl

What happened?

Description

Given a log with a string body, users may wish to structure the body while retaining the original string. A simple and intuitive way to do this would be set(body["_raw"], body).

Expected Result

The body becomes an object containing key _raw with corresponding value of the original body.

Actual Result

logs transform error :transform request error: non 200 response. Status Code: 500, Message: failed to process payload in pipeline: failed to execute statement: set(body["_raw"], body), log bodies of type Str cannot be indexed

Collector version

v0.94.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

@djaglowski djaglowski added bug Something isn't working needs triage New item requiring triage pkg/ottl labels Feb 8, 2024
Copy link
Contributor

github-actions bot commented Feb 8, 2024

Pinging code owners for pkg/ottl: @TylerHelmuth @kentquirk @bogdandrutu @evan-bradley. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@evan-bradley
Copy link
Contributor

I haven't tested this snippet, but you should be able to do this using OTTL's cache feature:

- set(cache["body"], body)
- set(body, ParseJSON(body))
- set(body["_raw"], cache["body"])

@djaglowski
Copy link
Member Author

Thanks @evan-bradley. I wound up with something similar but found it unexpected that it didn't work as described.

@evan-bradley
Copy link
Contributor

I can see how that would be a useful shorthand, but I'm a little nervous about bending the types like that. Right now that error occurs because you need to explicitly overwrite the body to be a map before you can index it. I think we'd run into inconsistencies in the language if we added an exception for this.

If we implement the syntax as you described, I would see two cases where I would similarly expect implicit conversion, but if I wasn't thinking of that shorthand would be surprised that they work.

If we overwrite a key in the body without explicitly converting it, would we then just convert the body to an empty map and lose the original body value? If I'm keying the body, I would expect that the rest of the body exists in the map, but we don't know what format it's in and haven't parsed it yet.

set(body["key"], "value")

Similarly, if we can implicitly convert bodies to maps, it's not clear to me how or whether we should parse string bodies when trying to pull keys out of them.

set(attributes["key"], body["key"])

Generally we've tried to make OTTL require that users are very explicit, so if we introduced a shorthand like this I'd probably want to find a way to make it clearer what is happening here.

@djaglowski
Copy link
Member Author

If we overwrite a key in the body without explicitly converting it, would we then just convert the body to an empty map and lose the original body value?

This is how it works in pkg/stanza and I can say from experience it is used quite regularly. e.g.

- type: move
  from: body
  to: body.raw

If the body is not an object, it's almost always just a string or other simple value, in which case you are just overwriting one value with one of a different type. I don't see this as very different than overwriting a string value with an int value, etc.

That said, I do think the implicit part is that it creates an object and sets the key/value in one step. I can see how this part is abnormal, but it should be easy to get there by composing "create an object" and "set a key to the value of body". This got me thinking that maybe set(body, ParseJSON("{\"_raw\":body}) could work. It doesn't, but even then it seems odd that the user would have to resort to a ParseJSON function as a workaround.

Similarly, if we can implicitly convert bodies to maps, it's not clear to me how or whether we should parse string bodies when trying to pull keys out of them.

I think there is an important distinction here. Parsing a string (whether the body or otherwise) adds structure by isolating pieces of the string. I don't think there's anything wrong with the way we're handling parsing. What I'm suggesting is that is should be possible to directly define a structure and assign it.

Having teased out some of the nuance here, I think maybe we just need an easy way to 1) create an object, 2) assign the body to a key. Then, the third step is just a familiar set. Maybe something like set(body, Map("_raw", body)) or set(body, set(NewObject(), "_raw", body))?

In any case, I think this is a common enough use case where this looks like a sharp edge to users and would strongly recommend we provide a simple way to do this.

@evan-bradley
Copy link
Contributor

I think I understand better now, thanks for the detail.

I think allowing the user to create maps would make sense, either through a Map function or through introducing map literals into the grammar. I would see the result looking something like set(body, Map([["_raw", body], ["key2", "value2"]])) or set(body, {"_raw": body, "key2": "value2"}). Map literals will probably end up being more intuitive. The resulting statement sequence would probably look like this, depending on exactly what the user is looking for:

- set(body, {"_raw": body})
- set(body["mykey"], "myvalue")
- merge_maps(body, ParseJSON(body["_raw"]))

If it's a common enough use case, we could also add an optional parameter to our parse functions to streamline this just a bit. The above would become:

- set(body, ParseJSON(body, keep_raw="_raw"))
- set(body["mykey"], "myvalue")

@TylerHelmuth
Copy link
Member

Map literals would be very helpful.

@evan-bradley evan-bradley added enhancement New feature or request priority:p2 Medium and removed bug Something isn't working needs triage New item requiring triage labels Feb 13, 2024
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Apr 15, 2024
@evan-bradley
Copy link
Contributor

I think we're in agreement about a path forward, so I've created an issue here to consolidate the discussion and track the feature: #32388. I'm going to close this issue for now, @djaglowski if you feel like the linked issue is missing anything please feel free to comment there or reopen this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants