[pkg/ottl] Validate that all parts of a path are used by a Context #22744

evan-bradley · 2023-05-24T15:07:04Z

Component(s)

pkg/ottl

Is your feature request related to a problem? Please describe.

The contexts don't do any validation that extraneous path parts or indices are passed by a user in a statement. For example, the following paths both just access the name path and ignore the extra parts:

name.data
name["data"]
name[0]

This can be confusing to users, who will get data they don't expect from what is really a broken path.

Describe the solution you'd like

We need to enforce that Contexts use all parts of a path that have been parsed from a statement. While we could rely on the Contexts' path parsers to check this themselves, I think we should try to enforce this in the OTTL parser and relieve Contexts of the burden of enforcing these checks.

The first part of this should involve making the OTTL AST structure (everything in grammar.go) internal to the OTTL and providing Contexts only with structures created by the OTTL parser. This will make additions to the DSL easier (no need to make sweeping changes to the Contexts when changing the grammar) and should make for a cleaner overall structure.

As part of this, I think we should offer Contexts with structs that look something like the below for use in their path parsing functions. Note that these are very rough drafts and will need to have their implementations revised.

type Path struct {
    // paths is for internal tracking of path objects.
    paths []ottl.Path
    
    // Name returns the name of this path.
    Name() string
    
    // Keys gets the first Key for this path. Subkeys can be obtained
    // through Key.Next.
    Keys() Key

    // Next gets the next Path. The second value returns whether
    // there is another Path available.
    // Next gets the Next path by returning paths[0], which has
    // paths[1:] as its internal slice.
    Next() (Path, bool)

    // isComplete is a method to be used by the OTTL parser to
    // tell the parser whether all paths and keys were used by
    // the Context.
    isComplete() bool
}

type Key struct {
    // keys is for internal tracking of path objects.
    keys []ottl.Key

    // String gets a string key, or nil if this isn't a string key.
    String() *string

    // Int gets an int key, or nil if this isn't an int key.
    Int() *int

    // Next gets the next Key. The second value returns whether
    // there is another Key available.
    // Next gets the Next key by returning keys[0], which has
    // keys[1:] as its internal slice.
    Next() (Key, bool)
}

Describe alternatives you've considered

There may be other changes to the internal structure of the OTTL that I haven't considered here that would help accomplish this.

Additional context

No response

The text was updated successfully, but these errors were encountered:

TylerHelmuth · 2023-05-24T15:16:45Z

@evan-bradley I am fully onboard with finally separating the AST from the Contexts and passing in a standard structure for the contexts to use.

For the actual validation, I want to make sure I am understanding your idea. Is the proposal to create the struct in the Parser, hand it to the Context for use, have the struct keep track of how it is used, and then once the Context is done parsing the path for the Parser to check the struct to see how it was used?

evan-bradley · 2023-05-24T15:33:41Z

Yeah, that's exactly it.

TylerHelmuth · 2023-05-24T15:46:25Z

Sounds like a good strategy to me, that'll save any context creator a bunch of work.

TylerHelmuth · 2023-06-12T18:41:45Z

@evan-bradley ran into an interesting, but common, situation we need to consider with this proposal.

Consider the situation where a multiple paths are supplied, such as body.string for ottllog. If the context checks for Next(), successfully sees one, and then that path's name is not string what should happen?

case "body":
  nextPath, ok := path.Next()
  if ok {
	  if nextPath.Name() == "string" {
		  return accessStringBody(), nil
	  }
          // what should happen here?
  } else {
	  keys := path.Keys()
	  if keys == nil {
		  return accessBody(), nil
	  }
	  return accessBodyKey(*keys), nil
  }

At the moment the situation would cause the pathParser function to return a default fmt.Errorf("invalid path expression %v", path) error, which seems appropriate. But this means that the struct is not handling the incorrect name itself.

These are the error situations I've classified so far

A path name is not recognized
Extra paths are provided but expected
A key is supplied on a path that cannot accept keys

Of these errors, I believe the context must be involved with 1 and 3. I also believe that the context is handling 1 and 3 today; any time a context would accept a second path it will error if the second path is not what it expects.

We can continue to let the contexts handle 1 and 3 and rely on the new grammar-hiding struct to handle case 2. This would spread out the error handling responsibilities tho.

TylerHelmuth · 2023-06-12T21:40:18Z

@evan-bradley here is the current API I'm working with as I explore this issue, please review:

func newPath(fields []field) *Path {
	p := newPathHelper(fields)
	if p == nil {
		return nil
	}
	p.fetched = true
	return p
}

func newPathHelper(fields []field) *Path {
	if len(fields) == 0 {
		return nil
	}
	return &Path{
		name:     fields[0].Name,
		key:      newKey(fields[0].Keys),
		nextPath: newPath(fields[1:]),
	}
}

type Path struct {
	name     string
	key      *Key
	nextPath *Path
	fetched  bool
}

func (p *Path) Name() string {
	return p.name
}

func (p *Path) Next() (Path, bool) {
	if p.nextPath == nil {
		return Path{}, false
	}
	p.nextPath.fetched = true
	return *p.nextPath, true
}

func (p *Path) Keys() *Key {
	return p.key
}

func (p *Path) isComplete() error {
	if !p.fetched {
		return fmt.Errorf("path section '%v' was not fetched", p.name)
	}
	if p.nextPath == nil {
		return nil
	}
	return p.nextPath.isComplete()
}

func newKey(keys []key) *Key {
	if len(keys) == 0 {
		return nil
	}
	return &Key{
		s:       keys[0].String,
		i:       keys[0].Int,
		nextKey: newKey(keys[1:]),
	}
}

// map value manipulation in `contexts/internal` code and tests requires being able to create a new Key.
// Also added `Set*` functions.
func NewEmptyKey() Key {
	return Key{}
}

type Key struct {
	s       *string
	i       *int64
	nextKey *Key
}

func (k *Key) String() *string {
	return k.s
}

func (k *Key) Int() *int64 {
	return k.i
}

func (k *Key) Next() *Key {
	if k.nextKey == nil {
		return nil
	}
	return k.nextKey
}

func (k *Key) SetString(s *string) {
	k.s = s
}

func (k *Key) SetInt(i *int64) {
	k.i = i
}

func (k *Key) SetNext(nk *Key) {
	k.nextKey = nk
}

evan-bradley · 2023-06-14T16:02:11Z

These are the error situations I've classified so far

A path name is not recognized

Extra paths are provided but expected

A key is supplied on a path that cannot accept keys

Of these errors, I believe the context must be involved with 1 and 3. I also believe that the context is handling 1 and 3 today; any time a context would accept a second path it will error if the second path is not what it expects.

We can continue to let the contexts handle 1 and 3 and rely on the new grammar-hiding struct to handle case 2. This would spread out the error handling responsibilities tho.

The error situations you outlined make sense to me. In my testing, the contexts currently only handle the first error case. In the span context I can access name["key"] without an error being thrown since the handler for the name path doesn't consider whether any keys were passed. I expect that only case 1 will be handled by contexts going forward.

You make a good point about the error handling being spread out a bit, but I think that's okay since it relieves the contexts of the burden of handling cases where extra paths or keys are provided and just focusing on the cases where paths are invalid. Additionally, when I tried to exhaustively handle 2 and 3 in the contexts I couldn't do it without the logic getting out of hand pretty quickly.

I think what distinguishes 1 from 2 and 3 is that you can validate 2 and 3 without knowing anything about the context, as where you can't do that with 1.

evan-bradley · 2023-06-14T17:35:24Z

here is the current API I'm working with as I explore this issue, please review

Overall what you have looks good to me. Two nitpicky thoughts:

If we have interfaces for Path and Key, we could have OTTL implement two structs: one that is used and only constructable by the parser, and another that can be constructed anywhere but isn't used by the parser for validation. This would allow getting rid of the Set methods for Key, though I realize it adds some internal complexity.
I think we could optimize the structure just a bit if we kept a slice of grammar objects internally in the struct and just updated the current index on calling Next. That way we don't have to iterate through the objects at parsing time and isComplete could just be a quick O(1) check for i == len(slice) - 1. With the current length of our paths though, this is probably a meaningless optimization, so the linked list approach works fine too.

TylerHelmuth · 2023-06-14T18:10:29Z

If we have interfaces for Path and Key, we could have OTTL implement two structs: one that is used and only constructable by the parser, and another that can be constructed anywhere but isn't used by the parser for validation. This would allow getting rid of the Set methods for Key, though I realize it adds some internal complexity.

I hate those Set methods, I'll look into this.

I think we could optimize the structure just a bit if we kept a slice of grammar objects internally in the struct and just updated the current index on calling Next

Linked list was the first approach I tried bc it felt natural but I can play around with this idea as well.

github-actions · 2023-08-14T03:30:33Z

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

pkg/ottl: @TylerHelmuth @kentquirk @bogdandrutu @evan-bradley

See Adding Labels via Comments if you do not have permissions to add labels yourself.

TylerHelmuth · 2023-08-14T16:53:23Z

I don't currently have time to work on this issue so someone else can pick it up. If it is still around later this year I will pick it back up.

github-actions · 2023-10-16T03:30:00Z

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

pkg/ottl: @TylerHelmuth @kentquirk @bogdandrutu @evan-bradley

See Adding Labels via Comments if you do not have permissions to add labels yourself.

**Description:** This is the first PR working towards #22744. It adds 2 new interfaces, `path` and `key`, that will be given to contexts to use when interpreting the paths/keys passed from the grammar. Eventually we'll be able to use the `isComplete` functions during parsing to ensure that all parts of the path were used by the context. By the end of that issue we'll have removed all public API's of OTTL's grammar, which is going to cause massive breaking changes to all our contexts and huge PRs. To help decrease the amount of changes per PR these new interfaces are not yet exported and not yet implemented. To see them partially in use checkout #29703. **Link to tracking Issue:** Related to #22744 **Testing:** Added unit tests **Documentation:** <Describe the documentation added.> I'll add godoc comments once these things are made public --------- Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>

**Description:** In order to support dynamic indexing we need the `key.Name` method to be able to invoke Getters during hot path execution. **Link to tracking Issue:** <Issue number if applicable> Related to #22744 **Testing:** <Describe what testing was performed and which tests were added.> Updated tests

This PR publicizes and implements the new `Path` and `Key` interfaces. These interfaces allow contexts to get access to the parsed paths/keys without needed to export the grammar's structs. Since everything was using those structs before, this has significant breaking changes. This PR should not change any functionality. Related to #22744

**Description:** On the continuing quest to unexport the grammar, this PR unexports the grammar's constants. Depends on #29924 Link to tracking Issue: Related to #22744

**Description:** Updates OTTL to be able to detect when a context has not used all parts of a path and its keys. It only validates that each path/key was grabbed - it cannot validate that the context used the value "correctly". An example error message, taken from the unit test, looks like: ``` error while parsing arguments for call to "testing_getsetter": invalid argument at position 0: the path section "not-used" was not used by the context - this likely means you are using extra path sections ``` **Link to tracking Issue:** Closes #22744 **Testing:** Added new unit tests --------- Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>

**Description:** The desire to validate both path segments AND keys led to a bug where a totally valid statement like `replace_match(body["metadata"]["uid"], "*", "12345")` fails since only `metadata` is checked during parsing; `uid` is checked during hot-path get/set of the value. Failing such a statement is not the intention of #22744 and it was incorrect to fail such a statement. In fact, I believe validating the key's use in the context will be even more complex once we introduce dynamic indexing. For these reasons, this PR removes Key validation for now. If, in the future, we want to re-add these validations, our Interfaces allow that. **Link to tracking Issue:** #22744 #30051 **Testing:** <Describe what testing was performed and which tests were added.> Unit tests Also we wouldve caught this issue earlier if we had an e2e test that did complex indexing but unfortunately we did in the transform processor. All the more reason to implement #28642.

…try#29883) **Description:** In order to support dynamic indexing we need the `key.Name` method to be able to invoke Getters during hot path execution. **Link to tracking Issue:** <Issue number if applicable> Related to open-telemetry#22744 **Testing:** <Describe what testing was performed and which tests were added.> Updated tests

This PR publicizes and implements the new `Path` and `Key` interfaces. These interfaces allow contexts to get access to the parsed paths/keys without needed to export the grammar's structs. Since everything was using those structs before, this has significant breaking changes. This PR should not change any functionality. Related to open-telemetry#22744

**Description:** On the continuing quest to unexport the grammar, this PR unexports the grammar's constants. Depends on open-telemetry#29924 Link to tracking Issue: Related to open-telemetry#22744

…0042) **Description:** Updates OTTL to be able to detect when a context has not used all parts of a path and its keys. It only validates that each path/key was grabbed - it cannot validate that the context used the value "correctly". An example error message, taken from the unit test, looks like: ``` error while parsing arguments for call to "testing_getsetter": invalid argument at position 0: the path section "not-used" was not used by the context - this likely means you are using extra path sections ``` **Link to tracking Issue:** Closes open-telemetry#22744 **Testing:** Added new unit tests --------- Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>

**Description:** The desire to validate both path segments AND keys led to a bug where a totally valid statement like `replace_match(body["metadata"]["uid"], "*", "12345")` fails since only `metadata` is checked during parsing; `uid` is checked during hot-path get/set of the value. Failing such a statement is not the intention of open-telemetry#22744 and it was incorrect to fail such a statement. In fact, I believe validating the key's use in the context will be even more complex once we introduce dynamic indexing. For these reasons, this PR removes Key validation for now. If, in the future, we want to re-add these validations, our Interfaces allow that. **Link to tracking Issue:** open-telemetry#22744 open-telemetry#30051 **Testing:** <Describe what testing was performed and which tests were added.> Unit tests Also we wouldve caught this issue earlier if we had an e2e test that did complex indexing but unfortunately we did in the transform processor. All the more reason to implement open-telemetry#28642.

evan-bradley added enhancement New feature or request priority:p2 Medium pkg/ottl labels May 24, 2023

evan-bradley mentioned this issue May 24, 2023

Transform processor cannot operate on body if it is a map #22068

Closed

TylerHelmuth added the help wanted Extra attention is needed label May 30, 2023

TylerHelmuth self-assigned this Jun 7, 2023

TylerHelmuth removed the help wanted Extra attention is needed label Jun 7, 2023

github-actions bot added the Stale label Aug 14, 2023

TylerHelmuth removed the Stale label Aug 14, 2023

TylerHelmuth removed their assignment Aug 14, 2023

TylerHelmuth added the help wanted Extra attention is needed label Aug 14, 2023

github-actions bot added the Stale label Oct 16, 2023

TylerHelmuth removed the Stale label Oct 17, 2023

TylerHelmuth mentioned this issue Nov 5, 2023

[pkg/ottl] Promote OTTL to Beta Stability #28892

Open

11 tasks

TylerHelmuth self-assigned this Dec 7, 2023

TylerHelmuth removed the help wanted Extra attention is needed label Dec 7, 2023

This was referenced Dec 8, 2023

[pkg/ottl] Ottl validate path #29703

Closed

[pkg/ottl] Add private interface for path and key #29704

Merged

TylerHelmuth mentioned this issue Dec 14, 2023

[pkg/ottl] Update key.Name to be able to invoke a Getter #29883

Merged

TylerHelmuth mentioned this issue Dec 15, 2023

[pkg/ottl] Implement Path/Key interfaces #29897

Merged

This was referenced Dec 15, 2023

[pkg/ottl] Unexport the grammar's enumSymbol #29924

Closed

[pkg/ottl] Unexport the grammar's constants #29925

Merged

TylerHelmuth mentioned this issue Dec 18, 2023

[pkg/ottl] Validate that all path elements are used #30042

Merged

TylerHelmuth closed this as completed in #30042 Dec 18, 2023

TylerHelmuth mentioned this issue Dec 18, 2023

[pkg/ottl] Don't check that all keys are used #30047

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[pkg/ottl] Validate that all parts of a path are used by a Context #22744

[pkg/ottl] Validate that all parts of a path are used by a Context #22744

evan-bradley commented May 24, 2023

TylerHelmuth commented May 24, 2023

evan-bradley commented May 24, 2023

TylerHelmuth commented May 24, 2023 •

edited

Loading

TylerHelmuth commented Jun 12, 2023

TylerHelmuth commented Jun 12, 2023

evan-bradley commented Jun 14, 2023

evan-bradley commented Jun 14, 2023 •

edited

Loading

TylerHelmuth commented Jun 14, 2023

github-actions bot commented Aug 14, 2023

TylerHelmuth commented Aug 14, 2023

github-actions bot commented Oct 16, 2023

[pkg/ottl] Validate that all parts of a path are used by a Context #22744

[pkg/ottl] Validate that all parts of a path are used by a Context #22744

Comments

evan-bradley commented May 24, 2023

Component(s)

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

Describe alternatives you've considered

Additional context

TylerHelmuth commented May 24, 2023

evan-bradley commented May 24, 2023

TylerHelmuth commented May 24, 2023 • edited Loading

TylerHelmuth commented Jun 12, 2023

TylerHelmuth commented Jun 12, 2023

evan-bradley commented Jun 14, 2023

evan-bradley commented Jun 14, 2023 • edited Loading

TylerHelmuth commented Jun 14, 2023

github-actions bot commented Aug 14, 2023

TylerHelmuth commented Aug 14, 2023

github-actions bot commented Oct 16, 2023

TylerHelmuth commented May 24, 2023 •

edited

Loading

evan-bradley commented Jun 14, 2023 •

edited

Loading