Database semantic conventions may violate namespacing guidelines #713

Open
tigrannajaryan opened this issue Sep 28, 2022 · 12 comments
Labels: help wanted (extra attention is needed)

@tigrannajaryan
Member

Problem Description

We have database conventions like this: db.name, db.statement, etc. Here db is the namespace, and the attributes under it are common to all databases. We have a large number of these.

We also have conventions like this: db.mssql.instance_name, where db.mssql is the namespace. The implied idea is that database-specific attributes are placed in the db.<database-name> namespace, although this is not explicitly called out anywhere.

This is a problem. The set of database names is unbounded and may contain any value in the future. We may eventually need to add database-specific conventions for a database whose name matches one of the numerous attributes under the db namespace.

However, that will be impossible because it would violate the namespacing guidelines, which say:

Names SHOULD NOT coincide with namespaces. For example if service.instance.id is an attribute name then it is no longer valid to have an attribute named service.instance because service.instance is already a namespace. Because of this rule be careful when choosing names: every existing name prohibits existence of an equally named namespace in the future, and vice versa: any existing namespace prohibits existence of an equally named attribute key in the future.

We are in a situation where the future evolution of semantic conventions may be blocked by the current design.

Possible Solutions

I list a couple of solutions below. If you can think of another way, please comment so that we can discuss that too.

Solution 1

Move all database-specific conventions to a properly isolated namespace. For example, instead of db.<database-name>, use db.special.<database-name> or some other namespace that is guaranteed not to clash with any attributes in other namespaces.

The downside is that we need to change existing conventions and also that database-specific conventions will use somewhat longer attribute names (db.special.cassandra.page_size is longer and less readable than db.cassandra.page_size that we use currently).
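
To make the cost of this rename concrete, here is a minimal sketch of the mapping Solution 1 implies. The `KNOWN_DATABASES` set and the `to_isolated_namespace` helper are hypothetical names used only for illustration; the `special` segment is the example name from the proposal above, not a decided one.

```python
# Hypothetical sketch of Solution 1: move db.<database-name>.* attributes
# under an isolated namespace such as db.special.<database-name>.*.
# KNOWN_DATABASES is an illustrative subset, not the real list.
KNOWN_DATABASES = {"mssql", "cassandra"}


def to_isolated_namespace(attribute: str, isolated: str = "special") -> str:
    """Return the attribute name rewritten into the isolated namespace.

    Generic attributes such as db.name are returned unchanged.
    """
    parts = attribute.split(".")
    if len(parts) >= 3 and parts[0] == "db" and parts[1] in KNOWN_DATABASES:
        return ".".join(["db", isolated, *parts[1:]])
    return attribute


print(to_isolated_namespace("db.cassandra.page_size"))
print(to_isolated_namespace("db.name"))
```

Every existing database-specific attribute would pass through a mapping like this, which is the disruption described above.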

Solution 2

Explicitly call out that certain database names are disallowed. This list will contain everything that is already an attribute under the db namespace. We can probably also reserve some names for use either as attributes under the db namespace (and thus disallow them as database names) or as database names (and thus disallow them as attribute names).

Any future database whose name clashes with an existing attribute under the db namespace will need to have its name transformed so that it no longer conflicts with an attribute name.

For example, if a hypothetical future database called "system" needs some specific attributes in the conventions, we can place those attributes under the db.systemdb namespace to make sure they do not conflict with the generic db.system attribute.
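
The transformation described in this solution could look roughly like the sketch below. The `GENERIC_DB_ATTRIBUTES` set is an illustrative subset of the attributes under db, and `database_namespace` is a hypothetical helper; the "db" suffix is only the example transformation used above, not an agreed rule.

```python
# Hypothetical sketch of Solution 2: derive the namespace for a database,
# transforming the name when it clashes with a generic db.* attribute.
# GENERIC_DB_ATTRIBUTES is an illustrative subset, not the full list.
GENERIC_DB_ATTRIBUTES = {"name", "statement", "system"}


def database_namespace(database_name: str,
                       reserved: frozenset = frozenset(GENERIC_DB_ATTRIBUTES)) -> str:
    """Return the db.<database-name> namespace, avoiding reserved names."""
    key = database_name.lower()
    if key in reserved:
        # Assumed transformation: append "db" to the clashing name.
        key += "db"
    return f"db.{key}"


print(database_namespace("system"))     # clashes with db.system
print(database_namespace("cassandra"))  # no clash, unchanged
```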

The benefit of this solution is that we don't need to change existing conventions.

@Oberon00
Member

I think this is a problem that might stay theoretical, so I am in favor of solution 2. We could also suggest a common transformation for such cases. Appending db seems fine; we could also prepend tech_ or similar. We could then specify, for example, that "top-level" names must not end in db.

@tigrannajaryan
Member Author

@open-telemetry/specs-approvers any thoughts on this?

@reyang
Member

reyang commented Sep 29, 2022

+1 to what @Oberon00 said (#713)

@hughesjj

hughesjj commented Oct 3, 2022

I'd also be in favor of # 2, but I'd prefer any standard over the current undefined handling of such a collision. I mostly agree with # 2 because it's less arduous to implement. I do like the explicitness of the "db.special" namespacing, but the pragmatist in me feels it would be too much disruption to solve a problem that is admittedly unlikely to occur.

If we do # 2, I propose we explicitly reserve the "*db" suffix for the sake of formality.
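
As a rough illustration of what reserving the suffix could mean in practice, the sketch below flags generic attributes directly under db whose last segment ends in "db". The helper name `violates_reserved_suffix` is hypothetical, and treating only two-segment names as "top-level" is an assumption about how the rule would be scoped.

```python
# Hypothetical sketch: a reserved "*db" suffix means no generic attribute
# directly under db.* may end in "db", leaving such names free for use as
# transformed database namespaces (for example db.systemdb.*).
def violates_reserved_suffix(attribute: str, suffix: str = "db") -> bool:
    """Return True if a generic db.<name> attribute uses the reserved suffix."""
    parts = attribute.split(".")
    # Assumption: only two-segment names (db.<name>) count as generic attributes.
    return len(parts) == 2 and parts[0] == "db" and parts[1].endswith(suffix)


print(violates_reserved_suffix("db.userdb"))
print(violates_reserved_suffix("db.systemdb.page_size"))
```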

@tigrannajaryan
Member Author

tigrannajaryan commented Oct 4, 2022

OK, I think we are all in agreement so far that #2 is the way to go. We need a PR that makes this clarification in the spec, both in database.md specifically and in attribute-naming.md generally.

@tigrannajaryan added the "help wanted" label on Oct 4, 2022
@jack-berg transferred this issue from open-telemetry/opentelemetry-specification on Feb 7, 2024
@pyohannes
Contributor

Any clarifications should also apply to messaging and RPC semantic conventions, where the same logic is used.

@trask
Member

trask commented Apr 24, 2024

discussed in database semconv meeting:

  • we believe db.<vendor>.* is a good structure
  • we believe that the general issue here can be addressed after database semconv stability

@tigrannajaryan
Member Author

>   • we believe that the general issue here can be addressed after database semconv stability

@trask Ideally we should automate the enforcement of the general rule "names can't coincide with namespace" and be done with it.

@jack-berg
Member

> @trask Ideally we should automate the enforcement of the general rule "names can't coincide with namespace" and be done with it.

So you're suggesting just adding a build check to this repo to assert this?

@tigrannajaryan
Member Author

> > @trask Ideally we should automate the enforcement of the general rule "names can't coincide with namespace" and be done with it.
>
> So you're suggesting just adding a build check to this repo to assert this?

Yes.
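
A minimal sketch of what such a build check might look like, assuming the full list of attribute names can be collected from the repo (how they are loaded is not shown here). The function names are hypothetical; the example input adds an imagined db.system.page_size attribute to show the clash discussed in this issue.

```python
# Hypothetical sketch of a build check for the rule
# "names can't coincide with namespaces".
def find_name_namespace_collisions(attribute_names):
    """Return attribute names that are also a namespace of another attribute."""
    names = set(attribute_names)
    namespaces = set()
    for name in names:
        parts = name.split(".")
        # Every proper dotted prefix of a name is a namespace.
        for i in range(1, len(parts)):
            namespaces.add(".".join(parts[:i]))
    return sorted(names & namespaces)


def check(attribute_names) -> int:
    """Print each collision and return a process exit code (0 means pass)."""
    collisions = find_name_namespace_collisions(attribute_names)
    for name in collisions:
        print(f"attribute name coincides with a namespace: {name}")
    return 1 if collisions else 0


# db.system.page_size is an imagined attribute for a database named "system".
exit_code = check([
    "db.name",
    "db.system",
    "db.mssql.instance_name",
    "db.system.page_size",
])
```

A CI job could fail the build whenever the returned exit code is non-zero.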

@trask moved this to Post Stability in Database Client Semantic Conventions on Apr 25, 2024
@pyohannes
Contributor

> discussed in database semconv meeting:
>
>   • we believe db.<vendor>.* is a good structure
>   • we believe that the general issue here can be addressed after database semconv stability

We discussed this in the messaging workgroup and we are in line with this conclusion.

@lmolkova
Contributor

Related: #1068
