Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve parser #714

Closed
wants to merge 6 commits into from
Closed

Conversation

breedx-splk
Copy link
Contributor

This builds on #711 and will require a rebase.

The existing web.xml parser was only looking for <display-name>. If a web.xml only contains a <servlet-name>, the current parser will return nothing. So this change improves that and can also use the <servlet-name> when the <display-name> is not present. If both are present, the display name is preferred.

Furthermore, it's valid to have multiple display names in the web.xml file, so the parser now just picks the first one.

This also fixes the test data, which originally did not have the <servlet> element in the tree, which masked another parsing bug.

setDisplayName = true;
List<String> nameElements = Arrays.asList("display-name", "servlet-name");
if ((names.size() < 2) // If we haven't already found both names
&& rootElementName.equals(
Copy link
Contributor

@laurit laurit Jan 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a bit confusing to me. Previous check rootElementName.equals(currentElement.peek()) verified that currently handled <display-name> element is a child of the root element. Current check rootElementName.equals(currentElement.peekLast()) verifies that the document has the expected root element and accepts display-name in any position. If the desire is to just verify the root element you don't need the current element stack at all. <display-name> can appear in multiple positions in web.xml. It can appear as a child of the root <web-app> or under <servlet> or <filter> maybe more. Are you sure it makes sense to use <display-name> for servlet or filter, they are likely to be meaningless, something like Main servlet same for the <servlet-name>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I follow, and I think you have a good point. This change is really mixing up two things -- making the depth more flexible AND taking into account <service-name>.

Where I was coming from was an overly simplistic .war file that only had a single servlet definition and no display name. I had expected us to want to provide the servlet name as the fallback otel service name, but that was not happening. When putting this together, I also kinda forgot that a single <web-app> can support multiple servlet definitions (this is like brain archeology happening in realtime here).

In any case, I now think that this change is not right. I want to suggest that we leave it how it was (only using <display-name> immediately under <web-app>) but to enhance that by falling back to using either <display-name> or <servlet-name> of the first servlet in the event that the <display-name> could not be used. I could be convinced that we might only want to do that if there is exactly only ONE servlet defined.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test case and tweaked the data to make it a little clearer that the outer <display-name> is always preferred. @laurit lemme know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main question here is whether it is preferable to detect a bad name or no name at all. If your goal is to detect some kind of name no matter what then I think you should consider doing multiple passes over the descriptors. In first pass consider only the top level <display-name> and if that doesn't yield a result in any of the deployed apps try something else. Perhaps detecting name from different sources should be structured as different resource providers so they could be enabled/disabled separately? I suspect that using <servlet-name> won't always work well will as it yields the same name for different apps because the web.xml for newer app was copied from an older app and the main servlet has the same name. Maybe it would be more worthwhile to look for alternate sources for service name like maven metadata.

@breedx-splk
Copy link
Contributor Author

Closing to prevent merging while figuring out the improvement.

@breedx-splk breedx-splk closed this Feb 1, 2023
@breedx-splk breedx-splk reopened this Feb 1, 2023
@mateuszrzeszutek mateuszrzeszutek dismissed their stale review February 2, 2023 15:37

maybe it's not correct after all

@breedx-splk
Copy link
Contributor Author

Alright, it sounds like this may be of limited utility at the moment, so let's close it until there's stronger user-informed guidance around this.

@breedx-splk breedx-splk closed this Feb 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants