Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update WSGI & Flask integrations to follow new semantic conventions #299

Merged
merged 3 commits into from
Dec 9, 2019

Conversation

Oberon00
Copy link
Member

@Oberon00 Oberon00 commented Nov 21, 2019

Updates flask & WSGI integration to follow new semantic conventions for HTTP as of open-telemetry/opentelemetry-specification#263 (see https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-http.md).

The HTTP client (requests) integration will need an update too, and for that the wsgi opentelemetry.ext.wsgi.http_status_to_canonical_code will need to be used by the requests-instrumentation -- either by just adding a dependency (since ext-wsgi is and will stay dependency-free itself, this would technically not be a problem), or by splitting this function into a new package. This will be done in a follow-up PR.

@Oberon00 Oberon00 marked this pull request as ready for review November 25, 2019 10:34
@Oberon00 Oberon00 requested a review from a-feld as a code owner November 25, 2019 10:34
@codecov-io
Copy link

codecov-io commented Nov 25, 2019

Codecov Report

Merging #299 into master will decrease coverage by 1.38%.
The diff coverage is 47.36%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #299      +/-   ##
==========================================
- Coverage   85.66%   84.28%   -1.39%     
==========================================
  Files          33       33              
  Lines        1612     1648      +36     
  Branches      180      195      +15     
==========================================
+ Hits         1381     1389       +8     
- Misses        185      204      +19     
- Partials       46       55       +9
Impacted Files Coverage Δ
...ry-ext-wsgi/src/opentelemetry/ext/wsgi/__init__.py 69.02% <47.36%> (-21.89%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 693391c...bd0584d. Read the comment docs.

Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of comments, nothing blocking.

# pylint:disable=too-many-branches,too-many-return-statements
if code < 100:
return StatusCanonicalCode.UNKNOWN
if code <= 299:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it would be worth adding a check for 200 at the top since i suspect that's going to be by large the status code we hit. Possibly a premature optimization though

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't want to guess. 204 No Content is probably also a popular response code to POST requests and the like. I think before that I'd rather do things like the classical _StatusCanonicalCode=StatusCanonicalCode default parameter to avoid accessing globals.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels to me that, rather than this range, just creating a dict with hard-coded values would be more performant. Might be worth a benchmark but would make behavior consistent regardless of what codes are common.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean a dict with 500 entries?

"http.server_name": environ["SERVER_NAME"],
"http.scheme": environ["wsgi.url_scheme"],
"host.port": int(environ["SERVER_PORT"]),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there ever a case where any of these environment variables wouldn't be present? If so using environ.get may be better, unless raising an exception is preferable

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to PEP3333, these are required: https://www.python.org/dev/peps/pep-3333/#environ-variables. I guess there might be nonconforming implementations out there. I have no strong opinion here. Maybe it would be better to come up with a convention of using try:/except: to make sure the integrations don't break applications.

if flavor.upper().startswith(_HTTP_VERSION_PREFIX):
flavor = flavor[len(_HTTP_VERSION_PREFIX) :]
if flavor:
result["http.flavor"] = flavor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use setifnotnone here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not, since I use "" as second argument to get above, in order to unconditionally use upper and startswith. I think there is really no way to save a conditional here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right!

Conflicts:
	ext/opentelemetry-ext-flask/src/opentelemetry/ext/flask/__init__.py
	ext/opentelemetry-ext-flask/tests/test_flask_integration.py
	ext/opentelemetry-ext-wsgi/src/opentelemetry/ext/wsgi/__init__.py
	ext/opentelemetry-ext-wsgi/tests/test_wsgi_middleware.py
Copy link
Member

@c24t c24t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, and I think it couldn't hurt to protect against non-PEP3333-compliant envs as in https://github.com/open-telemetry/opentelemetry-python/pull/299/files#r350626307.

@c24t
Copy link
Member

c24t commented Dec 5, 2019

@Oberon00 let me know when you think this is ready to merge

@Oberon00
Copy link
Member Author

Oberon00 commented Dec 5, 2019

Unless you think I should address the non-PEP3333-compliant environ safety issues first, I think this is ready (but it might cause merge conflicts for the named tracers PR #301).

@toumorokoshi toumorokoshi added the PR:please merge This PR is ready to be merged by a Maintainer (has enough valid approvals, successful build, etc.) label Dec 9, 2019
Copy link
Member

@toumorokoshi toumorokoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, looks like you had enough, but I guess another one wouldn't hurt :)

"http.status_code": 404,
"http.status_text": "NOT FOUND",
},
{"http.status_code": 404, "http.status_text": "NOT FOUND"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

random and more general question: is it needed to have "status_text"? feels to me like it adds a lot to the payload and is often identical.

Copy link
Member Author

@Oberon00 Oberon00 Dec 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the status text is optional. But note that the integration does not insert static text here but actual "output" from the WSGI application. EDIT: See spec: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-http.md#common-attributes

# pylint:disable=too-many-branches,too-many-return-statements
if code < 100:
return StatusCanonicalCode.UNKNOWN
if code <= 299:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels to me that, rather than this range, just creating a dict with hard-coded values would be more performant. Might be worth a benchmark but would make behavior consistent regardless of what codes are common.

elif not url.startswith(environ["wsgi.url_scheme"] + ":"):
# Something fishy is in RAW_URL. Let's fall back to request_uri()
url = wsgiref_util.request_uri(environ)
def setifnotnone(dic, key, value):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should follow the underscore naming convention, even for this small method name.

else:
url = wsgiref_util.request_uri(environ)
result["http.url"] = wsgiref_util.request_uri(environ)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it feels to me like http.url could be decomposed into http.target and http.host or something along those lines: otherwise if I wanted the target, I would have to know and check for url in the case that target did not exist, and perform a split to extract that myself.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we could decompose, but I see no benefit in it. Who do you mean with "myself" above? I think the back-end is a better place for such logic, the integrations should send information to the back-end as "raw" as they want as more raw usually means more info.

@c24t c24t merged commit 889baba into open-telemetry:master Dec 9, 2019
@c24t c24t removed the PR:please merge This PR is ready to be merged by a Maintainer (has enough valid approvals, successful build, etc.) label Dec 11, 2019
srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this pull request Nov 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants