Feature/tracecontext integration test #228

toumorokoshi · 2019-10-21T04:33:46Z

This introduces the W3C trace-context validation service as an integration test for OpenTelemetry's tracecontext implementation, so we can validate that implementation without authoring another complete test suite.

Currently the integration test fails due to w3c/trace-context#341

This includes fixes to the tracecontexthttptextformatter to adhere to the specification.
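
For readers unfamiliar with the header the formatter has to accept, here is a minimal sketch of `traceparent` validation as the W3C Trace Context spec describes it for version 00. The helper name `parse_traceparent` and its return tuple are illustrative assumptions, not the code in this PR.

```python
import re
from typing import Optional, Tuple

# version-traceid-parentid-flags, all lowercase hex (W3C Trace Context, version 00).
_TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def parse_traceparent(header: str) -> Optional[Tuple[int, int, int]]:
    """Return (trace_id, span_id, flags), or None if the header is invalid.

    Hypothetical helper for illustration; only version 00 is handled.
    """
    match = _TRACEPARENT_RE.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    if version != "00":
        return None
    # All-zero trace and span IDs are explicitly invalid in the spec.
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return int(trace_id, 16), int(span_id, 16), int(flags, 16)
```

A caller that gets `None` back would then fall through to whatever "invalid context" handling the project settles on.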

toumorokoshi · 2019-10-21T04:35:31Z

There are two outstanding issues (in addition to the failing OWS test case, which I believe needs to be fixed upstream):

1. wsgi integration tests are failing (tracecontext is now the default implementation, and its specified behavior is to return new spans when it is unable to parse or find headers).
2. black code formatting considers the ./target/ directory, which contains cloned code and thus should not be considered.

I am working toward fixing these issues.

toumorokoshi · 2019-10-21T04:36:47Z

opentelemetry-api/tests/context/propagation/test_tracecontexthttptextformat.py

@@ -44,60 +44,6 @@ def test_no_traceparent_header(self):
        span_context = FORMAT.extract(get_as_list, output)
        self.assertTrue(isinstance(span_context, trace.SpanContext))

-    def test_from_headers_tracestate_entry_limit(self):


These tests asserted behavior that is invalid according to the spec. I removed them, as the integration test will validate that the formatter works as intended.

toumorokoshi · 2019-10-21T04:37:58Z

opentelemetry-api/src/opentelemetry/trace/__init__.py

@@ -304,6 +305,31 @@ def format_span_id(span_id: int) -> str:
    return "0x{:016x}".format(span_id)


+def generate_span_id() -> int:


the w3c tracecontext spec calls for generating valid span contexts in the case where one cannot parse tracecontext headers. As such it was necessary to lift this code into the API.
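
A minimal sketch of what such ID generators can look like, assuming the standard library's `random.getrandbits`. Only the `generate_span_id` signature and the `format_span_id` format string appear in the diff above; the function bodies and `generate_trace_id` are illustrative assumptions.

```python
import random


def generate_span_id() -> int:
    """Return a random, non-zero 64-bit span ID (body is an assumption)."""
    span_id = 0
    while span_id == 0:  # zero is reserved for the invalid span ID
        span_id = random.getrandbits(64)
    return span_id


def generate_trace_id() -> int:
    """Return a random, non-zero 128-bit trace ID (hypothetical helper)."""
    trace_id = 0
    while trace_id == 0:  # zero is reserved for the invalid trace ID
        trace_id = random.getrandbits(128)
    return trace_id


def format_span_id(span_id: int) -> str:
    """Format a span ID as 16 hex digits, matching the diff context above."""
    return "0x{:016x}".format(span_id)
```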

Oberon00

Requesting changes because I think we need to think the change with generate_spancontext through. I think we should rather fix the handling of INVALID_SPANCONTEXT in the SDK. Moving generate_spancontext to the API would break open-telemetry/oteps#58 (even though I'm not sure if that will be merged anytime soon, I'd rather not make it impossible to implement). Note that we encountered this problem already in #226 which uses a different solution for the same problem.

mauriciovasquezbernal · 2019-10-21T12:57:57Z

2. black code formatting considers the ./target/ directory, which contains cloned code and thus should not be considered.

I had the same problem while implementing the Jaeger exporter and had to add exceptions to the different checkers to solve it. Maybe you can take some inspiration from it: #174.

mauriciovasquezbernal

As I understand it, there are three different places to handle the case where the incoming request doesn't contain valid trace information:

The propagator (proposed in this PR).
The integration (proposed in ext/wsgi: use current span when extracting fails #226).
SDK (what @Oberon00 is proposing).

I'm not sure at this point what is the best place to do that, the only thing I can say is that we have to document it so people implementing it understand it's their responsibility to take care of it.

mauriciovasquezbernal · 2019-10-21T13:04:52Z

examples/trace/server.py

+        )
+    return "hello"
+
+


I think it is better to move this to somewhere under the tests; it could confuse people who are looking at the example.

I can move this, just wanted to re-use existing examples. But I see the argument to not pollute short and sweet ones.

toumorokoshi · 2019-10-22T03:55:58Z

@Oberon00 I'm fine with moving that to the SDK as long as the API alone does not need to propagate valid tracecontext headers. I believe I was told that was a requirement.

I've started a ticket around how we should deal with invalid spancontext from formatters. Let's discuss that aspect there:

#233

c24t

Very nice, great to see that we're using the W3C tests.

You might want to rebase to pick up #229 and make sure this still works as expected.

I agree that verify_tracecontext should move into its own test file, even if it means duplicating code with the example. I also think we should find a way of getting the W3C test code that doesn't involve cloning it again with every run, but I don't know that a git submodule is the best solution.

Are you planning to fix the other test errors before merging?

c24t · 2019-10-22T23:12:14Z

scripts/tracecontext-integration-test.sh

+# clone w3c tracecontext tests
+mkdir -p target
+rm -rf ./target/trace-context
+git clone https://github.com/w3c/trace-context ./target/trace-context


Out of curiosity: why do it this way instead of adding as a submodule?

I think submodules work fine here, but generally track a branch vs a commit. I'll look through and may re-add as a followup, or fix it here depending on approvers.

c24t · 2019-10-22T23:15:07Z

scripts/tracecontext-integration-test.sh

@@ -0,0 +1,26 @@
+#!/bin/bash
+# set -e 


Should this be commented out?

no, will re-add.

c24t · 2019-10-22T23:17:52Z

scripts/tracecontext-integration-test.sh

@@ -0,0 +1,26 @@
+#!/bin/bash


Suggested change

#!/bin/bash

#!/usr/bin/env bash

I think /bin/bash is usually safe, but better to use env in case the user wants a different bash.

AFAIK only /bin/sh is safe (sh is guaranteed by POSIX to exist)

Happy to take the lowest common denominator to reduce potential issues. I'll go with /bin/sh.

c24t · 2019-10-22T23:37:40Z

scripts/tracecontext-integration-test.sh

+    # send a sigint, to ensure
+    # it is caught as a KeyboardInterrupt in the
+    # example service.
+    kill -2 $EXAMPLE_SERVER_PID


This doesn't seem to kill the development server?

An alternative way is to run the server, spawn a separate unit test, fetch the test result and shutdown the server. I've explored this here and it seems to work well.
With this approach we don't need to wait for indefinite time and later kill the server process.

Yeah, I was trying to find a quick way to do this in bash. I can send a standard kill signal which will kill the server, but at the cost of not calling shutdown.

In general I'll try to vet this a little more. Thanks @reyang for the example, I'll switch to a python script if I can't get the bash one fixed up more.

@c24t for clarification, I've resolved the issue that kept the dev server from being killed (just switched it back to a sigkill). Was there something else you think needed to be improved aside from that?

The approach Reiley linked will not print out the traces as the unit tests enact them, which I found was critical to debugging the issues. The current output intermingles them as shown in https://travis-ci.org/open-telemetry/opentelemetry-python/jobs/604220488#L1083
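
The lifecycle discussed in this thread (start the server, run the tests, shut the server down, keep the server's trace output interleaved with the test output) can be sketched in Python roughly as follows. This is not the script in the PR: `run_tests_against_server` and its parameters are hypothetical names, and it assumes a POSIX system where `SIGINT` can be delivered to a child process.

```python
import signal
import subprocess
import time
from typing import Sequence


def run_tests_against_server(
    server_cmd: Sequence[str],
    test_cmd: Sequence[str],
    startup_delay: float = 1.0,
    shutdown_timeout: float = 5.0,
) -> int:
    """Start the server, run the tests, stop the server.

    Returns the exit code of the test command.
    """
    # The server inherits stdout/stderr, so its logs interleave with test output.
    server = subprocess.Popen(server_cmd)
    try:
        # Crude readiness wait; polling the port would be sturdier.
        time.sleep(startup_delay)
        return subprocess.run(test_cmd).returncode
    finally:
        # Ask politely first: SIGINT surfaces as KeyboardInterrupt in Python,
        # which gives the server a chance to run its shutdown code.
        server.send_signal(signal.SIGINT)
        try:
            server.wait(timeout=shutdown_timeout)
        except subprocess.TimeoutExpired:
            # Fall back to a hard kill if the server ignores the interrupt.
            server.kill()
            server.wait()
```

The trade-off is the one described above: the interrupt-then-kill sequence keeps the graceful shutdown when the server honors it, without leaving a stray process behind when it does not.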

c24t · 2019-10-22T23:43:38Z

tox.ini

+basepython: python3.7
+deps =
+  # needed for tracecontext
+  aiohttp~=3.6 


Suggested change (removes the trailing whitespace)

aiohttp~=3.6

c24t · 2019-10-22T23:43:54Z

tox.ini

+  pip install -e {toxinidir}/ext/opentelemetry-ext-http-requests
+  pip install -e {toxinidir}/ext/opentelemetry-ext-wsgi
+
+commands = 


Suggested change (removes the trailing whitespace)

commands =

c24t · 2019-10-22T23:44:59Z

I'm still seeing these errors:

======================================================================
ERROR: test_tracestate_ows_handling (__main__.TraceContextTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 640, in test_tracestate_ows_handling
    ['tracestate', 'foo=1 '],
  File "test.py", line 113, in make_single_request_and_get_tracecontext
    return (self.get_traceparent(headers), self.get_tracestate(headers))
  File "test.py", line 108, in get_tracestate
    tracestate.from_string(value)
  File "/Users/libc/src/opentelemetry-python/target/trace-context/test/tracecontext/tracestate.py", line 54, in from_string
    raise ValueError('illegal key-value format {!r}'.format(member))
ValueError: illegal key-value format 'foo=1 '

======================================================================
FAIL: test_multiple_requests_with_illegal_traceparent (__main__.AdvancedTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 847, in test_multiple_requests_with_illegal_traceparent
    self.assertEqual(len(parent_ids), 3)
AssertionError: 1 != 3

======================================================================
FAIL: test_multiple_requests_with_valid_traceparent (__main__.AdvancedTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 818, in test_multiple_requests_with_valid_traceparent
    self.assertEqual(len(parent_ids), 3)
AssertionError: 1 != 3

======================================================================
FAIL: test_multiple_requests_without_traceparent (__main__.AdvancedTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 831, in test_multiple_requests_without_traceparent
    self.assertEqual(len(parent_ids), 3)
AssertionError: 1 != 3

----------------------------------------------------------------------
Ran 40 tests in 1.009s

FAILED (failures=3, errors=1)

opentelemetry-api/src/opentelemetry/context/propagation/tracecontexthttptextformat.py

reyang · 2019-10-23T01:56:55Z

opentelemetry-api/src/opentelemetry/trace/__init__.py

            type(self).__name__,
            format_trace_id(self.trace_id),
            format_span_id(self.span_id),
+            repr(self.trace_state),


"trace_state={!r}".format(self.trace_state)

reyang · 2019-10-23T01:58:54Z

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

@@ -1,5 +1,4 @@
 # Copyright 2019, OpenTelemetry Authors
-#


any reason to remove this line?

probably my fat finger, will remove.

reyang · 2019-10-23T02:23:31Z

scripts/tracecontext-integration-test.sh

+}
+trap on-shutdown EXIT
+cd ./target/trace-context/test
+python test.py http://127.0.0.1:5000/verify-tracecontext


nit: empty line before EOF

Dismissing my review, as I don't have time to follow this at the moment

mauriciovasquezbernal

It looks great to me that the tracecontext test is now integrated.

I left some general comments.

mauriciovasquezbernal · 2019-10-28T14:42:29Z

examples/trace/server.py

@@ -14,6 +14,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

+import json


Is it used?

no, weird that linting didn't catch that

mauriciovasquezbernal · 2019-10-28T14:43:13Z

examples/trace/server.py

@@ -44,7 +46,9 @@


 @app.route("/")
-def hello():
+def index():
+    """An example which starts a span within the span created for


It's a nice comment but I think it's not related to this PR.

I previously added it when the validation server was joined. I've rebased in c24t's changes so this will disappear.

mauriciovasquezbernal · 2019-10-28T14:52:26Z

opentelemetry-api/src/opentelemetry/context/propagation/tracecontexthttptextformat.py

+            # typing.Dict's update is not recognized by pylint:
+            # https://github.com/PyCQA/pylint/issues/2420
+            tracestate[key] = value  # pylint:disable=E1137
+    if len(tracestate) > _TRACECONTEXT_MAXIMUM_TRACESTATE_KEYS:


Could this check be moved inside the loop by using a counter?
Having the check outside the loop means all headers are parsed before the limit is enforced. Maybe this could be a vector for DoS attacks.

probably DoS could be achieved other ways, like adding a huge list of values that need to be parsed by regex. but agreed this could be a quick shortcut out.
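
A rough sketch of the early-exit idea, not the code in this PR: the limit is checked before each insertion so parsing stops as soon as a 33rd entry shows up. The function name `parse_tracestate` is hypothetical, key/value character validation is omitted, and discarding the whole tracestate on overflow is an assumption of this sketch.

```python
from typing import Dict, Iterable

# The W3C spec allows at most 32 list-members in tracestate.
_MAX_TRACESTATE_KEYS = 32


def parse_tracestate(header_values: Iterable[str]) -> Dict[str, str]:
    """Parse tracestate headers, bailing out once the key limit is exceeded."""
    tracestate = {}  # type: Dict[str, str]
    for header in header_values:
        for member in header.split(","):
            member = member.strip()  # tolerate optional whitespace
            if not member:
                continue  # empty list-members are skipped
            key, sep, value = member.partition("=")
            if not sep or not key or not value:
                return {}  # malformed member: discard everything
            if key not in tracestate and len(tracestate) >= _MAX_TRACESTATE_KEYS:
                return {}  # over the limit: stop before parsing the rest
            tracestate[key] = value
    return tracestate
```

As noted in the reply, this only shortcuts one path; a single oversized header still has to be split before any member is inspected.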

mauriciovasquezbernal · 2019-10-28T14:56:32Z

opentelemetry-api/src/opentelemetry/trace/__init__.py

@@ -62,6 +62,7 @@
 """

 import enum
+import random


Is it used?

That's weird.. shouldn't linting have caught that?

will remove.

mauriciovasquezbernal · 2019-10-28T15:03:56Z

opentelemetry-api/tests/context/propagation/test_tracecontexthttptextformat.py

            },
        )
        self.assertEqual(span_context, trace.INVALID_SPAN_CONTEXT)
+        self.assertNotEqual(span_context.span_id, "1234567890123456")


Although this assertion is not wrong, it is implicit in the previous one.

mauriciovasquezbernal · 2019-10-28T15:10:59Z

opentelemetry-api/tests/context/propagation/test_tracecontexthttptextformat.py

@@ -213,3 +169,31 @@ def test_propagate_invalid_context(self):
        output = {}  # type:typing.Dict[str, str]
        FORMAT.inject(trace.INVALID_SPAN_CONTEXT, dict.__setitem__, output)
        self.assertFalse("traceparent" in output)
+
+    def test_tracestate_empty_header(self):
+        """Do not propagate invalid trace context.


Is this comment accurate?

sorry copy-pasted boilerplate. will fix.

mauriciovasquezbernal · 2019-10-28T15:12:45Z

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

@@ -301,7 +300,6 @@ def set_status(self, status: trace_api.Status) -> None:

 def generate_span_id() -> int:
    """Get a new random span ID.
-


Is there a particular reason to remove these empty lines?

mauriciovasquezbernal · 2019-10-28T15:20:29Z

tests/w3c_tracecontext_validation_server.py

+from opentelemetry.ext import http_requests
+from opentelemetry.ext.wsgi import OpenTelemetryMiddleware
+from opentelemetry.sdk.trace import Tracer
+from opentelemetry.sdk.trace.export import (


What is the console exporter used for?
Is it there to provide debug information in case the tests fail?

correct. If you look at the output of the build, it actually logs the trace information as the test is executing, which was really helpful when it's a bit harder to pdb into separate processes.

mauriciovasquezbernal · 2019-10-28T15:58:59Z

tests/w3c_tracecontext_validation_server.py

@@ -0,0 +1,75 @@
+#!/usr/bin/env python3


I have some doubts about the location of this file. Shouldn't it be in opentelemetry-api/tests/context/propagation?

That could be a good place, but I felt it wasn't accurate as currently the tracecontext test suite would fail without SDK behavior (specifically creating new spans when a spancontext is invalid from the propagator).

The tracecontexthttptextformat now adheres completely to the w3c tracecontext test suite. moving the test endpoint to a non-root, to ensure that the basic example is clear. Adding unit tests to test_tracecontexhttptextformat that were helpful.

Ensuring resources installed to the target directory are not included in style and linting. Modifying tox invocation to include python version to ensure it's called by travis-ci. Fixing tests that are no longer valid due to previous changes ( tracecontext returning INVALID_SPAN, start_as_current_span called)

c24t

LGTM!

I think there's still some cleanup to do in the tox file and script (in particular, running the test server as #228 (comment)), but the tests are valuable enough that I think we ought to merge them in and follow up with improvements.

mauriciovasquezbernal

LGTM.

I think further improvements can be made in follow-up PRs so we can have this in for today's release.

toumorokoshi · 2019-10-29T16:25:25Z

@mauriciovasquezbernal it looks like you approved the changes, but not the full PR. can you approve the full PR? Currently only c24t is listed as approved.

mauriciovasquezbernal · 2019-10-29T17:20:45Z

@toumorokoshi I am not an approver on this repo so mine doesn't count. I just marked it as approved to signal that it looks good to me.

reyang

LGTM

toumorokoshi requested review from a-feld, c24t, carlosalberto, lzchen, Oberon00 and reyang as code owners October 21, 2019 04:33

toumorokoshi commented Oct 21, 2019

Oberon00 previously requested changes Oct 21, 2019

Oberon00 mentioned this pull request Oct 21, 2019

ext/wsgi: use current span when extracting fails #226

Closed

mauriciovasquezbernal reviewed Oct 21, 2019

toumorokoshi mentioned this pull request Oct 22, 2019

Define where to handle invalid span contexts #233

Closed

c24t reviewed Oct 22, 2019

c24t reviewed Oct 23, 2019

reyang reviewed Oct 23, 2019

toumorokoshi force-pushed the feature/tracecontext-integration-test branch from 791baaf to bdb607b Compare October 25, 2019 05:02

mauriciovasquezbernal reviewed Oct 28, 2019

toumorokoshi added 6 commits October 28, 2019 13:43

Adding tracecontext checker to tox

872d22c

Verifying that our tracecontext is compliant with the w3c tracecontext reference is valuable. Adding a tox command to verify that the TraceContext propagator adheres to the w3c spec.

Migrating generate_trace/span_id to api

922087c

As the tracecontext spec calls for the creation of new, valid spans in the case of receiving invalid data from headers, it is necessary to have functions that generate valid span and trace ids.

w3c tracecontext: changing invalid span contents to return new, rando…

a0b04fd

…m ones. This fixes all the errors, leaving 6 failures.

stopgap

c7a094b

Addressing feedback

638e0fa

moving the generate span / trace id methods back to API. no longer needed due to open-telemetry#235 moving test service to its own module. modifying shell script to use bourne shell, using posix standard location

toumorokoshi added 3 commits October 28, 2019 13:44

tracecotnext tests now run with py37-tracecontext

d656a63

using bourne shell syntax for function declaration

ed628aa

toumorokoshi force-pushed the feature/tracecontext-integration-test branch from 8d24e0a to 16a95d5 Compare October 28, 2019 20:50

addressing feedback

2bf7d12

toumorokoshi force-pushed the feature/tracecontext-integration-test branch from 16a95d5 to 2bf7d12 Compare October 28, 2019 20:54

toumorokoshi and others added 3 commits October 28, 2019 14:15

fixing linting / broken requests test

6194a5f

addressing feedback

ad5ba02

Formatting, EOF newlines

b5be7cc

c24t approved these changes Oct 29, 2019

Reblacken

7dd47da

mauriciovasquezbernal approved these changes Oct 29, 2019

reyang approved these changes Oct 29, 2019

toumorokoshi merged commit 602d42a into open-telemetry:master Oct 29, 2019

This was referenced Oct 29, 2019

Update versions and requirements to 0.2a0 #250

Merged

W3C TraceContext compliance #197

Closed

reyang mentioned this pull request Oct 30, 2019

Integrate w3c TraceContext validation into CI and tox #202

Closed

mauriciovasquezbernal mentioned this pull request Nov 7, 2019

otel-trace-py implementation lightstep/opentelemetry-auto-instr-python#3

Merged

codeboten mentioned this pull request Nov 7, 2019

Add w3c tests lightstep/opentelemetry-auto-instr-python#4

Closed

reyang mentioned this pull request May 26, 2020

W3C TraceContext compliance open-telemetry/opentelemetry-cpp#74

Closed

srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this pull request Nov 1, 2020

feat(logger): pass logger to plugin packages (open-telemetry#228)

7d6ca67

Closes open-telemetry#193 Signed-off-by: Olivier Albertini <olivier.albertini@montreal.ca>
