Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added logging instrumentation to enable log - trace correlation #345

Merged
merged 2 commits into from
Mar 10, 2021

Conversation

owais
Copy link
Contributor

@owais owais commented Feb 19, 2021

Description

Added logging instrumentation to enable log - trace correlation

This commit adds a new logging instrumentation. The instrumentation
patches standard library logging module to inject tracing context
variables (otelSpanID, otelTraceID, otelServiceName) into log record
objects. It also optionally calls logging.basicConfig() and sets a
logging format that makes use of these vars if instructed by the user.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Added unit tests

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@owais owais force-pushed the logging-correlation branch from 63a4437 to 4628626 Compare February 19, 2021 08:25
@owais owais changed the title wip Added logging instrumentation to enable log - trace correlation Feb 19, 2021
@owais owais force-pushed the logging-correlation branch 9 times, most recently from 13edebb to f61b1f7 Compare February 19, 2021 10:45
@owais owais marked this pull request as ready for review February 19, 2021 10:45
@owais owais requested review from a team, toumorokoshi and ocelotl and removed request for a team February 19, 2021 10:45
@owais owais force-pushed the logging-correlation branch 2 times, most recently from 7896720 to 2c7ce4f Compare February 20, 2021 21:05
Copy link
Member

@toumorokoshi toumorokoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think the environment variables and basicConfig should be re-thought, but the correlation looks good!

- ``debug``
- ``warning``

Manually calling logging.basicConfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the need for implicit ordering like this is a good argument why opentelemetry shouldn't provide a basicConfig configuration via environment variables.

If the user has to remember to call opentelemetry instrumentation first to get the opentelemetry call, why not just expose useful constants (e.g. log format) and document how to use them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a constant defined and can document it. We can also update documentation to show how users can do this completely with custom code (manually calling basicConfig) without dealing with environment variables.

The environment variables solve an important use case though which is 100% auto-instrumented apps where users cannot or don't want to add manual code to enable tracing. We support this with the opentelemetry-bootstrap command and the pattern so far to allow customization of behavior has been to use env vars. So we really do need to support this through env vars.

This can still be totally used manually by adding a bit of code. I'll update the documentation to cover that use case as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The environment variables solve an important use case though which is 100% auto-instrumented apps where users cannot or don't want to add manual code to enable tracing. We support this with the opentelemetry-bootstrap command and the pattern so far to allow customization of behavior has been to use env vars. So we really do need to support this through env vars.

A part of me feels like no one would want opentelemetry to expose such an opinionated standard log, but I see your point that the idea is to make it really easy for even beginners to python / application development to get observability.

SGTM to leave the env vars.


logging.setLogRecordFactory(record_factory)

if environ.get(OTEL_PYTHON_LOG_CORRELATION, "false").lower() == "true":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue for the following:

  1. remove these environment variables and the basicConfig porcelain altogether.

If 1 is not acceptable I would argue factoring this out into a separate method, since this is complecting two different behaviors (attaching custom fields to the log record, modifying logging) without a way to use them separately.

Porcelain functions like these typically help the user by orchestrating a complex process that they may not be completely familiar with. In this case, I could achieve the same result as these environment variables by a single call to logging.basicConfig, and I believe it would be pretty common for a python opentelemetry user to be familiar with the logging API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can split them into multiple methods or we can add an argument to the instrument() function that controls whether to call basicConfig or not in addition to the env var. I think the later would be preferable as all other instrumentations only export instrument() and uninstrument() functions so adding a 3rd method to this one would need a lot of justification.

So how about something like instrument(setup_logging_format=False). This would only call basicConfig if the user explicitly passed setup_logging_fromat=True.

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added arguments to the instrument method to allow users to configure the instrumentation manually through code while keeping env var support to allow opentelemetry-instrument command users to configure without having to add any custom code.

@owais owais force-pushed the logging-correlation branch 3 times, most recently from a475908 to b5911de Compare February 23, 2021 18:05
Copy link
Member

@toumorokoshi toumorokoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!


.. code-block::

%(asctime)s %(levelname)s [%(name)s] [%(filename)s:%(lineno)d] [trace_id=%(otelTraceID)s span_id=%(otelSpanID)s service.name=%(otelServiceName)s] - %(message)s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this just link to the above default format? copy-paste can make it difficult to maintain code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how linking will work in docs. Do you mean link to variable definition directly to source/github or define this string only once in docs with an anchor and link to that from everywhere else?

I tried a different solution. I'm assigning docstrings explicitly by setting __doc__ = "....".format(DEFAULT_LOGGING_FORMAT). This way the docs will always be up to date and logging format will have a single source of truth for both code and docs. It feels a bit weird but technically it's not inferior in any way. I can't think of a downside but may be I didn't think hard enough.

What do you think about this approach?

https://github.com/open-telemetry/opentelemetry-python-contrib/pull/345/files#diff-b3cc41cf4c57f13f0ac89e4d3edec4ce0156b65ed65e94fbd95660d1f763ae22R37
and
https://github.com/open-telemetry/opentelemetry-python-contrib/pull/345/files#diff-b3cc41cf4c57f13f0ac89e4d3edec4ce0156b65ed65e94fbd95660d1f763ae22R48

@owais owais force-pushed the logging-correlation branch 3 times, most recently from bf7326c to 620a9ce Compare February 25, 2021 09:49
Copy link

@seemk seemk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a potential nitpick regarding service.name 🙂

# See the License for the specific language governing permissions and
# limitations under the License.

DEFAULT_LOGGING_FORMAT = "%(asctime)s %(levelname)s [%(name)s] [%(filename)s:%(lineno)d] [trace_id=%(otelTraceID)s span_id=%(otelSpanID)s service.name=%(otelServiceName)s] - %(message)s"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should service.name be changed to resource.service.name?

There's an OTEP with a few approvals for log correlation conventions (open-telemetry/oteps#114) which goes over resource correlation as well.

Personally I like the shorter version, but if it is merged to the spec

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've updated it.

@owais owais force-pushed the logging-correlation branch 2 times, most recently from 6dff13d to 27c341e Compare February 27, 2021 15:39
@owais
Copy link
Contributor Author

owais commented Mar 1, 2021

@aabmass PTAL when you can. Thanks.

Copy link
Member

@aabmass aabmass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 81 to 82
record.otelSpanID = ""
record.otelTraceID = ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is invalid span, it will log trace_id= span_id=? Would leaving the zeroes be better i.e. trace_id=0 span_id=0?

Comment on lines 21 to 24
The integration patches the standard library logging module to automatically inject tracing context into log record objects
and optionally calls ``logging.basicConfig()`` to set a logging format with placeholders for span ID, trace ID and service name.

The following keys are injected into the patched log record objects:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "patch" terminology sounds like we are actually monkey patching, but since you updated it to just set it with logging.setLogRecordFactory() I think we should rephrase throughout

logging.basicConfig(format=log_format, level=log_level)

def _uninstrument(self, **kwargs):
LoggingInstrumentor._is_instrumented = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would keeping the old_factory and resetting it here be easier than tracking the _is_instrumented and _factory_registered booleans?

Environment variables
---------------------

OTEL_PYTHON_LOG_CORRELATION
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTEL_PYTHON_LOG_CORRELATION
^^^^^^^^^^^^^^^^^^^^^^^^^^^

This env var must be set to ``true`` in order to enable trace context injection into logs by calling ``logging.basicConfig()`` and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit they can also set it in LoggingInstrumentor constructor which I think you mention elsewhere, but this is misleading

@owais
Copy link
Contributor Author

owais commented Mar 4, 2021

Thanks @aabmass. Will address the comments soon.

@codeboten please hold off on merging for now. I'll let you know once this is ready. Thanks!

Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking merge until comments are addressed as per your request @owais

@owais owais force-pushed the logging-correlation branch 4 times, most recently from 14bbcb3 to 379ab65 Compare March 5, 2021 00:46
@owais
Copy link
Contributor Author

owais commented Mar 5, 2021

@codeboten This should be good to go now. Thanks.

This commit adds a new logging instrumentation. The instrumentation
patches standard library logging module to inject tracing context
variables (otelSpanID, otelTraceID, otelServiceName) into log record
objects. It also optionally calls `logging.basicConfig()` and sets a
logging format that makes use of these vars if instructed by the user.
@owais owais force-pushed the logging-correlation branch from ef671cf to 34a96bb Compare March 10, 2021 00:10
@codeboten codeboten merged commit 9ef4410 into open-telemetry:main Mar 10, 2021
@owais owais deleted the logging-correlation branch March 10, 2021 13:12
resource = provider.resource if provider else None
if resource:
service_name = resource.attributes.get("service.name")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@owais I have been going through the codebase to see places we are relying on the current global tracer provider and in particularly where it doesn't fit with the multiple tracer providers usecase. This part of code is getting service name from the global tracer provider. Is this still something we want to do here?

Side question: Does this log-trace correlation also apply to logs emitted by opentemetry python packages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still something we want to do here?

No, I think we should update this to extract it from span (resource) just like the span and trace ID.

Side question: Does this log-trace correlation also apply to logs emitted by opentemetry python packages?
Yes. As it stands today, once enabled, this feature will apply to all logs produced by the python process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants