Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add configurable logging filter #49

Merged
merged 2 commits into from
Apr 2, 2022
Merged

Add configurable logging filter #49

merged 2 commits into from
Apr 2, 2022

Conversation

br3ndonland
Copy link
Owner

Description

When applications with APIs are deployed, it is common to perform "health checks" on them. Health checks are usually performed by making HTTP requests to a designated API endpoint. These checks are made at frequent intervals, and so they can fill up the access logs with large numbers of unnecessary log records.

To avoid logging health checks, it would be helpful to have a way to filter health checks out of the logs.

Changes

This PR will add a configurable logging filter to inboard.logging_conf.

  • Filters can be provided as a comma-separated string with the environment variable LOG_FILTERS, like LOG_FILTERS="/health, /heartbeat".
  • The environment variable value will be converted into a set of strings.
  • Each log message will then be checked for each filter in the set. If any matches are present in the log message, the logger will not log that message.

There are multiple ways to match these filters. inboard simply performs a substring membership check to see if any of the filters in the set have a match in the log message. This should work for most use cases, but the match could be overly greedy. For example, if /health is in LOG_FILTERS, it would filter out calls to /health, but it would also filter out calls to /healthy.

Another approach could be to operate on the LogRecord args, but the results would vary based on how the program supplies the log record.

  • As seen in the type stubs for the logging module, the args can be either tuple[object, ...] or Mapping[str, object]. The args could be converted to a set, and compared with the set of filters using isdisjoint. This set comparison approach is potentially more precise than string matching.
  • Gunicorn's access logger gunicorn.access supports several custom message fields, which it calls "atoms" in its source code, and "identifiers" in the docs. Gunicorn formats these atoms and supplies them as LogRecord args. The URL path could be logged with the %(U)s atom, and inboard could check to see if there is an exact match in the set of filters.
  • Uvicorn's access logger uvicorn.access automatically supplies the URL path as its own arg, which could also work nicely with this approach.
  • However, LogRecord args aren't always supplied as strings directly, so there would probably need to be isinstance checks added to check each of the args. For example, when watching files with watchgod (now renamed to watchfiles), Uvicorn logs the list of files being watched as a single arg. Lists are not hashable, so the list would need to be converted to another type (like set, or str as the logger does when formatting messages) in order to perform equality comparisons.

Substring membership checks should be adequate for this feature, and they avoid the complications of working with LogRecord args directly.

Related

Filters identify log messages to filter out, so that the logger does not
log messages containing any of the filters. If any matches are present
in a log message, the logger will not output the message.

The environment variable `LOG_FILTERS` can be used to specify filters as
a comma-separated string, like `LOG_FILTERS="/health, /heartbeat"`. To
then add the filters to a class instance, the `LogFilter.set_filters()`
method can make the set of filters from the environment variable value.

One of the primary use cases for log message filters is health checks.
When applications with APIs are deployed, it is common to perform
"health checks" on them. Health checks are usually performed by making
HTTP requests to a designated API endpoint. These checks are made at
frequent intervals, and so they can fill up the access logs with large
numbers of unnecessary log records. To avoid logging health checks,
add those endpoints to the `LOG_FILTERS` environment variable.
@vercel
Copy link

vercel bot commented Apr 2, 2022

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/br3ndonland/inboard/9mpRgFfvGc84eAAZuwyARSekyLjP
✅ Preview: https://inboard-git-logging-filter-br3ndonland.vercel.app

@codecov
Copy link

codecov bot commented Apr 2, 2022

Codecov Report

Merging #49 (fba2fd2) into develop (80cbaf4) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##           develop       #49   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           10        10           
  Lines          266       281   +15     
=========================================
+ Hits           266       281   +15     
Flag Coverage Δ
unit 100.00% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
inboard/__init__.py 100.00% <100.00%> (ø)
inboard/logging_conf.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 80cbaf4...fba2fd2. Read the comment docs.

@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Apr 2, 2022

Sourcery Code Quality Report

❌  Merging this PR will decrease code quality in the affected files by 0.93%.

Quality metrics Before After Change
Complexity 0.70 ⭐ 0.83 ⭐ 0.13 👎
Method Length 59.59 ⭐ 59.56 ⭐ -0.03 👍
Working memory 6.86 🙂 7.05 🙂 0.19 👎
Quality 79.86% 78.93% -0.93% 👎
Other metrics Before After Change
Lines 714 833 119
Changed files Quality Before Quality After Quality Change
inboard/init.py 95.67% ⭐ 95.34% ⭐ -0.33% 👎
inboard/logging_conf.py 71.00% 🙂 75.15% ⭐ 4.15% 👍
tests/test_logging_conf.py 83.21% ⭐ 79.29% ⭐ -3.92% 👎
tests/test_start.py 79.34% ⭐ 79.19% ⭐ -0.15% 👎

Here are some functions in these files that still need a tune-up:

File Function Complexity Length Working Memory Quality Recommendation
tests/test_start.py TestStartServer.test_start_server_uvicorn_reload_dirs 2 ⭐ 157 😞 12 😞 56.57% 🙂 Try splitting into smaller methods. Extract out complex expressions
tests/test_logging_conf.py TestLoggingOutput.test_logging_filters 2 ⭐ 134 😞 10 😞 62.70% 🙂 Try splitting into smaller methods. Extract out complex expressions
tests/test_start.py TestStartServer.test_start_server_uvicorn_gunicorn_custom_config 0 ⭐ 120 😞 9 🙂 68.44% 🙂 Try splitting into smaller methods
tests/test_start.py TestStartServer.test_start_server_uvicorn_gunicorn 0 ⭐ 100 🙂 10 😞 69.00% 🙂 Extract out complex expressions
inboard/logging_conf.py LogFilter.set_filters 2 ⭐ 32 ⭐ 10 😞 77.42% ⭐ Extract out complex expressions

Legend and Explanation

The emojis denote the absolute quality of the code:

  • ⭐ excellent
  • 🙂 good
  • 😞 poor
  • ⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.


Please see our documentation here for details on how these metrics are calculated.

We are actively working on this report - lots more documentation and extra metrics to come!

Help us improve this quality report!

@br3ndonland br3ndonland merged commit 97e35b7 into develop Apr 2, 2022
@br3ndonland br3ndonland deleted the logging-filter branch April 2, 2022 20:53
br3ndonland added a commit that referenced this pull request Apr 2, 2022
#36

Type annotations were updated for Python 3.10 in #36.
This `from typing import Set` was added in #49 prior to merging #36,
and was overlooked when merging the PR.

This commit will replace `typing.Set` with a parametrized generic.
https://peps.python.org/pep-0585/
https://docs.python.org/3/whatsnew/3.9.html
br3ndonland added a commit that referenced this pull request Apr 17, 2022
#49
https://inboard.bws.bio/logging

The docs on filtering log messages were initially written with examples
that used the VSCode debugger. This commit will update the docs to use
one of the inboard Docker images instead. This change makes the docs
more relevant to users who may be using the inboard Docker images, but
not working with the inboard source code directly.
br3ndonland added a commit that referenced this pull request Apr 17, 2022
#49
https://inboard.bws.bio/logging

The docs on filtering log messages were initially written with examples
that used the VSCode debugger. This commit will update the docs to use
one of the inboard Docker images instead. This change makes the docs
more relevant to users who may be using the inboard Docker images, but
not working with the inboard source code directly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant