Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring reset_logger to reset every 12 hours #1666

Merged
merged 5 commits into from
Oct 11, 2019

Conversation

vrdmr
Copy link
Member

@vrdmr vrdmr commented Oct 11, 2019

Description

Refactoring the reset loggers to bring it to the MonitorHandler level. The logger now resets every 12 hours.

Issue #


PR information

  • The title of the PR is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For information on cleaning up the commits in your pull request, see this page.
  • Except for special cases involving multiple contributors, the PR is started from a fork of the main repository, not a branch.
  • If applicable, the PR references the bug/issue that it fixes in the description.
  • New Unit tests were added for the changes made and Travis.CI is passing.

Quality of Code and Contribution Guidelines


This change is Reviewable

tests/ga/test_monitor.py Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Oct 11, 2019

Codecov Report

❗ No coverage uploaded for pull request base (release-2.2.43@90d69fb). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@                Coverage Diff                @@
##             release-2.2.43    #1666   +/-   ##
=================================================
  Coverage                  ?   66.61%           
=================================================
  Files                     ?       80           
  Lines                     ?    11207           
  Branches                  ?     1573           
=================================================
  Hits                      ?     7465           
  Misses                    ?     3412           
  Partials                  ?      330
Impacted Files Coverage Δ
azurelinuxagent/common/logger.py 88.66% <ø> (ø)
azurelinuxagent/common/cgroupstelemetry.py 94.69% <ø> (ø)
azurelinuxagent/ga/monitor.py 82.11% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 90d69fb...8d3a8ed. Read the comment docs.

@vrdmr
Copy link
Member Author

vrdmr commented Oct 11, 2019

Important: Currently, the hash map is shared across levels and won't generate a new entry into the hashmap for the same messages with different log levels. This could suppress a newer log entry with a higher-level coming in later.


if time_now >= (self.last_reset_loggers_time +
MonitorHandler.RESET_LOGGERS_PERIOD):
logger.reset_periodic()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is the pattern we use today for all these periodic actions; we should consider if instead we should follow this:

    try:
        logger.reset_periodic()
   finally:
         self.last_reset_loggers_time = time_now

so that if there is an exception we don't keep calling the reset code continuously (to do after release)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

tests/ga/test_monitor.py Show resolved Hide resolved
Copy link
Contributor

@pgombar pgombar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with minor comments

tests/ga/test_monitor.py Outdated Show resolved Hide resolved
# Resetting the logger time states.
monitor_handler = get_monitor_handler()
monitor_handler.last_reset_loggers_time = datetime.datetime.utcnow() - timedelta(hours=1)
MonitorHandler.RESET_LOGGERS_PERIOD = timedelta(milliseconds=100)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, and this part of the test should be done directly on the logger, rather than on the logger via the monitor (plus move test to the logger tests)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(could be done separately, in the interest of moving forward with the release)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pgombar
It is needed. We are not using the RESET_LOGGERS_PERIOD for any sleep, but for the calculation to reset the logger. I am just moving the last time sent to 1 hour, but also need to force the calculation of time_now >= now(-1hr) + 100ms to be true, to hit that code.

if time_now >= (self.last_reset_loggers_time + MonitorHandler.RESET_LOGGERS_PERIOD):
    logger.reset_periodic()

@narrieta: This part of the test is testing the monitor pieces of the code and not the logger pieces of the code. We already test the periodic_* in the test_loggers (eg: test_periodic_does_not_emit_if_previously_sent). Yes, there needs to be a test specifically for only logger.reset_periodic() in the test_logger, and that can be added later (will send a new PR later for that, against develop).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, my bad... good as it is now. thanks!

Copy link
Contributor

@larohra larohra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if relevant for this specific release, but we should also have some test to ensure that the periodic logger is actually logging for the timeframe we're setting it to (and not have a curveball like we did today).

Everything else LGTM