[eventd]: Close rsyslog plugin when rsyslog SIGTERM and EOF is sent #19035

Merged

Conversation

zbud-msft
Contributor

@zbud-msft zbud-msft commented May 22, 2024

Backport (#18835)

Fix #18771, #18335

Microsoft ADO (number only):27882794

How I did it

Add signalOnClose to the omprog action, and make rsyslog_plugin exit when it receives EOF.

How to verify it

Verify that rsyslog_plugin is running inside the bgp or swss container

Run docker exec -it bgp supervisorctl restart rsyslogd

Before change:

Restarting rsyslogd does not kill the current rsyslog_plugin process. rsyslogd closes its write end of the pipe that feeds the plugin's cin, so rsyslog_plugin sees EOF, but rsyslogd sends neither SIGTERM nor SIGKILL to it. rsyslog_plugin therefore loops forever, repeatedly calling getline on a stream that is already at EOF, and CPU usage inside the container rises to 100%.

After the change, with signalOnClose="on" added to the omprog action in the conf file, rsyslogd sends SIGTERM to the rsyslog_plugin process running inside the container, and rsyslog_plugin exits.
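
For illustration, an omprog action with the new setting might look like the fragment below. Only signalOnClose="on" comes from this change; the binary path and its arguments are placeholders, not the actual SONiC configuration.

```
# Sketch only: the binary path and arguments are placeholders.
# signalOnClose="on" asks rsyslogd to send SIGTERM to the child before closing it.
action(type="omprog"
       binary="/path/to/rsyslog_plugin <args>"
       signalOnClose="on")
```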

? ( ): rsyslog_plugin/578637 ... [continued]: read()) = -1 (unknown) (INTERNAL ERROR: strerror_r(512, [buf], 128)=22)

UT (will add sonic-mgmt testcase for storming events with logs)

RCA:

  1. When rsyslogd is terminated, no signal is sent to its child rsyslog_plugin process, so rsyslog_plugin keeps trying to read from cin with no writer on the other end of the pipe. As a result, the rsyslog_plugin process calls getline in an endless loop.

  2. Because rsyslogd has terminated while the rsyslog_plugin it spawned is still alive, when rsyslogd starts back up and a log is triggered, a new rsyslog_plugin is spawned for the new rsyslogd process. Repeated restarts can leave many "ghost" rsyslog_plugin processes, each at high CPU usage.
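
The EOF half of the fix can be sketched as follows. This is a minimal illustration, not the code from the actual patch: runUntilEof and handleLine are hypothetical names, and the real plugin reads from cin rather than an arbitrary stream. The point is that the loop condition is the result of getline itself, so the loop ends once the writer closes the pipe instead of spinning on a stream that is already at EOF.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the plugin's per-line processing.
static void handleLine(const std::string& line, std::size_t& handled) {
    (void)line;   // a real handler would match the line against its regexes
    ++handled;
}

// Reads lines until the stream reports EOF or an error, then returns
// the number of lines processed. Because std::getline returns the stream,
// and a stream at EOF converts to false, the loop cannot busy-spin.
std::size_t runUntilEof(std::istream& in) {
    std::size_t handled = 0;
    std::string line;
    while (std::getline(in, line)) {
        handleLine(line, handled);
    }
    return handled;
}
```

Calling runUntilEof(std::cin) from main and returning afterwards would let the process exit when rsyslogd closes its end of the pipe. A loop that ignores the stream state (for example while (true) { getline(cin, line); ... }) shows the behavior described above: once EOF is reached, every further getline call fails immediately without blocking.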

@zbud-msft zbud-msft requested a review from lguohan as a code owner May 22, 2024 00:31
@zbud-msft zbud-msft changed the title [eventd]: Close rsyslog plugin when rsyslog SIGTERM and EOF is sent t… [eventd]: Close rsyslog plugin when rsyslog SIGTERM and EOF is sent May 22, 2024
@lguohan lguohan merged commit 561bb54 into sonic-net:202305 May 24, 2024
18 checks passed