Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Auditbeat] Package: Nullify Librpm's rpmsqEnable #11628

Merged
merged 2 commits into from
Apr 5, 2019

Conversation

cwurm
Copy link
Contributor

@cwurm cwurm commented Apr 3, 2019

We've had a number of problems with Librpm's use of signal traps (#10633). We've tried to fix it by using Librpm functions to disable them (#10694), but there is still the occasional test failure that seems related to it. I've also seen it happen at least once locally.

Now I've started testing the dataset on OpenSUSE and found our fix prevents it from working at all. The Python system test will reliably fail, with the Auditbeat test process shown as terminated by an uncaught SIGTERM. If I remove our disabling logic it works, but only on OpenSUSE, and again not on CentOS. I don't know exactly what in Librpm or our use of it is causing the behavior on OpenSUSE, my assumption is that something is going wrong with how we try to unset the signal traps, with the original ones not being restored.

So I'm proposing a more radical solution - overriding the rpmsqEnable function in Librpm that sets and unsets signal traps. This is possible since we dlopen/dlsym the library into the process, so any functions that are already defined will be used instead of what the library comes with. In a way, this is exactly what the rpmsqSetInterruptSafety function does in newer versions of Librpm (see rpm-software-management/rpm@56f49d7). It is also what gdb did with this patch following their bug report. I should have investigated their fix more closely the last time around.

Hopefully, this will eliminate the residual test failures.

I'll open another PR to enable the OS family suse for the package dataset that depends on this.

@cwurm cwurm added review needs_backport PR is waiting to be backported to other branches. Auditbeat SecOps labels Apr 3, 2019
@cwurm cwurm requested a review from a team as a code owner April 3, 2019 13:33
@elasticmachine
Copy link
Collaborator

Pinging @elastic/secops

Copy link
Contributor

@adriansr adriansr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

rpmsqEnable(-SIGPIPE, NULL);
extern int rpmsqEnable (int signum, void *handler);
int
rpmsqEnable (int signum, void *handler)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder who is calling this function. Is it called internally by librpm?

If that's the case I'm surprised that you can override it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's internal to librpm. This works because we dynamically load librpm using dlopen() after the program is started. Any symbols that are already defined at that point will not be redefined.

@cwurm cwurm merged commit e047de6 into elastic:master Apr 5, 2019
@cwurm cwurm deleted the package_nullify_rpmsqenable branch April 5, 2019 13:30
cwurm pushed a commit to cwurm/beats that referenced this pull request Apr 5, 2019
Overrides the `rpmsqEnable` function in Librpm that sets and unsets signal traps. Hopefully, this will eliminate the residual test failures.

(cherry picked from commit e047de6)
@cwurm cwurm added v7.0.0 and removed needs_backport PR is waiting to be backported to other branches. labels Apr 5, 2019
cwurm pushed a commit to cwurm/beats that referenced this pull request Apr 5, 2019
Overrides the `rpmsqEnable` function in Librpm that sets and unsets signal traps. Hopefully, this will eliminate the residual test failures.

(cherry picked from commit e047de6)
@cwurm cwurm added the v6.7.2 label Apr 5, 2019
cwurm pushed a commit that referenced this pull request Apr 5, 2019
Both openSUSE and SLES use RPM under the hood, so we can use the code we already have for CentOS/Fedora.

Depends on #11628.

Fixes elastic/beats-tester#115.
cwurm pushed a commit to cwurm/beats that referenced this pull request Apr 5, 2019
Both openSUSE and SLES use RPM under the hood, so we can use the code we already have for CentOS/Fedora.

Depends on elastic#11628.

Fixes elastic/beats-tester#115.

(cherry picked from commit ebdf66d)
cwurm pushed a commit to cwurm/beats that referenced this pull request Apr 5, 2019
Both openSUSE and SLES use RPM under the hood, so we can use the code we already have for CentOS/Fedora.

Depends on elastic#11628.

Fixes elastic/beats-tester#115.

(cherry picked from commit ebdf66d)
cwurm pushed a commit that referenced this pull request Apr 5, 2019
Overrides the `rpmsqEnable` function in Librpm that sets and unsets signal traps. Hopefully, this will eliminate the residual test failures.

(cherry picked from commit e047de6)
cwurm pushed a commit that referenced this pull request Apr 5, 2019
…qEnable (#11665)

Cherry-pick of PR #11628 to 6.7 branch. Original message: 

Overrides the `rpmsqEnable` function in Librpm that sets and unsets signal traps. Hopefully, this will eliminate the residual test failures.
cwurm pushed a commit that referenced this pull request Apr 7, 2019
Cherry-pick of PR #11634 to 7.0 branch. Original message: 

Both openSUSE and SLES use RPM under the hood, so we can use the code we already have for CentOS/Fedora.

Depends on #11628.

Fixes elastic/beats-tester#115.
cwurm pushed a commit that referenced this pull request Apr 7, 2019
Cherry-pick of PR #11634 to 6.7 branch. Original message: 

Both openSUSE and SLES use RPM under the hood, so we can use the code we already have for CentOS/Fedora.

Depends on #11628.

Fixes elastic/beats-tester#115.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants