-
Notifications
You must be signed in to change notification settings - Fork 40.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cluster/gce: Add env var to enable apiserver basic audit log. #41211
cluster/gce: Add env var to enable apiserver basic audit log. #41211
Conversation
# Lumberjack doesn't offer any way to disable size-based rotation. It also | ||
# has an in-memory counter that doesn't notice if you truncate the file. | ||
# 1000000000 (MiB) is a large number that fits in 31 bits. If the log grows | ||
# by 1MiB/s, it will rotate after ~31 years if apiserver never restarts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like these logs are about 300B/request, so 1MiB/s translates into ~3500 Req/s. That seems reasonable to me, but could we be spammy enough to bring this into the O(days) category? I think it would fill up in 1 day at ~40M req/s, which is almost certainly higher than any one machine can handle, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
40M QPS does seem higher than you'd reasonably expect. At 1.3M QPS, you could hit the limit in 30 days. Even that seems rather high for a kube-apiserver though. There is room in the 31 bits to double this limit if you want. I only chose 1000000000 so it's obvious to humans that it's an arbitrary large number. 2000000000 would also have mostly the same effect.
The long-term fix is to add a new flag or special value for --audit-log-maxsize
that makes it bypass Lumberjack and write directly to a file. However, I was hoping to avoid touching anything in the actual binary to make this change minimally invasive. Maybe I can do that as a separate PR on master only after we get this one in and cherrypicked into 1.5?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you don't configure logrotate for this log? That would solve this comment, I think.
I'm actually pretty concerned with anything log-like that can grow unbounded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea here is to disable the audit log feature's built-in log rotation (Lumberjack) so we can use the same logrotate
facility we already have for kube-apiserver.log
.
The current logrotate
settings rotate everything matching the wildcard /var/log/*.log
so this new /var/log/kube-apiserver-audit.log
is included. I verified that it rotates as expected.
https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/gci/configure-helper.sh#L124
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait I just realized the container-linux version doesn't share the logrotate config. In that case, I think I should revert my change from container-linux and only update GCI since that's what I am able to test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, GCI/COS is covered. I was looking at the Salt config, which looks like it needs to be updated? https://github.com/kubernetes/kubernetes/blob/master/cluster/saltbase/salt/logrotate/init.sls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah so that's where logrotate is configured for non-GCI. I've pushed an update to that file. Thanks!
9ab9f37
to
06533af
Compare
This more or less LGTM, except the comment above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like most of these things are using 'defined' to mean true instead of needing to have the exact magic value/string "true". but I'm ok with this if you really want it this way.
# We currently only support enabling with a fixed path and with built-in log | ||
# rotation "disabled" (large value) so it behaves like kube-apiserver.log. | ||
# External log rotation should be set up the same as for kube-apiserver.log. | ||
params="${params} --audit-log-path=/var/log/kube-apiserver-audit.log" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it wrong that I like the params +=
style you used the other 2 times. And is sometimes used in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like +=
better too, but I wanted to follow the style used within each file so it's self-consistent.
I'm actually thinking about reverting my change to the trusty
and container-linux
versions because I don't know how log rotation works on those and updating those other flavors is beyond the scope of what I'm trying to accomplish (getting this working for GCE w/ GCI).
/approve |
I preferred to require "true" because other env vars like Given the choice between checking for "true" and checking for "false", I chose "true" because I saw many examples of doing that: ENABLE_MANIFEST_URL, ENABLE_CLUSTER_AUTOSCALER, etc. |
Yes, please check for |
For now, this is focused on a fixed set of flags that makes the audit log show up under /var/log/kube-apiserver-audit.log and behave similarly to /var/log/kube-apiserver.log. Allowing other customization would require significantly more complex changes. Audit log rotation is handled externally by the wildcard /var/log/*.log already configured in configure-helper.sh.
06533af
to
7500746
Compare
[APPROVALNOTIFIER] This PR is APPROVED The following people have approved this PR: enisoc, eparis Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
/lgtm |
@zmerlynn: you can't LGTM a PR unless you are an assignee. In response to this comment:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/lgtm |
@k8s-bot cvm gke e2e test this |
Automatic merge from submit-queue (batch tested with PRs 40297, 41285, 41211, 41243, 39735) |
Automatic merge from submit-queue fluentd-gcp: Add kube-apiserver-audit.log. **What this PR does / why we need it**: Add `kube-apiserver-audit.log` from #41211 to fluentd config, so the audit log gets sent to the same place as `kube-apiserver.log`. **Which issue this PR fixes**: **Special notes for your reviewer**: We would like to backport this to release-1.5 also. **Release note**: ```release-note The apiserver audit log (`/var/log/kube-apiserver-audit.log`) will be sent through fluentd if enabled. ```
Commit found in the "release-1.5" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked. |
For now, this is focused on a fixed set of flags that makes the audit
log show up under /var/log/kube-apiserver-audit.log and behave similarly
to /var/log/kube-apiserver.log. Allowing other customization would
require significantly more complex changes.
Audit log rotation is handled the same as for
kube-apiserver.log
.What this PR does / why we need it:
Add a knob to enable basic audit logging in GCE.
Which issue this PR fixes:
Special notes for your reviewer:
We would like to cherrypick/port this to release-1.5 also.
Release note: