Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cluster/gce: Add env var to enable apiserver basic audit log. #41211

Merged
merged 1 commit into from
Feb 15, 2017

Conversation

enisoc
Copy link
Member

@enisoc enisoc commented Feb 9, 2017

For now, this is focused on a fixed set of flags that makes the audit
log show up under /var/log/kube-apiserver-audit.log and behave similarly
to /var/log/kube-apiserver.log. Allowing other customization would
require significantly more complex changes.

Audit log rotation is handled the same as for kube-apiserver.log.

What this PR does / why we need it:

Add a knob to enable basic audit logging in GCE.

Which issue this PR fixes:

Special notes for your reviewer:

We would like to cherrypick/port this to release-1.5 also.

Release note:

The kube-apiserver [basic audit log](https://kubernetes.io/docs/admin/audit/) can be enabled in GCE by exporting the environment variable `ENABLE_APISERVER_BASIC_AUDIT=true` before running `cluster/kube-up.sh`. This will log to `/var/log/kube-apiserver-audit.log` and use the same `logrotate` settings as `/var/log/kube-apiserver.log`.

@enisoc enisoc added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Feb 9, 2017
@enisoc enisoc requested a review from cjcullen February 9, 2017 19:33
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 9, 2017
@k8s-github-robot k8s-github-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 9, 2017
@k8s-reviewable
Copy link

This change is Reviewable

# Lumberjack doesn't offer any way to disable size-based rotation. It also
# has an in-memory counter that doesn't notice if you truncate the file.
# 1000000000 (MiB) is a large number that fits in 31 bits. If the log grows
# by 1MiB/s, it will rotate after ~31 years if apiserver never restarts.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like these logs are about 300B/request, so 1MiB/s translates into ~3500 Req/s. That seems reasonable to me, but could we be spammy enough to bring this into the O(days) category? I think it would fill up in 1 day at ~40M req/s, which is almost certainly higher than any one machine can handle, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

40M QPS does seem higher than you'd reasonably expect. At 1.3M QPS, you could hit the limit in 30 days. Even that seems rather high for a kube-apiserver though. There is room in the 31 bits to double this limit if you want. I only chose 1000000000 so it's obvious to humans that it's an arbitrary large number. 2000000000 would also have mostly the same effect.

The long-term fix is to add a new flag or special value for --audit-log-maxsize that makes it bypass Lumberjack and write directly to a file. However, I was hoping to avoid touching anything in the actual binary to make this change minimally invasive. Maybe I can do that as a separate PR on master only after we get this one in and cherrypicked into 1.5?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you don't configure logrotate for this log? That would solve this comment, I think.

I'm actually pretty concerned with anything log-like that can grow unbounded.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is to disable the audit log feature's built-in log rotation (Lumberjack) so we can use the same logrotate facility we already have for kube-apiserver.log.

The current logrotate settings rotate everything matching the wildcard /var/log/*.log so this new /var/log/kube-apiserver-audit.log is included. I verified that it rotates as expected.

https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/gci/configure-helper.sh#L124

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait I just realized the container-linux version doesn't share the logrotate config. In that case, I think I should revert my change from container-linux and only update GCI since that's what I am able to test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, GCI/COS is covered. I was looking at the Salt config, which looks like it needs to be updated? https://github.com/kubernetes/kubernetes/blob/master/cluster/saltbase/salt/logrotate/init.sls

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so that's where logrotate is configured for non-GCI. I've pushed an update to that file. Thanks!

@enisoc enisoc force-pushed the configure-audit-log branch 2 times, most recently from 9ab9f37 to 06533af Compare February 10, 2017 19:34
@zmerlynn
Copy link
Member

This more or less LGTM, except the comment above.

Copy link
Contributor

@eparis eparis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like most of these things are using 'defined' to mean true instead of needing to have the exact magic value/string "true". but I'm ok with this if you really want it this way.

# We currently only support enabling with a fixed path and with built-in log
# rotation "disabled" (large value) so it behaves like kube-apiserver.log.
# External log rotation should be set up the same as for kube-apiserver.log.
params="${params} --audit-log-path=/var/log/kube-apiserver-audit.log"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it wrong that I like the params += style you used the other 2 times. And is sometimes used in this file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like += better too, but I wanted to follow the style used within each file so it's self-consistent.

I'm actually thinking about reverting my change to the trusty and container-linux versions because I don't know how log rotation works on those and updating those other flavors is beyond the scope of what I'm trying to accomplish (getting this working for GCE w/ GCI).

@eparis
Copy link
Contributor

eparis commented Feb 14, 2017

/approve

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 14, 2017
@enisoc
Copy link
Member Author

enisoc commented Feb 14, 2017

I feel like most of these things are using 'defined' to mean true instead of needing to have the exact magic value/string "true". but I'm ok with this if you really want it this way.

I preferred to require "true" because other env vars like ENABLE_GARBAGE_COLLECTOR have set the precedent that you can set it to "false" to disable it. That works in those cases because the value is passed to a boolean command-line flag if non-empty. In this case, there is no boolean command-line flag to enable/disable audit logging, so I would have to explicitly check for "false" to avoid enabling the log.

Given the choice between checking for "true" and checking for "false", I chose "true" because I saw many examples of doing that: ENABLE_MANIFEST_URL, ENABLE_CLUSTER_AUTOSCALER, etc.

@zmerlynn
Copy link
Member

Yes, please check for true. I don't know why there are other styles in here at all, but e.g. GKE tends to just pass FOO={true,false} in the kube-env.

For now, this is focused on a fixed set of flags that makes the audit
log show up under /var/log/kube-apiserver-audit.log and behave similarly
to /var/log/kube-apiserver.log. Allowing other customization would
require significantly more complex changes.

Audit log rotation is handled externally by the wildcard /var/log/*.log
already configured in configure-helper.sh.
@enisoc enisoc force-pushed the configure-audit-log branch from 06533af to 7500746 Compare February 14, 2017 23:18
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

The following people have approved this PR: enisoc, eparis

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@zmerlynn
Copy link
Member

/lgtm

@k8s-ci-robot
Copy link
Contributor

@zmerlynn: you can't LGTM a PR unless you are an assignee.

In response to this comment:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@zmerlynn zmerlynn self-assigned this Feb 14, 2017
@zmerlynn
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 14, 2017
@zmerlynn
Copy link
Member

@k8s-bot cvm gke e2e test this

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 40297, 41285, 41211, 41243, 39735)

@k8s-github-robot k8s-github-robot merged commit 4c02f29 into kubernetes:master Feb 15, 2017
k8s-github-robot pushed a commit that referenced this pull request Feb 15, 2017
Automatic merge from submit-queue

fluentd-gcp: Add kube-apiserver-audit.log.

**What this PR does / why we need it**:

Add `kube-apiserver-audit.log` from #41211 to fluentd config, so the audit log gets sent to the same place as `kube-apiserver.log`.

**Which issue this PR fixes**:

**Special notes for your reviewer**:

We would like to backport this to release-1.5 also.

**Release note**:
```release-note
The apiserver audit log (`/var/log/kube-apiserver-audit.log`) will be sent through fluentd if enabled.
```
@enisoc enisoc added this to the v1.5 milestone Feb 15, 2017
k8s-github-robot pushed a commit that referenced this pull request Feb 24, 2017
…-upstream-release-1.5

Automatic merge from submit-queue

Automated cherry pick of #41211

Cherry pick of #41211 on release-1.5.

#41211: cluster/gce: Add env var to enable apiserver basic audit log.
@k8s-cherrypick-bot
Copy link

Commit found in the "release-1.5" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

@enisoc enisoc deleted the configure-audit-log branch March 27, 2017 18:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants