Makefile: Build agent with tag seccomp to support seccomp #353

nitkon · 2018-09-03T03:59:54Z

In order to get runc/libcontainer/seccomp/seccomp_linux.go
built in, build the agent with the seccomp tag.

Fixes: #104

Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>

jodh-intel · 2018-09-03T09:08:39Z

Hi @nitkon - thanks for raising!

The CIs are failing:

/usr/bin/kata-agent: error while loading shared libraries: libseccomp.so.2: cannot open shared object file: No such file or directory

The problem is that https://github.com/kata-containers/tests/blob/master/.ci/install_kata_image.sh is building a new image using this PR, but the (Clear Linux) mini-os doesn't provide that library by default.

As such, I think you'll need to add seccomp support to the images in osbuilder before this will pass:

https://github.com/kata-containers/osbuilder/blob/master/rootfs-builder/clearlinux/config.sh#L18

jodh-intel · 2018-09-03T09:09:45Z

Makefile

@@ -31,6 +31,7 @@ COMMIT_NO_SHORT := $(shell git rev-parse --short HEAD 2> /dev/null || true)
 COMMIT := $(if $(shell git status --porcelain --untracked-files=no),${COMMIT_NO}-dirty,${COMMIT_NO})
 VERSION_COMMIT := $(if $(COMMIT),$(VERSION)-$(COMMIT),$(VERSION))
 ARCH := $(shell go env GOARCH)
+BUILDTAGS := seccomp


It might be useful to build the agent without seccomp (but make it the default) so maybe you could add support for WITH_SECCOMP=no or something.

wdyt @bergwolf, @sboeuf?

It might also be good to add the fact that the agent has been built with seccomp support to the announce fields that are logged on agent startup:

https://github.com/kata-containers/agent/blob/master/agent.go#L610

@jodh-intel : Why would it be useful to make seccomp support optional? What's the use case for turning it off?

Don't get me wrong - I'd like it enabled :)

But some folk may not want it and be happy to use a system like the current one without having to hack the makefiles and osbuilder scripts. It would help to know the performance and memory density impact of using seccomp.

/cc @grahamwhaley.

Indeed, I think we should at least assess if there is any density (footprint) or boot speed impact.
You could try using https://github.com/kata-containers/tests/tree/master/metrics/report to do a before/after comparison to see if there is any measurable (and repeatable) difference.
If there is, then maybe we will discuss having two agent versions.
If there is not, we would probably just go ahead and enable it.

(Traditionally we have just moved ahead and enabled things, but that does mean that over time we suffer 'death of a thousand cuts' and our footprint and boot time slowly creep up and up... we should make these conscious decisions ;-) )

nitkon · 2018-09-03T09:12:26Z

Hi @jodh-intel ,
Does this PR help?

jodh-intel · 2018-09-03T09:39:08Z

Thanks @nitkon.

nitkon · 2018-09-03T13:40:10Z

I am running the tests with and without the seccomp PR on my local machine. The second run is painfully slow (it has been running for 3 hours); I will update with the results. @grahamwhaley @jodh-intel

===== starting test [boot times] =====
command: docker: yes
docker pull'ing: ubuntu
Using default tag: latest
latest: Pulling from library/ubuntu
Digest: sha256:72f832c6184b55569be1cd9043e4a80055d55873417ea792d989441f207dd2c7
Status: Image is up to date for ubuntu:latest
docker pull'd: ubuntu
.....
 run 74
 run 75
 run 76
 run 77
 run 78
 run 79
 run 80
 run 81


 run 82
 run 83
 run 84
 run 85
 run 86
 run 87
 run 88

grahamwhaley · 2018-09-03T14:01:14Z

@nitkon ouch! That boot time test just launches date in a container that instantly quits, 100 times. You can trivially run that one test by hand - from https://github.com/kata-containers/tests/blob/master/metrics/report/grabdata.sh#L76-L80:

bash time/launch_times.sh -i ubuntu -n 100

Is it the seccomp version that is running slow? (That is of course my presumption.) It may be worth a quick test from the command line, something like:

$ time docker run --rm -ti --runtime=kata-runtime ubuntu uname -a

and seeing if that has a large run/quit time for instance. Then we can diagnose from there.

jodh-intel · 2018-09-03T14:10:42Z

"security through inactivity" :-)

nitkon · 2018-09-03T14:19:16Z

On the contrary, since I already had the patch applied, I ran those tests quickly and generated the *json files. I was then trying without my patch, and with an initrd image without seccomp support, when it became slow!
I have rebooted my system, run docker prune, and am trying again.

grahamwhaley · 2018-09-03T14:24:26Z

:-) thanks for the confirmation @nitkon - we should try and figure out what is causing it. The only time in the past when I have seen huge slowdowns on tests is either:

we have left many qemu instances running, which seem to 'clog up' things
we are running very short of system memory (then the system thrashes trying to find space for the VM).
and maybe if there are many many redundant/stale files and directories down in the /var/vc type directories, as vc tries to parse them.

nitkon · 2018-09-03T14:52:14Z

@grahamwhaley : I am running on my x86 laptop, as I do not have an x86 server. It has 8 GB RAM and little free space on the hard disk. That is probably the reason the tests are running very slowly.

grahamwhaley · 2018-09-03T14:57:48Z

Ah, pretty small machine @nitkon ;-) Over in kata-containers/tests#650 I added the ability to just run subsets of tests to grabdata.sh (the PR is not merged yet). You could then narrow to density (-d) and time (-t), but it would still run the same number of tests, so probably still slow on your machine :-( You might have to modify the script to run fewer tests - the report generator should still handle that (it may generate error text on the pages that do not have data available, but it should fail moderately gracefully and still generate a pdf file).
Otherwise, maybe we can find somebody else to run the test for you...

nitkon · 2018-09-03T15:24:10Z

@grahamwhaley @jodh-intel : It's slowing my system on my second attempt as well. Meanwhile, I will check how to run a smaller number of tests. However, if anyone has access to an x86 server, could they run the tests and check the memory/boot footprint? @ydjain ?

nitkon · 2018-09-04T08:25:43Z

@jodh-intel @grahamwhaley : Here is the generated metrics report. Thanks to @ydjainopensource for running the tests on his 16 GB x86 machine.
metrics_report.pdf

grahamwhaley · 2018-09-04T10:43:00Z

@nitkon @ydjainopensource - thanks for the report!
There does seem to be something quite odd going on with those results. @ydjainopensource, can I check what/how you ran those (the machine config etc...). For instance:

On the PSS footprint (page3). There is a huge difference between the footprints - like without_seccomp is double-ish. That doesn't feel right. Also, the % row looks very wrong - that is likely me ;-), so I will go check on that...
On the scaling (page4), the busybox and mysql lines look 'mixed up' - I suspect this is also a symptom of some odd memory consumption going on in the test setup.

Looking at the last page, there is no 'image version' listed - would you happen to be running with an initrd setup? I've not tried the report with one of those.
I also note the runtime version is different for the with_seccomp run - maybe that is a necessary change to check this PR - can you confirm/detail pls?

I'll see if I can get to try and re-create this later on - something needs debugging, quite possibly in the report generation (maybe there are assumptions that will not be true for all users etc.).

I guess I should note also, that the reportgen grabdata should be run on a 'quiet' system - that is, don't run it in the background and carry on doing other stuff - as that will potentially throw out the results.

ghost · 2018-09-04T11:01:19Z

@grahamwhaley,

I have the following machine config :
i7-6700K, 16 GB RAM, 1 TB 7200 rpm SATA, Ubuntu 16.04
I checked-out nitkon's branches adding seccomp support to runtime, agent and osbuilder.
I used the fedora initrd with agent_init.
I couldn't run anything else while the tests were going on. So disturbances are unlikely. I however, did notice a considerable lag even after the tests were complete. I think this was due to majority of the other background process pages being moved to swap. I did reboot my system after the first run......not sure, if this would affect it this much.

Will turning swap off and rerunning the tests solve this issue?

grahamwhaley · 2018-09-04T12:43:29Z

Hi @ydjainopensource
Having swap enabled could affect some of the tests as the system tries to swap out pages to allocate new VMs.
The boot time test should be fairly immune, as it only tries to launch a single transient container at a time. The scaling test I don't think will be affected by swap, apart from the length of time it takes to run the test. But, having a (for want of a better phrase) 'dirty' system and then rebooting to a clean one could have an effect on the footprint tests if there were maybe some parts of previous kata containers left around running (which does happen sometimes during development).

I guess we need to make a note on the report docs that ideally the test would be run on a fresh clean system every time.

Let me know if you're able to do the runs again. I'll let you know if I find a slot (but I don't generally have an initrd set up, so that's going to take me a little bit to get going as well).

ghost · 2018-09-04T12:45:22Z

@grahamwhaley No issues. I will reboot my system and try running the tests again later today.

ghost · 2018-09-04T15:51:04Z

I have rerun the script; here are the results.
metrics_report.pdf
metrics_report_all_four.pdf
@grahamwhaley PTAL

grahamwhaley · 2018-09-04T16:07:19Z

Thanks @ydjainopensource - much appreciated.
To me, that new run (just the two new runs) looks much more sane.
From that, looking from the pov of adding seccomp:

page3 - PSS footprint has gone up ~80%
- (I really think I need to go fix the % row math there btw...)
page4 - system container density has dropped quite dramatically (which correlates with the PSS footprint growth).
- either visually compare the like-for-like lines (so busybox-vs-busybox etc.), or read the containers-per-Gb numbers from the table
page5 - boot time to the workload has gone up ~14% or so - and it seems the hit has come from the 'in kernel' time - a jump from about .3s to .6s
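For anyone re-deriving the percentage figures above by hand, here is a minimal sketch of the relative-change arithmetic. The helper name `pct_change` is hypothetical (it is not part of the metrics report scripts), and the 0.3 and 0.6 inputs are just the approximate in-kernel times quoted above.

```shell
#!/bin/sh
# Hypothetical helper: relative change from "before" to "after", in percent.
pct_change() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

# Approximate in-kernel boot time quoted above: ~0.3s without, ~0.6s with seccomp.
pct_change 0.3 0.6    # prints 100.0
```

Read this way, the in-kernel portion roughly doubles even though the end-to-end boot time to the workload rises by only about 14%.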

I think that has given us the data to:

kick off a discussion if we are happy to take the hits to enable seccomp (they are fairly hefty hits I think)
- maybe an option is to not have these enabled by default and/or provide multiple sets of kernels/images - which we have discussed before for other things, but normally dismiss due to the extra complexity it adds to all layers of test/deploy/debug/maintain etc.
get somebody else to run the tests on a different system to confirm (I will, but it might have to be next week I'm afraid, I've got some things to wrap up and then am OOO for the back end of the week).

/cc @egernst @sboeuf @bergwolf et. al.

sboeuf · 2018-09-04T18:27:45Z

@grahamwhaley those numbers are pretty high and they're not fitting the current trend where we try to optimize and improve boot time and footprint.
I am not against seccomp, and if a user absolutely wants to use it, they should be able to do so. But this definitely needs to be configurable from the runtime side. And from the agent side, because this means the image is different, I guess we would need several images, which I don't like either...

Can we discuss the importance of seccomp support? I think a drop in performance can only be acceptable if the security is obviously improved.

nitkon · 2018-09-05T11:07:51Z

So what's the conclusion here? Would seccomp not be enabled by default? /cc @sboeuf @grahamwhaley @jodh-intel
Then should we make the rootfs configurable like USE_SECCOMP=true and then only include
libseccomp package in the image? Also, the agent will be built with seccomp tag if USE_SECCOMP is true.
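A rough sketch of what that proposal could look like on the rootfs side. Everything here is an assumption for illustration: `seccomp_build_plan` is a hypothetical helper, not osbuilder code, and the `USE_SECCOMP=true` spelling simply follows the wording of the comment above.

```shell
#!/bin/sh
# Hypothetical helper, not osbuilder code: maps USE_SECCOMP to the two
# decisions described above (extra rootfs package, agent build tag).
seccomp_build_plan() {
    use_seccomp="${1:-false}"    # unset or empty means disabled
    if [ "$use_seccomp" = "true" ]; then
        echo "packages=libseccomp tags=seccomp"
    else
        echo "packages= tags="
    fi
}

# Default build: nothing extra is installed and no build tag is set.
seccomp_build_plan "${USE_SECCOMP:-}"
```

Keeping both decisions behind one switch means the image and the agent cannot disagree about whether seccomp is enabled.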

ghost · 2018-09-05T11:32:48Z

@nitkon I think we should add seccomp but not make it the default. As you suggested, anyone who wants it can export a global variable, rebuild runtime/agent/osbuilder, and be good to go.

This would neither add unnecessary performance penalties nor degrade security.

As for testing, as noted by @jcvenegas here, we have a very busy CI today, so testing every build with/without seccomp sounds unreasonable. Instead we could just test it when making a release.

WDYT?

codecov · 2018-09-05T13:17:25Z

Codecov Report

Merging #353 into master will increase coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #353      +/-   ##
==========================================
+ Coverage   47.21%   47.22%   +0.01%     
==========================================
  Files          15       15              
  Lines        2442     2471      +29     
==========================================
+ Hits         1153     1167      +14     
- Misses       1140     1154      +14     
- Partials      149      150       +1

nitkon · 2018-09-05T13:18:08Z

Updated my PR. Please also look at the osbuilder PR.
@ydjainopensource @jodh-intel

ghost

LGTM.

Just a small question, shouldn't we include individual file changes into it too? I mean it makes more sense for someone viewing the history to know all the files changed for adding seccomp support. WDYT @jodh-intel ?

marcov · 2018-09-05T15:02:27Z

If seccomp is not supported, I think this should be well documented in the current limitations. I agree that running in a VM makes seccomp a bit less important, but still the user should be well aware of the limitation.

Also WDYT about documenting how to enable seccomp support in documentation? Cause this somehow affects:

kernel
rootfs
agent

jodh-intel · 2018-09-06T08:47:04Z

If someone wants to pick up the limitations aspect of this...

Document that seccomp support is not currently available documentation#241

nitkon · 2018-09-06T09:09:53Z

Hi @jodh-intel, I shall be picking this up. Will send a PR soon.
Hi @marcov , I shall be documenting how to enable seccomp once the PR to enable seccomp gets merged.

jodh-intel · 2018-09-06T09:17:34Z

Thanks @nitkon! 😄

sboeuf · 2018-09-10T16:26:44Z

Makefile

@@ -31,6 +31,9 @@ COMMIT_NO_SHORT := $(shell git rev-parse --short HEAD 2> /dev/null || true)
 COMMIT := $(if $(shell git status --porcelain --untracked-files=no),${COMMIT_NO}-dirty,${COMMIT_NO})
 VERSION_COMMIT := $(if $(COMMIT),$(VERSION)-$(COMMIT),$(VERSION))
 ARCH := $(shell go env GOARCH)
+ifeq ($(SECCOMP),yes)
+BUILDTAGS := seccomp
+endif


The same way we have INIT := no by default, please make sure we have SECCOMP := no too.
This way, we can merge this PR, but leaving the default way of building the agent the way it is.

@jodh-intel @sboeuf : Sorry, I didn't understand the comment completely.

SECCOMP=yes will be passed when making a local rootfs image. If we use SECCOMP := no in this Makefile, just like INIT, it will override that value, so the variable passed in will have no effect.

We can set SECCOMP to no only if it's not set, but then the agent would still be built the same way:

go build -tags "" -o kata-agent -ldflags "-X main.version=1.2.0-5765593309eab087598c093ff226500af158df9c-dirty"

ifeq ($(SECCOMP),)
SECCOMP := no
endif

Probably I am missing something.

Actually, I'm a bit confused now too as your code will work as we want currently.

I think @sboeuf might simply be suggesting that for consistency with the INIT option that you supplement your change with an explicit:

# Set to 'yes' to build agent with seccomp support
SECCOMP := no

@jodh-intel : If I set SECCOMP := no in the Makefile like INIT, it would override what the user passed as an environment variable when building the rootfs.

Probably this is what he means? To set it to "no" if not set by the user?
https://github.com/kata-containers/osbuilder/blob/93ad0491ef6d3d0b9d20956071b6b39d0a787b4d/rootfs-builder/rootfs.sh#L16

@nitkon: variables passed from command line have priority over definitions inside Makefiles: https://www.gnu.org/software/make/manual/make.html#Override-Directive

@marcov : Oh, I was thinking the other way round. Now it makes sense. Updated my PR. :-D

The only time this won't work is if the user wants to enable seccomp and tries to flag that by setting an actual environment variable:

$ export SECCOMP="yes"
$ make # SECCOMP will still be "no" here.

@jodh-intel: Yea, I had tried it the same way as you mentioned and so thought it to be the other way round. :-)
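The precedence rules discussed in this thread can be checked with a throwaway Makefile. This is only a sketch, assuming GNU make is installed; the `show` target and the temporary directory are made up for the demonstration and are not part of the agent's Makefile.

```shell
#!/bin/sh
# Sketch: how GNU make resolves SECCOMP when the Makefile says "SECCOMP := no".
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Throwaway Makefile (not the agent's real one).
printf '%s\n' \
    'SECCOMP := no' \
    'ifeq ($(SECCOMP),yes)' \
    'BUILDTAGS := seccomp' \
    'endif' \
    'show:' > "$workdir/Makefile"
# Recipe lines must start with a tab, so emit it explicitly.
printf '\t@echo "SECCOMP=$(SECCOMP) BUILDTAGS=$(BUILDTAGS)"\n' >> "$workdir/Makefile"

# Start from a clean environment for the two variables under test.
unset SECCOMP BUILDTAGS

# 1. Nothing passed: the Makefile default applies.
default_out=$(make -s -f "$workdir/Makefile" show)
# 2. Variable given on the make command line: overrides the Makefile.
cmdline_out=$(make -s -f "$workdir/Makefile" show SECCOMP=yes)
# 3. Variable given only through the environment: the Makefile wins.
env_out=$(SECCOMP=yes make -s -f "$workdir/Makefile" show)

echo "default:      $default_out"    # SECCOMP=no BUILDTAGS=
echo "command line: $cmdline_out"    # SECCOMP=yes BUILDTAGS=seccomp
echo "environment:  $env_out"        # SECCOMP=no BUILDTAGS=
```

If honouring an exported environment variable were ever wanted, one option would be `SECCOMP ?= no`, since `?=` only assigns when the variable is not already defined and environment variables count as defined.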

jodh-intel · 2018-09-11T13:47:23Z

lgtm

jodh-intel reviewed Sep 3, 2018

jodh-intel mentioned this pull request Sep 3, 2018

rootfs: Include libseccomp support in rootfs kata-containers/osbuilder#154

Merged

jodh-intel mentioned this pull request Sep 3, 2018

virtcontainers: Pass seccomp profile inside VM kata-containers/runtime#689

Merged

ghost approved these changes Sep 5, 2018

grahamwhaley mentioned this pull request Sep 5, 2018

metrics: report: % diff row seems to calculate incorrectly kata-containers/tests#709

Closed

sboeuf suggested changes Sep 10, 2018

Makefile: Conditionally build agent with tag seccomp

00a5588

In order to get runc/libcontainer/seccomp/seccomp_linux.go built in, build the agent with the seccomp tag.

Fixes: #104

Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>

sboeuf approved these changes Sep 11, 2018

sboeuf merged commit fd28fe4 into kata-containers:master Sep 11, 2018

grahamwhaley mentioned this pull request Nov 6, 2018

seccomp: initrd: failing to get initrd seccomp images to boot kata-containers/osbuilder#194

Closed
