Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: Improve seccomp format to be more expressive #657

Merged
merged 1 commit into from
Feb 25, 2017

Conversation

grantseltzer
Copy link
Contributor

I'm opening this to continue this discussion about improving the seccomp specification. @mrunalp @runcom @justincormack

While of course runtime-spec shouldn't make changes with specific projects in mind, it's worth noting that this format is currently being used in cri-o and docker.

The change proposes that Syscalls be grouped together in the case that their listings be exactly the same. Conditionals can be expressed on architectures and capabilities also. Comments are a nice feature for dev-ops teams as well.

There's inconsistencies (example) in the use of seccomp specifications across oci projects, I think heading in the direction of this improvement can help standardize adoption of the spec.

See the following example of this format:

{
    "archMap": [{
        "architecture": "SCMP_ARCH_X86_64",
        "subArchitectures": [
            "SCMP_ARCH_X86",
            "SCMP_ARCH_X32"
        ]
    }],
    "syscalls": [{
        "names": [
            "accept",
            "accept4",
            "access",
            "alarm",
            "alarm",
            "bind",
            "brk",
            "capget",
            "capset",
            "chdir",
            "chmod"
        ],
        "action": "SCMP_ACT_ALLOW",
        "args": [],
        "comment": "this is a comment!",
        "includes": {},
        "excludes": {}
    }, {
        "names": [
            "iopl",
            "ioperm"
        ],
        "action": "SCMP_ACT_ALLOW",
        "args": [],
        "comment": "",
        "includes": {
            "caps": [
                "CAP_SYS_RAWIO"
            ]
        },
        "excludes": {}
    }, {
        "names": [
            "clone"
        ],
        "action": "SCMP_ACT_ALLOW",
        "args": [{
            "index": 1,
            "value": 2080505856,
            "valueTwo": 0,
            "op": "SCMP_CMP_MASKED_EQ"
        }],
        "comment": "",
        "includes": {
            "arches": [
                "s390",
                "s390x"
            ]
        },
        "excludes": {
            "caps": [
                "CAP_SYS_ADMIN"
            ]
        }
    }]
}

Signed-off-by: grantseltzer grantseltzer@gmail.com

@justincormack
Copy link
Contributor

Sorry for not making a proposal yet...

This format is definitely better, and was deliberately designed to be fully backward compatible.

However, it is still impossible to express many valid seccomp programs using this format.

See seccomp/libseccomp#44 for a discussion of some of the issues.

One option would be to allow {"program":"base64 encoded bytes of bpf code"} but it is not terribly human readable.

@grantseltzer
Copy link
Contributor Author

grantseltzer commented Jan 16, 2017

One option would be to allow {"program":"base64 encoded bytes of bpf code"} but it is not terribly human readable.

Can you elaborate on where that bit would go in the specification?

I read through the thread you linked, is the issue with seccomp itself or the programs that are generating/reading the seccomp profiles?

@grantseltzer grantseltzer changed the title [Don't Merge] config: Improve seccomp format to be more expressive config: Improve seccomp format to be more expressive Jan 16, 2017
@justincormack
Copy link
Contributor

That would be an option to replace the entire seccomp part of the specification. The problem is with the JSON specification of the profiles which is defined by libseccomp, not with seccomp itself. In particular you cannot write rules that say "these values for this syscall are ok, but these ones are not".

@grantseltzer
Copy link
Contributor Author

grantseltzer commented Jan 17, 2017

So you're saying something like this doesn't work?

 "syscalls": [{
        "names": [
            "syscallA"
        ],
        "action": "SCMP_ACT_ALLOW",
        "args": [],
        "comment": "syscallA with no arguments, allowed",
        "includes": {},
        "excludes": {}
    }, {
        "names": [
            "syscallA"
        ],
        "action": "SCMP_ACT_ERRNO",
        "args": ["fakeArg","otherFakeArg"],
        "comment": "syscallA with some specific args, not allowed",
        "includes": {},
        "excludes": {}
    }
]}

@mheon
Copy link
Contributor

mheon commented Jan 17, 2017

There are certain types of filters that can't be expressed via libseccomp's current syntax - the most common one would be embedding a blacklist within a whitelist (or vice versa), as is mentioned in the libseccomp issue @justincormack linked.

There may be other ways around this aside from directly inputting user-specified BPF (layered Seccomp filters, for example). I'd say this is an issue for another time, though. This is a definite improvement over the existing filter syntax, and is already being used by Docker in the wild. We should update the spec to match what's actually being used before we consider changing it again.

// that already use the old seccomp profile.
Architectures []Arch `json:"architectures,omitempty"`
ArchMap []Architecture `json:"archMap,omitempty"`
Syscalls []*LinuxSyscall `json:"syscalls"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the values becoming pointers here? Is there a reason you'd want an explicit null in a syscalls array?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For backwards compatibility with the old format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yes. Hmm, maybe @runcom remembers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason behind those being pointers. When I refactored the seccomp profile in docker I just kept the pointers from their previous version (https://github.com/docker/docker/pull/26200/files#diff-05cb14d3dc2005d727f9d28cd0daece7L7)

Copy link
Contributor Author

@grantseltzer grantseltzer Feb 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of an advantage to having them as pointers unless we're worried about it affecting backwards compatibility but I don't see it as an issue. As @wking pointed out, there's no time we'd want an explicit null.

// Architectures is kept to maintain backward compatibility for projects
// that already use the old seccomp profile.
Architectures []Arch `json:"architectures,omitempty"`
ArchMap []Architecture `json:"archMap,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

archMap is a strange name for a property that is not an object/map. Maybe call the new setting architectures2? Or take advantage of the pre-1.0 lack of backwards compat (SemVer wording for pre-releases in §9) and just use architectures for the new field. I'm not a maintainer, but my personal preference is for the latter, since then we don't have to define how to handle configs that set both properties simultaneously.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it maps one architecture to others, I guess it might be confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ArchMap threw me off at first as well. Architectures2 doesn't really give it any meaning, and I'd prefer not to change the old field. ArchAndSubArches? ArchHierarchy?

Copy link
Contributor

@wking wking Jan 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Architectures2 doesn't really give it any meaning...

Like dup2 (and 3 and 4) the meaning is "This is what we should have done for architectures{n-1}, but we aren't changing that because we're preserving backwards compat". But if architectures was poorly named to begin with (I don't know), picking a better name now makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we implement this change, would there be precedence to eventually remove the old Architectures? As in give a few releases warning that it'll be deprecated? If so that could change how we choose to name archMap

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we implement this change, would there be precedence to eventually remove the old Architectures? As in give a few releases warning that it'll be deprecated?

Removing or changing semantics for currently-valid values for existing property requires a major version bump. Of course, it doesn't have to be the next major version bump, but given the pace of the spec I don't see a reason to give a two+ major window.

And yeah, if we pick a different name it will look less silly once we drop the original. My preference is still to drop the original now (before we cut 1.0), but if we intend to drop the original when we cut 2.0 that would be a reason to not choose architectures2.

@grantseltzer
Copy link
Contributor Author

@mrunalp PTAL

@crosbymichael crosbymichael added this to the 1.0.0 milestone Feb 15, 2017
@crosbymichael
Copy link
Member

@justincormack do you have any feedback on this? We think a change like this would be good to have in 1.0 as it greatly reduces the size of the spec and its a much nicer output.

I think BPF support can come at a later date as its more additive than a schema change

Copy link
Member

@crosbymichael crosbymichael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the simple format, I just have a couple of questions on some of the fields.

Name string `json:"name"`
Action LinuxSeccompAction `json:"action"`
Args []LinuxSeccompArg `json:"args,omitempty"`
Name string `json:"name,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have a Name field here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For backwards compatibility. This is another example of the conversation above, we have to decide at what point should we break previous versions, if at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its already broken with the change so we can remove it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@crosbymichael we managed to make the format fully backward compatible when we changed it in Docker. We didnt deprecate anything, but I guess we can fix it up in Docker.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to deprecate this is docker to get rid of this field

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, removing the Name field

Action Action `json:"action"`
Args []LinuxSyscallArg `json:"args"`
Comment string `json:"comment"`
Includes Filter `json:"includes"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are the includes and excludes meant to be used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conditionals are for arches and capabilities. Something like this:

{
        "names": [
            "clone"
        ],
        "action": "SCMP_ACT_ALLOW",
        "args": [{
            "index": 1,
            "value": 2080505856,
            "valueTwo": 0,
            "op": "SCMP_CMP_MASKED_EQ"
        }],
        "comment": "",
        "includes": {
            "arches": [
                "s390",
                "s390x"
            ]
        },
        "excludes": {
            "caps": [
                "CAP_SYS_ADMIN"
            ]
        }
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont know that the conditionals should be in the spec, I think they should be dealt with upstream.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, i'm not sure if we should specifies these conditional filters. You can handle this in the caller and not add the syscall if you don't want it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justincormack Not sure what you mean by upstream. Wouldn't be want filters that disable syscalls only for certain arguments?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrunalp there are no such things in libseccomp to support these conditionals (includes, excludes), it has to be implemented by the runtime author. Instead of having every runtime handle this, the runtime author should just not add the rule if they don't want it based on setting a cap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, Removing conditionals

// and its sub-architectures
type Architecture struct {
Arch Arch `json:"architecture"`
SubArches []Arch `json:"subArchitectures"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is subarches supposed to be used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this:

    "archMap": [{
        "architecture": "SCMP_ARCH_X86_64",
        "subArchitectures": [
            "SCMP_ARCH_X86",
            "SCMP_ARCH_X32"
        ]
    }],

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what is the usecase for it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use case is that people often want to configure amd64 machines to allow i386 images to run and if you want to do this you must specify, or it will be disallowed. This is mainly here so you can specify all the combinations, so that you can ship a single seccomp profile for multiple architectures, rather than having to package a different one for each architecture.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably not be in the spec either - the fixed form is better, as it is an exact spec.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then should it just be a single field or an array?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justincormack @crosbymichael so "archMap" should just be an array right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont see we need ArchMap. The list in Architectures is sufficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, removing ArchMap

@justincormack
Copy link
Contributor

justincormack commented Feb 16, 2017

@crosbymichael yes, I think it makes the spec more compact and portable so we should have it for 1.0

Edit: I dont think the conditionals should be in it; by the time runc sees it it should be "compiled" to an exact spec.

@justincormack
Copy link
Contributor

So basically the aggregation parts (names) should be there, but not the conditional parts...

@grantseltzer
Copy link
Contributor Author

@crosbymichael @mrunalp @runcom

Made the changes from feedback, ready for further review/discussion

@@ -529,7 +529,19 @@ type LinuxSeccompArg struct {

// LinuxSyscall is used to match a syscall in Seccomp
type LinuxSyscall struct {
Name string `json:"name"`
Action LinuxSeccompAction `json:"action"`
Args []LinuxSeccompArg `json:"args,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You seem to be replacing LinuxSeccompArg with LinuxSyscallArg, but that leaves a dangling LinuxSeccompArg type (at least as of c36c819):

$ git grep LinuxSeccompArg origin/pr/657
origin/pr/657:specs-go/config.go:// LinuxSeccompArg used for matching specific syscall arguments in Seccomp
origin/pr/657:specs-go/config.go:type LinuxSeccompArg struct {

Also, we'll need to update the Markdown spec to cover this change, since there is definitely a schema change going on and the Go types are not normative. The JSON Schema will need updating as well.

$ git show --stat --oneline origin/pr/657
c36c819 improve seccomp format to be more expressive
 specs-go/config.go | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

Copy link
Contributor Author

@grantseltzer grantseltzer Feb 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed that, did not mean for that replacement.

Updating markdown as well.

Name string `json:"name"`
Action LinuxSeccompAction `json:"action"`
Args []LinuxSeccompArg `json:"args,omitempty"`
Names []string `json:"names,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably no longer need an omitempty here now that Name has been removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout

@grantseltzer grantseltzer force-pushed the improve-seccomp-spec branch 2 times, most recently from 3e075ba to b9b5bbe Compare February 22, 2017 23:11
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
@crosbymichael
Copy link
Member

crosbymichael commented Feb 24, 2017

LGTM

Approved with PullApprove

1 similar comment
@mrunalp
Copy link
Contributor

mrunalp commented Feb 25, 2017

LGTM

Approved with PullApprove

@mrunalp mrunalp merged commit ae7a541 into opencontainers:master Feb 25, 2017
@grantseltzer
Copy link
Contributor Author

Thanks for the merge!

@vbatts vbatts mentioned this pull request Mar 6, 2017
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Mar 6, 2017
The:

  "type": [
    "string"
  ]

syntax added in 652323c (improve seccomp format to be more
expressive, 2017-01-13, opencontainers#657) is not valid:

  $ ./validate ./config-schema.json <../config.json
  The document is not valid. see errors :
  - linux.seccomp.syscalls.0.names: Invalid type. Expected: string, given: array

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Apr 7, 2017
It used to have this, but the omitempty was dropped in 652323c
(improve seccomp format to be more expressive, 2017-01-13, opencontainers#657).
However, the docs that landed in 3ca5c6c (config-linux.md: fix
seccomp, 2017-03-02, opencontainers#706) list the property as optional, and if it is
optional, we can leave it unset instead of serializing an empty array.

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Apr 12, 2017
Before this commit, linux.seccomp.sycalls was required, but we didn't
require an entry in the array.  That means '"syscalls": []' would be
technically valid, and I'm pretty sure that's not what we want.

If it makes sense to have a seccomp property that does not need
syscalls entries, then syscalls should be optional (which is what this
commit is doing).

If it does not makes sense to have an empty/unset syscalls then it
should be required and have a minimum length of one.

Before 652323c (improve seccomp format to be more expressive,
2017-01-13, opencontainers#657), syscalls was omitempty (and therefore more
optional-feeling, although there was no real Markdown spec for seccomp
before 3ca5c6c, config-linux.md: fix seccomp, 2017-03-02, opencontainers#706, so
it's hard to know).  This commit has gone with OPTIONAL, because a
seccomp config which only sets defaultAction seems potentially valid.

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Apr 12, 2017
Before this commit, linux.seccomp.sycalls was required, but we didn't
require an entry in the array.  That means '"syscalls": []' would be
technically valid, and I'm pretty sure that's not what we want.

If it makes sense to have a seccomp property that does not need
syscalls entries, then syscalls should be optional (which is what this
commit is doing).

If it does not makes sense to have an empty/unset syscalls then it
should be required and have a minimum length of one.

Before 652323c (improve seccomp format to be more expressive,
2017-01-13, opencontainers#657), syscalls was omitempty (and therefore more
optional-feeling, although there was no real Markdown spec for seccomp
before 3ca5c6c, config-linux.md: fix seccomp, 2017-03-02, opencontainers#706, so
it's hard to know).  This commit has gone with OPTIONAL, because a
seccomp config which only sets defaultAction seems potentially valid.

Also add the previously-missing 'required' property to the seccomp
JSON Schema entry.

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Apr 25, 2017
Before this commit, linux.seccomp.sycalls was required, but we didn't
require an entry in the array.  That means '"syscalls": []' would be
technically valid, and I'm pretty sure that's not what we want.

If it makes sense to have a seccomp property that does not need
syscalls entries, then syscalls should be optional (which is what this
commit is doing).

If it does not makes sense to have an empty/unset syscalls then it
should be required and have a minimum length of one.

Before 652323c (improve seccomp format to be more expressive,
2017-01-13, opencontainers#657), syscalls was omitempty (and therefore more
optional-feeling, although there was no real Markdown spec for seccomp
before 3ca5c6c, config-linux.md: fix seccomp, 2017-03-02, opencontainers#706, so
it's hard to know).  This commit has gone with OPTIONAL, because a
seccomp config which only sets defaultAction seems potentially valid.

The SCMP_ACT_KILL example is prompted by:

On Tue, Apr 25, 2017 at 01:32:26PM -0700, David Lyle wrote [1]:
> Technically, OPTIONAL is the right value, but unless you specify the
> default action for seccomp to be SCMP_ACT_ALLOW the result will be
> an error at run time.
>
> I would suggest an additional clarification to this fact in
> config-linux.md would be very helpful if marking syscall as
> OPTIONAL.

I've phrased the example more conservatively, because I'm not sure
that SCMP_ACT_ALLOW is the only possible value to avoid an error.  For
example, perhaps a SCMP_ACT_TRACE default with an empty syscalls array
would not die on the first syscall.  The point of the example is to
remind config authors that without a useful syscalls array, the
default value is very important ;).

Also add the previously-missing 'required' property to the seccomp
JSON Schema entry.

[1]: opencontainers#768 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Apr 25, 2017
Before this commit, linux.seccomp.sycalls was required, but we didn't
require an entry in the array.  That means '"syscalls": []' would be
technically valid, and I'm pretty sure that's not what we want.

If it makes sense to have a seccomp property that does not need
syscalls entries, then syscalls should be optional (which is what this
commit is doing).

If it does not makes sense to have an empty/unset syscalls then it
should be required and have a minimum length of one.

Before 652323c (improve seccomp format to be more expressive,
2017-01-13, opencontainers#657), syscalls was omitempty (and therefore more
optional-feeling, although there was no real Markdown spec for seccomp
before 3ca5c6c, config-linux.md: fix seccomp, 2017-03-02, opencontainers#706, so
it's hard to know).  This commit has gone with OPTIONAL, because a
seccomp config which only sets defaultAction seems potentially valid.

The SCMP_ACT_KILL example is prompted by:

On Tue, Apr 25, 2017 at 01:32:26PM -0700, David Lyle wrote [1]:
> Technically, OPTIONAL is the right value, but unless you specify the
> default action for seccomp to be SCMP_ACT_ALLOW the result will be
> an error at run time.
>
> I would suggest an additional clarification to this fact in
> config-linux.md would be very helpful if marking syscall as
> OPTIONAL.

I've phrased the example more conservatively, because I'm not sure
that SCMP_ACT_ALLOW is the only possible value to avoid an error.  For
example, perhaps a SCMP_ACT_TRACE default with an empty syscalls array
would not die on the first syscall.  The point of the example is to
remind config authors that without a useful syscalls array, the
default value is very important ;).

Also add the previously-missing 'required' property to the seccomp
JSON Schema entry.

[1]: opencontainers#768 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants