Skip to content
This repository has been archived by the owner on Feb 24, 2020. It is now read-only.

stage1: Ensure ptmx device usable by non-root for all flavours. #3484

Merged
merged 4 commits into from
Feb 10, 2017

Conversation

jodh-intel
Copy link
Contributor

kvm-based stage1 flavours previously used a sym-link from /dev/ptmx to
/dev/pts/ptmx. However, the /dev/pts/ptmx device has perms 000 meaning
that non-root users could not access the device. Creating /dev/ptmx
unconditionally works with all flavours and matches behaviour on current
distributions.

Fixes #3252.

Signed-off-by: James O. D. Hunt james.o.hunt@intel.com

@ghost
Copy link

ghost commented Dec 16, 2016

Can one of the admins verify this patch?

@jodh-intel
Copy link
Contributor Author

@Reviewer lucab

@jodh-intel
Copy link
Contributor Author

Fix tested with fly, coreos, qemu and lkvm stage1's as follows:

$ sudo rkt run --insecure-options=image --stage1-name="$ACI" docker://ubuntu --user=1000 --exec=sh -- -c "cd /tmp && script -c 'echo foo'"

@jonboulle
Copy link
Contributor

@rktbot ok to test

@jodh-intel
Copy link
Contributor Author

Hi @jonboulle, @lucab - I don't have permission to view the Jenkins errors above. Are they real errors, or maybe an infrastructure issue (since F24 fly seems to be working)?

@jonboulle
Copy link
Contributor

@jodh-intel looks like a legit failure to me - can you see the console output from one of the runs here? https://jenkins-rkt-public.prod.coreos.systems/job/rkt-github-ci/1757/os_type=fedora-24,stage1_flavor=coreos/consoleFull - I can access that unauthed

@lucab
Copy link
Member

lucab commented Dec 19, 2016

@jodh-intel our Jenkins also has an anonymous endpoint (the first entry in the list), logs for your PR are at https://jenkins-rkt-public.prod.coreos.systems/job/rkt-github-ci/1757/.

@lucab lucab added this to the v1.23.0 milestone Jan 5, 2017
@jodh-intel
Copy link
Contributor Author

Hi @lucab - I'm having trouble trying to work out why this PR fails under the Jenkins environment. The logs above suggest an EPERM issue, but when I run the tests locally, I see completely different behaviour (one test passes locally and the others timeout after 1 hour due to what appears to be systemd crashing). I can try to debug these different failures further, but my primary concern is that I cannot recreate the same failures scenarios seen under Jenkins. Can you offer any advice?

@lucab
Copy link
Member

lucab commented Jan 10, 2017

Not really, but it may actually be a sum of different failures instead of a single issue. Also, jenkins tests are probably run without a controlling terminal while you are probably running it from an interactive terminal.

@jodh-intel
Copy link
Contributor Author

@lucab - Hi - I think the Jenkins failure might be caused by #3539, but I don't know what the umask is set to in your Jenkins environment to know for sure. Certainly, if I set my umask to 0002 locally, I don't see the error for TestAppUserGroup.

Idea

Since there are so many stage1 flavours, operating systems and test environments, for debugging issues like this it would be really useful to see full details of the environment before the build/test starts. Jenkins and Semaphore do display some basic details, but not really enough.

You could maybe re-purpose inspect.go and have that dump out details at build time, or you could install and run a tool I wrote exactly for this purpose: procenv - it's already in the Fedora, debian and ubuntu archives.

@s-urbaniak
Copy link
Contributor

The TestAppUserGroup is a known flake, tracked in #3477.

This looks green enough for me right now, so I am inclined to merge it, but I'll defer to @lucab for the final word.

Introducing procenv sounds like a good idea to me. I'll open an issue for that.

@s-urbaniak
Copy link
Contributor

As discused OOB with @lucab let's bump this one release further for a more thorough review.

@s-urbaniak s-urbaniak modified the milestones: v1.24.0, v1.23.0 Jan 19, 2017
@lucab lucab self-requested a review January 19, 2017 09:24
@jodh-intel jodh-intel force-pushed the issue-3252-ptmx-permissions branch from e22c5e2 to fe9f80c Compare January 23, 2017 10:59
@lucab
Copy link
Member

lucab commented Jan 25, 2017

/cc @mxinden for a look on the test-infra changes.

@lucab
Copy link
Member

lucab commented Jan 25, 2017

I re-triggered the semaphore as it stalled mid-way. I think I'm fine with the technical changes in here (and thanks for the test!), my only doubt is why we actually have this discrepancies between kvm and systemd-nspawn. Is there something we are doing fundamentally wrong when setting up either one, or something we should tweak in the stage1 environment?

@jodh-intel as you did the investigation and fixing of the whole story, do you have some more insights to add?

@lucab
Copy link
Member

lucab commented Feb 1, 2017

@jodh-intel FYI this needs a rebase.

@jodh-intel jodh-intel force-pushed the issue-3252-ptmx-permissions branch from fe9f80c to 49d1c61 Compare February 1, 2017 13:50
@jodh-intel
Copy link
Contributor Author

@lucab - I could simplify the change so that prepare-app.c unconditionally applies the logic used for kvm-based stage1's. The rationale for the current heuristic approach was that:

  • prepare-app isn't told the type of stage1 it is working with.
  • for the non-kvm case, a symlink is sufficient.
  • I tried to avoid changing the behaviour for the non-kvm case.

Thanks for the notification - branch now rebased.

@jodh-intel jodh-intel force-pushed the issue-3252-ptmx-permissions branch from 49d1c61 to 980c340 Compare February 1, 2017 15:30
@squeed squeed modified the milestones: v1.25.0, v1.24.0 Feb 2, 2017
kvm-based stage1 flavours previously used a sym-link from /dev/ptmx to
/dev/pts/ptmx. However, the VM-based stage1's /dev/pts/ptmx device has
perms 000 meaning that non-root users could not access the device.

Determine if /dev/ptmx (or it's link target) is usable by all users and
if not recreate the device.

Fixes rkt#3252.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Added an extra parameter to runRktAsUidGidAndCheckOutput() to specify
whether the expected string provided is a literal or a regular
expression.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The new test supports the fix for rkt#3252 by making assertions about the
pseudoterminal devices.

Change required renaming TestPathsStat() to TestProcPathsStat().

Also added helper functions runRktAsGidAndCheckREOutput() and
runRktAndCheckREOutput().

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
@jodh-intel jodh-intel force-pushed the issue-3252-ptmx-permissions branch from 980c340 to 79df37d Compare February 7, 2017 10:41
Copy link
Member

@lucab lucab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lucab lucab merged commit 54bd071 into rkt:master Feb 10, 2017
@lucab lucab mentioned this pull request Feb 20, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants