Failure under gVisor: failed to get filesystem from image: chtimes /bin/bash: invalid argument #1020
Comments
Hi @ddgenome, I wasn't able to reproduce this in my local Docker container. |
Yes, it is GKE. The cluster is running Kubernetes 1.15.7-gke.23. The nodes are running cos_containerd under gVisor. The issue seems similar to #1021 . |
This could be due to #940 which went in this release. |
Well, the error seems to align with that hypothesis. So presumably this line is returning an error: Now, since the previous calls to |
I haven't had much time to look into this, but I wonder if we need to do some sanity checks on the header times. I agree that it seems most likely that one of the times is bad. I noticed that golang http://man7.org/linux/man-pages/man2/utimensat.2.html |
I'm not able to repro the bug using the executor:v0.17.0 container on my local machine.
Dockerfile
package.json
package-lock.json
kaniko command
|
The issue might be a capability, perhaps |
Troubleshooting this issue further it seems the issue is with kaniko running under gVisor, GKE Sandbox for the specific case I have tested. I created a permissive pod security policy to eliminate that variable and ran the builds under that policy. With kaniko 0.17.1 I was able to successfully complete the build on a "normal" GKE node but the build failed in the same manner, i.e., I can confirm that I can also successfully complete the build using Docker locally using the standard Docker runtime, not runsc. |
I guess the question now is whether running under gVisor is a goal of the kaniko executor. If not, I guess this issue can be closed. |
@ddgenome We do want to support running under gVisor and do not want to break that support. I will fix this and verify manually before next release. Thanks |
@ddgenome I was looking at the gVisor capabilities: https://gvisor.dev/docs/user_guide/compatibility/linux/amd64/ |
nvm, you mentioned before you will need to set capability |
Sorry for the confusion, but the capabilities investigation turned out to be a red herring, see this comment where that issue was eliminated. As such, I am not sure whether #1035 will help the issue. Here is the requested gVisor version information:
I thought that perhaps Might it be acceptable to just ignore any failure from |
If, however you did want to check if the required capabilities were available, I have found you need at least these set:
However, building more complicated images may require other capabilities. |
I am able to reproduce this locally running the latest release of runsc and the same package.json and package-lock.json you use above with kaniko executor 0.17.1:
If you replace What version of runsc are you using? |
To more fully test the variable space, I ran the following command:
where
Here are the results:
All of the failures have the familiar message:
Are you able to reproduce running under gVisor (this issue has the Would you be open to a PR that logs any error from |
@tejal29 have you tested this using runc? I have not (haven't been able to set it up yet). Probably should remove the
@ddgenome I'm sorry, I don't understand your question. Are you asking if we can open a PR so you can test with the extra logging? Is a PR enough? New PRs don't result in any new images (until the PR is merged) |
@cvgw , I've tested using runc. Both kaniko executor 0.16.0 and 0.17.1 succeed under runc.
I was asking if you would consider ignoring any error from |
I think that could be a reasonable fix, but yeah, I would want to better understand what the issue is before making any changes. I'm going to try to get runc set up on my machine this week so that I can debug this. |
I was able to repro this after getting gvisor installed. After some debugging I saw that the value of Working on a fix. |
It does appear to be an issue with the zero value for When https://play.golang.org/p/-e6FdyWcqp1 I believe the correct fix here would be to convert any zero |
The zero value of time.Time is not a valid argument to os.Chtimes because of the syscall that os.Chtimes calls. Instead we update the zero value of time.Time to 1970-01-01 00:00:00Z as this is the zero value of Unix Epoch
The zero value of time.Time is not a valid argument to os.Chtimes because of the syscall that os.Chtimes calls. Instead we update the zero value of time.Time to the zero value of Unix Epoch
…id-arg Fix #1020 os.Chtimes invalid arg
Thanks! |
@ddgenome would you mind giving either tag |
Would I be hitting this if I got...
|
@bimargulies-google no, you would see almost the exact same error as originally reported, except with maybe a different file path. That error looks more like #830 |
@cvgw here are the results you requested:
runsc is now release-20200219.0, runsc-nightly is release-20200211.0-39-g8dae8a10f01b, and runsc-head is release-20200219.0-46-ga92087f0f8fe. The new version looks good, thanks again! So it sounds like the difference is that the gVisor implementation of the system call underlying |
Here is the script I used to test the various versions of kaniko against the various versions of gVisor: https://gist.github.com/ddgenome/fa1da223569fdb602592a87249c1bf10 , if that helps anyone. |
Actual behavior
A build that succeeds under v0.16.0 fails under v0.17.0 with the error:
Expected behavior
I expect the build to complete successfully.
To Reproduce
Steps to reproduce the behavior:
Create a package.json and package-lock.json using a command like
Run the kaniko build with the following arguments
Additional Information
Please provide or clearly describe any files needed to build the Dockerfile (ADD/COPY commands)
Any package.json and package-lock.json should suffice.
Triage Notes for the Maintainers
I am seeing the failure when the build is being run in Kubernetes running on Container-optimized OS with the containerd runtime. The error occurs both when caching is being used and when it is not.
Here are the debug logs without caching:
--cache flag