Kaniko build performance is much slower compared with the DinD solution #875
Comments
I've noticed similar issues. I use GitLab Runner on Kubernetes and, as you described, ran dind and kaniko side by side; kaniko is much slower. For now I've switched to using kaniko on Cloud Build, where it's pretty fast and caches better than docker.
Thanks for the information. I believe you are talking about https://cloud.google.com/blog/products/application-development/build-containers-faster-with-cloud-build-with-kaniko. Unfortunately we are using an internal docker registry based on quay.io, so it cannot benefit us.
It seems a lot of time is spent snapshotting the filesystem, which I believe is done to ensure the end result has multiple layers. Taking only a single snapshot avoids much of that cost, but you lose the intermediate layers; it can of course be nice to have layers, so improving performance like this is a compromise. I ended up with 15 minutes instead of 25 minutes for one of my builds.
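kaniko exposes this trade-off via its --single-snapshot flag, which also shows up in a CI config later in this thread. A minimal sketch of an invocation; the image and path names are placeholders:
/kaniko/executor \
  --context "$CI_PROJECT_DIR" \
  --dockerfile "$CI_PROJECT_DIR/Dockerfile" \
  --destination registry.example.com/group/app:latest \
  --single-snapshot   # take one snapshot at the end instead of after every command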
I have the same question. In Jenkins, dind is faster than Kaniko. Most of the time is spent in [Taking snapshot of full filesystem] and [Unpacking rootfs as cmd COPY]. How can this be improved?
I have the same question. I tried a kaniko build on GitLab and it's also slower than with docker.
Same here.
Experiencing the same issue. In fact I don't see any difference in runtimes when using [...]
I'm using kaniko in GitLab CI/CD with runners in a DigitalOcean Kubernetes cluster (3x 2GB 1vCPU). Benchmark: create-react-app (multi-stage build):
FROM node:12-alpine as build
WORKDIR /home/app/
COPY package.json ./
COPY yarn.lock ./
RUN yarn
COPY . .
RUN yarn build
FROM nginx:1.13.12-alpine
COPY --from=build /home/app/build /var/www
COPY nginx.conf /etc/nginx/nginx.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"] Building locally with docker build on my laptop: Building with kaniko in a GitLab runner: Same as previous with Using Docker in Docker: |
We've been experiencing similar problems with kaniko for builds that produce a large number of small files on the filesystem in the intermediate stages. Multi-stage builds also seem to contribute to the slowness.
I expect the reason for this difference in speed is that "native" docker manages the layered filesystem using overlayfs (overlay2), so taking a snapshot is as simple as telling the FS driver to finish a layer. Kaniko doesn't track changes at the filesystem level, so it has to stop and stat everything in the filesystem in order to take a snapshot. I'd be interested in whether this is a fundamental limitation of the kaniko design, or whether, if you could run a user-mode filesystem driver or overlayfs inside the container running kaniko, you could obtain matching speeds.
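To make the cost concrete: without a layering filesystem, a "full" snapshot is conceptually similar to walking and hashing the entire filesystem in user space, roughly like the sketch below (an illustration only, not kaniko's actual code):
# Record the state of every file on the root filesystem (without crossing mounts).
find / -xdev -type f -exec md5sum {} + | sort -k 2 > /tmp/state-before.txt
# ... run the build step ...
find / -xdev -type f -exec md5sum {} + | sort -k 2 > /tmp/state-after.txt
# Files that differ between the two walks are what would end up in the new layer.
diff /tmp/state-before.txt /tmp/state-after.txt
# With overlayfs, the kernel already keeps the changed files in the upper directory,
# so finishing a layer requires no filesystem walk at all.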
@bsmedberg-xometry I love your explanation and fully agree. I recently watched a very good talk about the "backend" of the Docker daemon in which an engineer responsible for the filesystem layer at Docker explains the differences. While it sounds possible to do what you suggested, I don't think it can be achieved without changing kaniko's source code.
I understand the filesystem snapshotting issue is driven by not using overlayfs, but what would explain the inordinate time it takes kaniko to push a layer to the cache?
We are also having this issue. Switching to Kaniko solved some other DinD issues we were having, but added 12+ minutes to our build times.
@tjtravelnet Did you use any of the new flags?
Build times are insanely long compared to DinD even with caching activated. Environment: [...]
Same experience on my side with Kubernetes GitLab runners. The build takes WAY longer than on my computer, and I build on a Pentium...
Had a similar issue; ended up adding [...]
We can observe this behavior too, but from my point of view it's not a real problem here. Of course it would be nice if the snapshotting could be tuned, but it will never reach the performance of overlayfs-based snapshot/layer creation.
We are running the GitLab runner in AKS. With DinD it takes around [...]
We have builds running in Kaniko that, due to the filesystem snapshots, are taking unacceptably long. This does not seem to have been remedied by using [...]
Same here. I tried using Kaniko in Google Cloud Build to get better caching behavior, but it's so slow that it's not worth it. I've turned my attention to Docker Buildx instead, as it seems to combine the best of both worlds: fast builds and reliable caching.
Curious, are you using Buildx with Cloud Build?
I tried to, but unfortunately my team is using GCP Container Registry and it doesn't seem to support Buildx cache artifacts. Artifact Registry, on the other hand, seems to work fine with Buildx, but since it's a lot more expensive than Container Registry, I'm not sure it's worth it for us.
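For anyone who wants to try the Buildx route described above, a registry-backed build cache is typically wired up like this (a sketch with a placeholder image reference; the registry has to support cache manifests, which is exactly the Container Registry limitation mentioned here):
# Create a docker-container builder, which is required for registry cache exports.
docker buildx create --use
docker buildx build \
  --push \
  --tag registry.example.com/team/app:latest \
  --cache-from type=registry,ref=registry.example.com/team/app:buildcache \
  --cache-to type=registry,ref=registry.example.com/team/app:buildcache,mode=max \
  .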
Same problem.
I have the same problem.
Me too.
If you are considering using those flags, please check the docs first and proceed with caution, as using them may cause errors for you. At the time of writing: [...]
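For context, the flags in question appear to be --snapshotMode=redo and --use-new-run, which are named explicitly further down in this thread. A sketch of an invocation with both enabled, using placeholder paths and image names:
# --snapshotMode=redo compares file metadata (mtime, size, ownership) instead of
# hashing full file contents when taking snapshots; --use-new-run is an
# experimental RUN implementation that avoids some full-filesystem snapshots.
/kaniko/executor \
  --context "$CI_PROJECT_DIR" \
  --dockerfile "$CI_PROJECT_DIR/Dockerfile" \
  --destination registry.example.com/group/app:latest \
  --snapshotMode=redo \
  --use-new-run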
Running a Kaniko pod in a microk8s Kubernetes cluster with the host network setting [...]. So there might be some firewall/network issue when the host network is not exposed.
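If the setting alluded to above is host networking for the build pod (an assumption, since the comment is truncated), it would look something like this in the pod spec:
# Assumption: the commenter enabled host networking on the kaniko pod.
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-build
spec:
  hostNetwork: true          # expose the node's network to the build pod
  containers:
    - name: kaniko
      image: gcr.io/kaniko-project/executor:latest
      args: ["--context=...", "--dockerfile=...", "--destination=..."]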
Same thing here.
The same here. Update: with the flags below, it works at the same level as docker for me; one fairly dense image went from 45 minutes down to 8 minutes.
stage: build
rules:
- !reference [.master_or_web__rules, rules]
script:
- >-
/kaniko/executor
--context $CI_PROJECT_DIR/image
--dockerfile $CI_PROJECT_DIR/image/Dockerfile
--destination ${CI_DOCKER_IMAGE}:${CI_COMMIT_SHORT_SHA}
--destination ${CI_DOCKER_IMAGE}:latest
--cache=false
--cache-repo=${CI_DOCKER_IMAGE}:latest
--cache-ttl=1h
--force
--cleanup
--single-snapshot
I also have the problem.
Same here, kaniko builds still take ridiculously long even with "--snapshotMode=redo" and "--use-new-run".
Yup, as it so happens I ran into this again today, with the same ⬆️ settings. Basically, I tried to build an image very similar to buildpack-deps with [...]
Sure, but then you give up on layer caching entirely, since you only take a single snapshot at the very end. EDIT: I see you even set --cache=false.
When using kaniko layer caching, disabling compression with [...]
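The flag being referred to here is presumably kaniko's --compressed-caching option (an assumption, since the comment is cut off). If so, a cached build with layer compression disabled would look roughly like this; registry names are placeholders:
# Assumption: the elided flag above is --compressed-caching=false.
/kaniko/executor \
  --context "$CI_PROJECT_DIR" \
  --dockerfile "$CI_PROJECT_DIR/Dockerfile" \
  --destination registry.example.com/group/app:latest \
  --cache=true \
  --cache-repo=registry.example.com/group/app-cache \
  --compressed-caching=false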
We have a very simple Dockerfile which inherits from a Ubuntu JDK 8 image, runs a few shell commands, and copies a few files. Please note that the RUN commands come right at the beginning.
Our CI is built on top of Kubernetes; the Jenkins build runs in a slave pod.
We've enabled DinD and Kaniko in separate slave images and trigger the builds with Kaniko and Docker. Here are the build-and-push performance results we've observed:
Dockerfile with all RUN commands removed: [...]
Dockerfile with 10 RUN commands: [...]
May I know why Kaniko is so much slower than the DinD solution when there are RUN commands in the Dockerfile? Can this part be sped up?
We've tried the --cache and --cache-repo parameters, but the performance of the Kaniko build did not improve at all. Here are the details:
However, the performance is much worse with the cache enabled, taking 254s. I think the cache uploading and downloading is also a time killer.
Please help explain the cache issue and advise how we can further improve the performance of the Kaniko build.
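For reference, this is roughly how the two cache parameters mentioned above are passed; the context path and registry names below are placeholders (here standing in for the internal quay-based registry), and --cache-repo points at a dedicated repository that kaniko pushes cached layers to:
/kaniko/executor \
  --context "$BUILD_CONTEXT" \
  --dockerfile Dockerfile \
  --destination quay.internal.example.com/team/app:latest \
  --cache=true \
  --cache-repo=quay.internal.example.com/team/app-cache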
The Dockerfile we used looks like the one below:
FROM abc
COPY *.jar /app/app.jar
RUN jar -xvf app.jar && \
    rm -rf app.jar && \
    mkdir -p /layer_build/lib/snapshots && \
    mkdir -p /layer_build/lib/releases && \
    mkdir -p /layer_build/app && \
    find BOOT-INF/lib -name '*SNAPSHOT*' -type f -exec mv {} /layer_build/lib/snapshots \; && \
    mv BOOT-INF/lib/* /layer_build/lib/releases && \
    rm -rf BOOT-INF/lib && \
    mv * /layer_build/app
FROM def
COPY --from=0 layer_build/lib/snapshots/ /app/BOOT-INF/lib/
COPY --from=0 layer_build/lib/releases/ /app/BOOT-INF/lib/
COPY --from=0 layer_build/app/ /app/
WORKDIR /app
CMD ["/bin/bash", "-c", "/app/bin/run.sh"]