bundle clean force error #269

Closed
2 tasks done
tarrantrj opened this issue Apr 27, 2020 · 9 comments

Comments

@tarrantrj

  • I'm on Docker
    • I understand Docker may be unsupported

Description

Since the update to v4 a few days ago (April 23rd?), we have been encountering an error similar to #268.

After including the chown change in our pipeline, we are now encountering this error:

    Cleaning all the gems on your system is dangerous! If you're sure you want to remove every system gem not in this bundle, run `bundle clean --force`.

We are running the following commands:

    rm -fr $JEKYLL_VAR_DIR/*
    chown jekyll:jekyll -R /usr/gem
    jekyll build -d $JEKYLL_VAR_DIR

Output

    Bundle complete! 4 Gemfile dependencies, 31 gems now installed.
    Use `bundle info [gemname]` to see where a bundled gem is installed.
    Cleaning all the gems on your system is dangerous! If you're sure you want to remove every system gem not in this bundle, run `bundle clean --force`.

Expected

    Bundle complete! 4 Gemfile dependencies, 31 gems now installed.
    Bundled gems are installed into `/usr/local/bundle`
    ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-linux-musl]
    Configuration file: /tmp/build/bfc0b666/git/_config.yml
                Source: /tmp/build/bfc0b666/git
           Destination: /var/jekyll
     Incremental build: disabled. Enable with --incremental
          Generating... 
           Jekyll Feed: Generating feed for posts
                        done in 5.446 seconds.
     Auto-regeneration: disabled. Use --watch to enable.

@kingdonb

kingdonb commented Apr 27, 2020

It looks like #268 and #269 are both related to this issue. I had a similar experience: a build based on FROM jekyll/jekyll:4.0 stopped working, and it was the bundle install operation that failed. It appears that although Jekyll 4.0.0 was released some time ago, the jekyll/jekyll:4.0.0 image itself is not immutable and has changed.

I went looking for a tag that might refer to the previous version of 4.0.0 and came up empty. I could not change the FROM in my own Dockerfile to make the build succeed again without falling back to jekyll/jekyll:3.8.6, which is less than ideal. Worse, I wasn't able to figure out exactly how the build machinery produces this image, so I could not roll back the recent commits and try rebuilding it myself.

As commenters in the related thread have already stated, tags (at least some tags) should be immutable. It is very bad for breaking changes to erase historical images that worked, rendering derivative Dockerfiles unusable. I noticed an appeal for funding when I was in there looking at the commits I wanted to roll back. I understand open source is done on a volunteer basis, and I hope the maintainers will not take these comments the wrong way.

My intention is to make the project better for everyone. The easiest way to solve this issue going forward, IMHO, is to add a tag at build/push time that includes the commit SHA, so that older images can be referenced and inspected for differences rather than being overwritten when a new image is built. It would be nice to honor semver, but that makes it harder to ship changes to the Docker build itself when the image tag is meant to reflect the version of Jekyll included from upstream. I see how this could make it difficult to do versioning effectively.

There are still things you can do that make sure downstream pipelines using this image won't break unrecoverably with new image builds, like ensuring every image pushed will get a unique and permanent tag so it will be preserved historically, and guaranteeing historical tags will only be removed in extreme cases, not casually overwritten and discarded every time a new build is undertaken. (Edit: upon reflection, I think I understand this was not a casual or everyday issue.)
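
The suggestion above (one unique, permanent tag per pushed image) could be sketched roughly as follows. This is only an illustration: the `immutable_tag` helper, the `<version>-<short sha>` tag format, and the placeholder commit SHA are assumptions, not something the project's build tooling is known to do.

```sh
#!/bin/sh
# Hypothetical helper: compose a permanent tag from the upstream Jekyll
# version and the commit SHA of the image's build repository.
immutable_tag() {
  repo=$1
  version=$2
  sha=$3
  # Keep only the first 7 characters of the SHA so the tag stays readable.
  short_sha=$(printf '%s' "$sha" | cut -c1-7)
  printf '%s:%s-%s\n' "$repo" "$version" "$short_sha"
}

# Placeholder SHA for demonstration; a pipeline would use `git rev-parse HEAD`.
immutable_tag jekyll/jekyll 4.0.0 0123456789abcdef0123456789abcdef01234567
# prints: jekyll/jekyll:4.0.0-0123456
```

The moving tag (for example `4.0.0`) could still be pushed alongside the permanent one, so downstream Dockerfiles can choose between tracking updates and pinning one specific build.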

But that horse has left the barn, that old image is gone, and the version waters are all muddy now. So as for understanding what changed in the build and how to restore compatibility in the 4.0.0 line, while preserving the Dockerfile author's intended updates, I haven't done enough digging to understand what change should happen next, or if it makes sense to just go ahead and overwrite 4.0.0 again with a fix of some kind.

@envygeeks
Owner

This issue has been fixed. Technically we still bundle clean, or try to, we just ignore the error and hide it because we don't care if it succeeds, we just wanna try on your behalf.
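
For readers unfamiliar with the pattern, here is a minimal sketch of what "try, but ignore and hide the error" can look like in a POSIX shell script running under `set -e`. The function names are hypothetical stand-ins so the sketch runs without Bundler; this is not the image's actual wrapper code.

```sh
#!/bin/sh
# Stand-in for `bundle clean`, which fails in the scenario reported here.
fake_bundle_clean() {
  echo "Cleaning all the gems on your system is dangerous!" >&2
  return 1
}

# Best-effort cleanup: hide the output and never propagate the failure.
try_clean() {
  fake_bundle_clean >/dev/null 2>&1 || true
}

# Runs in a subshell with `set -e`; the echo is reached because the
# failure of the cleanup step is swallowed by `|| true`.
run_build() (
  set -e
  try_clean
  echo "build continues"
)

run_build
# prints: build continues
```

Without the `|| true`, the failing command would terminate the `set -e` script at that point.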

@fabacab

fabacab commented May 11, 2020

> This issue has been fixed. Technically we still bundle clean, or try to, we just ignore the error and hide it because we don't care if it succeeds, we just wanna try on your behalf.

Has this fix been rolled out into production or am I doing something wrong such that the applied fix is not working for me?

I have a Dockerfile that begins FROM jekyll/builder, which was working fine until today, when I noticed that the jekyll/builder:latest image now includes Bundler 2.1.4 by default. My project's Gemfile was bundled with 1.17.3, so I amended my own container's entry point to run gem install -v 1.17.3 bundler before calling this project's entry point.
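
As an aside, the Bundler version does not have to be hard-coded: Bundler records it at the end of Gemfile.lock under a `BUNDLED WITH` heading. The sketch below reads it from there. The `bundled_with` helper and the sample lockfile contents are made up for demonstration, and it assumes the lockfile has that section.

```sh
#!/bin/sh
# Hypothetical helper: print the version listed under "BUNDLED WITH".
bundled_with() {
  awk '/^BUNDLED WITH$/ { getline; gsub(/^[ \t]+|[ \t]+$/, ""); print; exit }' "$1"
}

# Demonstration lockfile (abbreviated, invented content).
lock=$(mktemp)
printf 'GEM\n  specs:\n\nBUNDLED WITH\n   1.17.3\n' > "$lock"

version=$(bundled_with "$lock")
# Print the command instead of running it, so the sketch has no side effects.
echo "gem install -v $version bundler"
# prints: gem install -v 1.17.3 bundler

rm -f "$lock"
```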

Unfortunately, this caused a permissions error I hadn't encountered before, which I believe is the same as what is described in #272. In my case, setting JEKYLL_ROOTLESS=true fixes the issue (blunt but effective…), which is when I ran into the bundle clean error described in this thread.

Now I'm in a catch-22, because if I work around the permission issues by running jekyll as UID and GID 0, then line 29 in repos/jekyll/copy/all/usr/jekyll/bin/bundle is executed (due to the earlier check on line 12 correctly recognizing that we are running as root and thus not "skipping out" before the bundle clean command is called), or else if I don't run as root to avoid the bundle clean error, I hit the earlier permissions problem.

I can't immediately see where in the bundle wrapper script errors from clean are "ignored" either, so I feel like I must be missing something since you say this is already "fixed"?

Any help would be appreciated as I try to restore our CI/CD systems as soon as possible.

@envygeeks
Owner

The problem is you are setting the Jekyll user to UID/GID = 0 and then it's chowning the folder to UID/GID 0 thus killing your permissions, don't change Jekyll's UID/GID, let the image set it when it runs if you need a different one via the environment variables provided to do so.

@fabacab

fabacab commented May 12, 2020

> The problem is you are setting the Jekyll user to UID/GID = 0 and then it's chowning the folder to UID/GID 0 thus killing your permissions, don't change Jekyll's UID/GID, let the image set it when it runs if you need a different one via the environment variables provided to do so.

Thanks for the quick reply! And I'm sorry, but I'm not sure I understand your answer. Maybe I was unclear, let me try to explain my situation again.

I have a Dockerfile that is based on jekyll/builder. This Dockerfile uses its own entry point script. At some point in my own entry point script, I have this:

    # Execute the original image's own entrypoint.
    /usr/jekyll/bin/entrypoint jekyll build -d "$JEKYLL_DATA_DIR" "$INPUT_JEKYLL_BUILD_OPTS"

The idea here is that I thought it would be possible to simply call up to the official Jekyll Builder entry point, and this did work for quite a while, until today.

My own entry point script needs to run as root, but nowhere do I change the values for JEKYLL_UID or JEKYLL_GID, which means that the upstream's default of 1000 is applied on each run, as expected. Nevertheless, I receive the following permissions errors in my build logs:

    Errno::EACCES: Permission denied @ rb_file_s_rename -
    (/home/jekyll/.gem/ruby/2.7.0/cache/i18n-0.9.5.gem,
    /usr/gem/cache/i18n-0.9.5.gem)

If I'm not mistaken, this is the same permissions issue as described in the issue I linked earlier. Notice, importantly, that this permissions error occurs when JEKYLL_UID=1000; that's one reason I'm confused.

My understanding of your reply is that I should ensure that the jekyll user's UID and GID are not 0, i.e., that I allow the upstream /usr/jekyll/bin/entrypoint script to execute as root but then to change the Jekyll user's UID as appropriate. As far as I can tell, that is exactly the case already, yet the permissions errors are preventing successful builds. Note also that when I explicitly set the JEKYLL_UID and the JEKYLL_GID environment variables equal to 1000, the same permission error results.

One question I have is: Can you tell me what you are referring to when you say that "it is changing the folder to UID/GID 0"? What is the "it" in that sentence referring to?

So, since I am already not setting the Jekyll user or group ID to zero, and yet I'm still facing permissions issues in the default configuration, I am unsure what I might be doing wrong that prevents my builds from succeeding. When I attempt to solve the permissions error by explicitly setting JEKYLL_UID=0, for example via the JEKYLL_ROOTLESS environment variable, that's when the bundle clean error arises for the reasons I described earlier, leaving me in a catch-22 situation that cannot be resolved one way or the other. However, leaving JEKYLL_UID alone or explicitly setting it to 1000 (away from 0), causes the earlier error. Neither of the two approaches provides a satisfying fix.

Is there a third option?

Edited to add: For the moment, I've simply pinned our builds to jekyll/builder:3.8, which works as before without issue. This feels like a temporary fix, though. Thanks again for all your attention to this.

@fabacab

fabacab commented May 12, 2020

So I took another stab at this issue this evening, and I'm fairly certain that, unless I'm just totally misusing the container, there's a problem that causes the build to fail with the bundle clean error here.

Some of the permissions errors I was having turned out to be a caching issue where my GitHub Actions runners were not pulling down the correct image. After resolving that by pinning my base container image with a digest as described here, I'm no longer seeing any permissions errors, but I am once again hitting this bundle clean error.
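
For reference, pinning by digest means the FROM line names the image by its content hash rather than by a mutable tag. The sketch below composes such a line and rejects malformed digests. The `pinned_from` helper is hypothetical, and the all-zero digest is a placeholder, not a real image digest; the real value for a pulled image can be listed with `docker images --digests`.

```sh
#!/bin/sh
# Hypothetical helper: build a digest-pinned FROM line.
pinned_from() {
  image=$1
  digest=$2
  # A sha256 digest is "sha256:" followed by 64 lowercase hex characters.
  if ! printf '%s\n' "$digest" | grep -Eq '^sha256:[0-9a-f]{64}$'; then
    echo "invalid digest: $digest" >&2
    return 1
  fi
  printf 'FROM %s@%s\n' "$image" "$digest"
}

# Placeholder digest (64 zeros), for demonstration only.
placeholder="sha256:$(printf '%064d' 0)"
pinned_from jekyll/builder "$placeholder"
```

Unlike a tag such as `latest`, a digest reference keeps resolving to the same image contents even after the tag is moved to a newer build.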

The error is certainly caused by the fact that when I execute the entrypoint script, the following sequence of events occurs:

  1. root calls the entrypoint script with these parameters: jekyll build -d /srv/jekyll --future
    • JEKYLL_UID and JEKYLL_GID are set to their default value of 1000 so no modifications are made.
    • The last line of the script is exec "$@", and so…
  2. …we next run /usr/jekyll/bin/jekyll build -d /srv/jekyll --future.
    • The code proceeds to line 26, which checks for the presence of a Gemfile and whether or not we're connected to the Internet.
    • Since both of these are true (I have a Gemfile and am connected to the Internet), the script next runs…
  3. bundle install. So now we begin the /usr/jekyll/bin/bundle script, which wraps the real bundle executable.
    • On line 4 is where the default-gem-permissions call is made. This is the line that was missing when I posted my earlier comment. That's part of why the permissions errors may have been cropping up. Now that default-gem-permissions is being called, the permissions errors are gone.
    • However, if you examine the call stack, we have not yet run su-exec jekyll, which means we are still executing the script as root. This means that the check on line 12 fails (because, remember, there is a Gemfile present, so the first test returns a falsey value, and we are still executing with ID 0, so the second test [ "$(id -u)" != "0" ] also returns a falsey value).
  4. Finally, since this script was invoked as bundle install, the test on line 16 returns a truthy value, and we immediately continue to line 18 where we run su-exec jekyll $exe check, which expands to su-exec jekyll /usr/local/bin/bundle check.
    • Since this is the very first run of the script, we have not yet had a chance to install any dependencies described in the Gemfile. As a result, bundle check returns a 1, which is negated by the script, giving us a truthy value and forcing us into the bundle clean code branch.
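
The branching described in steps 3 and 4 can be modeled in a few lines. This is a hypothetical reconstruction from the description in this comment only, not a copy of the wrapper script; the `wrapper_branch` function and its arguments are invented so the logic can be tried without the image.

```sh
#!/bin/sh
# Hypothetical model of the bundle wrapper's decisions.
#   $1 = "yes" if a Gemfile is present
#   $2 = effective UID of the caller
#   $3 = bundle subcommand
#   $4 = exit status that `bundle check` would return
wrapper_branch() {
  has_gemfile=$1
  uid=$2
  subcommand=$3
  check_status=$4

  # The "line 12" check: skip out when there is no Gemfile or we are not root.
  if [ "$has_gemfile" != "yes" ] || [ "$uid" != "0" ]; then
    echo "passthrough"
    return 0
  fi

  # The "line 16" check: `bundle install` with a failing `bundle check`
  # leads into the branch that runs `bundle clean`.
  if [ "$subcommand" = "install" ] && [ "$check_status" -ne 0 ]; then
    echo "clean-branch"
    return 0
  fi

  # Anything else is not described in this thread.
  echo "other"
}

wrapper_branch yes 0 install 1      # prints: clean-branch
wrapper_branch yes 1000 install 1   # prints: passthrough
```

In this model, running as root with a Gemfile and a fresh, uninstalled bundle is exactly the combination that lands in `clean-branch`.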

That code branch always fails (bundle clean returns 1). Because the set -e on line 5 forces the bundle wrapper script to exit immediately when any simple command returns a failure status code, the wrapper script exits with an error status code as well. Every ancestral wrapper script also specifies set -e, so the error bubbles up from the bundle wrapper script to the jekyll wrapper script to the entrypoint script itself, ultimately stopping the container with a failure status and causing the build error.
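
The way a failure travels up through nested `set -e` scripts can be reproduced with three throwaway scripts. The file names mirror the wrappers named above, but their contents are invented stand-ins, with `false` playing the part of the failing `bundle clean`.

```sh
#!/bin/sh
# Build a hypothetical three-level chain: entrypoint -> jekyll -> bundle.
demo_dir=$(mktemp -d)

cat > "$demo_dir/bundle" <<'EOF'
#!/bin/sh
set -e
false
echo "bundle wrapper finished"
EOF

cat > "$demo_dir/jekyll" <<EOF
#!/bin/sh
set -e
sh "$demo_dir/bundle"
echo "jekyll wrapper finished"
EOF

cat > "$demo_dir/entrypoint" <<EOF
#!/bin/sh
set -e
sh "$demo_dir/jekyll"
echo "entrypoint finished"
EOF

# Run the chain, capturing both its output and its exit status.
chain_status=0
chain_output=$(sh "$demo_dir/entrypoint") || chain_status=$?
echo "status=$chain_status output='$chain_output'"
# prints: status=1 output=''

rm -rf "$demo_dir"
```

None of the three "finished" lines is printed: each script stops at the first failing command, and the non-zero status becomes the failing command of its parent.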

I cannot see any way around this sequence of events that works in all cases and avoids the bundle clean error, so I'm really curious to learn how this issue was (supposedly?) fixed. It simply does not seem fixed to me, and I'm having a hell of a hard time trying to figure out what I might be missing, as well as why this is suddenly failing for me today but was working fine as recently as yesterday.

As usual, thanks so much for your help.

@envygeeks
Owner

I'll see if I can replicate it and take a look today or tomorrow!

@envygeeks envygeeks reopened this May 12, 2020
@fabacab

fabacab commented May 12, 2020

> I'll see if I can replicate it and take a look today or tomorrow!

Many thanks for looking into this for us all. Please let me know if there is any more information you need that you think I can provide and I will do my best to get it to you.

@hbokh

hbokh commented May 14, 2020

I arrived here because of the same issue when building a Docker image from https://github.com/cobbler/cobbler.github.io/blob/master/Dockerfile

Changing the first line to FROM jekyll/jekyll:3 as build got me going.
