Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

THRIFT-5755 Docker image build fail #2937

Merged
merged 9 commits into from
Apr 2, 2024

Conversation

thomasbruggink
Copy link
Contributor

@thomasbruggink thomasbruggink commented Feb 25, 2024

This PR submits fixes to the focal and jammy docker images.

  • Bionic support was dropped becaused dotnet 8 no longer supports bionic (Ubuntu 18.04). Moved to old/ like other unmaintained images.
  • Focal/Jammy used the wrong apt location for dotnet, endpoint was 18.04 instead of 20.04/22.04
  • Jammy cannot build Erlang OPT 23 since it depends on OpenSSL 1.1 which was dropped in favor of 3.0. Using Erlang OPT 25 fixes the problem since it depends on OpenSSL 3.0
  • Jammy was installing JDK 11 but lib/java requires Java 17

All containers used the root used to volume map the local files into the running container. This creates a hard to maintain working directory on Linux and MacOS since files form the local user and root user are mixed.
To solve this the new docker files can be build using the UID and GID of the host so the files dont mix. The script uses UID and GID 1000 since these are the default ids for most Linux distros.

Change the travis yml to build with 20.04 instead of 18.04. Removed all traces of 18.04 but it cant be tested locally.

Updated the README to reflect the new build/docker/ directory.

Let me know if you prefer to remove the old/ directory completely instead of moving bionic into it.

@Jens-G Jens-G self-requested a review February 25, 2024 11:30
@Jens-G Jens-G added the build and general CI cmake, automake and build system changes label Feb 25, 2024
@thomasbruggink thomasbruggink force-pushed the fix-docker-build-files branch 2 times, most recently from 0ec38f4 to 6e04107 Compare February 25, 2024 13:09
@sveneld
Copy link
Contributor

sveneld commented Feb 25, 2024

I think the best solution is to remove "old" directory.

@@ -47,7 +47,7 @@ env:
- SCRIPT="cmake.sh"
- BUILD_ARG=""
- BUILD_ENV="-e CC=gcc -e CXX=g++ -e THRIFT_CROSSTEST_CONCURRENCY=4"
- DISTRO=ubuntu-bionic
- DISTRO=ubuntu-focal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As i understand thrift does not use travis and use github actions now. Maybe the better way - delete travis?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure i dont mind deleting this. Not sure what the maintainers think :)

Copy link
Member

@fishy fishy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you will also need to update go to 1.21, as we already dropped support to go 1.20.

coverage.

### Ubuntu ###

* focal (stable, current)
* bionic (previous stable)
* jammy (next stable, WIP)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now jammy is already the current stable and focal is already the previous stable, right? https://ubuntu.com/about/release-cycle

@sveneld
Copy link
Contributor

sveneld commented Feb 27, 2024

Maybe we can use github actions for build docker images? It will be useful, and we did not miss possible bugs

@thomasbruggink
Copy link
Contributor Author

I've deleted the old docker directory and updated the go version to 1.21.
Let me know if you prefer the old directory to stay I can revert the commit.

As for the travis removal, I think this change is slightly off topic from this PR? Perhaps it can be removed in a different PR if others agree.

@Jens-G
Copy link
Member

Jens-G commented Mar 3, 2024

I tried to build using the ReleaseManagement.md document, except for focal, like so:

$ docker run -v $(pwd):/thrift/src:rw -it thrift/thrift-build:ubuntu-focal /bin/bash
container# ./bootstrap.sh && ./configure && make dist

I get this, which seems non-fatal:

libtoolize:   error: './aclocal/libtool.m4' is newer: use '--force' to overwrite
libtoolize:   error: './aclocal/ltversion.m4' is newer: use '--force' to overwrite

and this, which is definitely fatal:

config.status:1796: executing depfiles commands
config.status:1873: cd compiler/cpp       && sed -e '/# am--include-marker/d' Makefile         | make -f - am--depfiles
/tmp/GmSTNBGe:619: *** missing separator.  Stop.
config.status:1878: $? = 2
config.status:1873: cd compiler/cpp/src       && sed -e '/# am--include-marker/d' Makefile         | make -f - am--depfiles
/tmp/GmgeglFa:526: *** missing separator.  Stop.
config.status:1878: $? = 2
config.status:1873: cd lib/cpp       && sed -e '/# am--include-marker/d' Makefile         | make -f - am--depfiles
/tmp/GmUqictg:711: *** missing separator.  Stop.
config.status:1878: $? = 2
config.status:1873: cd lib/cpp/test       && sed -e '/# am--include-marker/d' Makefile         | make -f - am--depfiles
/tmp/GmQdxGKi:738: *** missing separator.  Stop.
config.status:1878: $? = 2
config.status:1873: cd lib/c_glib       && sed -e '/# am--include-marker/d' Makefile         | make -f - am--depfiles
/tmp/GmBtppXK:611: *** missing separator.  Stop.
config.status:1878: $? = 2
config.status:1873: cd lib/c_glib/test       && sed -e '/# am--include-marker/d' Makefile         | make -f - am--depfiles
/tmp/GmrdHNl9:857: *** missing separator.  Stop.
config.status:1878: $? = 2
config.status:1873: cd lib/lua       && sed -e '/# am--include-marker/d' Makefile         | make -f - am--depfiles
/tmp/GmGn2JNQ:587: *** missing separator.  Stop.
config.status:1878: $? = 2
config.status:1873: cd test/c_glib       && sed -e '/# am--include-marker/d' Makefile         | make -f - am--depfiles
/tmp/GmFg7yb8:478: *** missing separator.  Stop.
config.status:1878: $? = 2
config.status:1873: cd test/cpp       && sed -e '/# am--include-marker/d' Makefile         | make -f - am--depfiles
/tmp/GmKD31lr:518: *** missing separator.  Stop.
config.status:1878: $? = 2
config.status:1873: cd tutorial/c_glib       && sed -e '/# am--include-marker/d' Makefile         | make -f - am--depfiles
/tmp/GmcgCIUt:489: *** missing separator.  Stop.
config.status:1878: $? = 2
config.status:1873: cd tutorial/cpp       && sed -e '/# am--include-marker/d' Makefile         | make -f - am--depfiles
/tmp/Gm66qrZO:496: *** missing separator.  Stop.
config.status:1878: $? = 2
config.status:1882: error: in `/thrift/src':
config.status:1884: error: Something went wrong bootstrapping makefile fragments
    for automatic dependency tracking.  Try re-running configure with the
    '--disable-dependency-tracking' option to at least be able to build
    the package (albeit without support for automatic dependency tracking).


Copy link
Member

@Jens-G Jens-G left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be fixed

@thomasbruggink
Copy link
Contributor Author

thomasbruggink commented Mar 4, 2024

@Jens-G thanks for testing this.
I've been unable to re-produce the output you shared, from the message it seems that that autoconf is out of date. Perhaps it's running against an older image?
Can you try to rebuild with the name you provided to override it?

docker build --build-arg uid=$(id -u) --build-arg gid=$(id -g) -t thrift/thrift-build:ubuntu-focal build/docker/ubuntu-focal

If that doesnt work could you try adding --no-cache to force a rebuild? Perhaps the apt cache is outdated.

I did find Haxe and Rust were missing so I added a commit for those.

@sveneld
Copy link
Contributor

sveneld commented Mar 4, 2024

@thomasbruggink can you help with updating php part of container.
Here it is some discussion
https://github.com/apache/thrift/pull/2942/files#r1510420807

@thomasbruggink
Copy link
Contributor Author

@thomasbruggink can you help with updating php part of container.
Here it is some discussion
https://github.com/apache/thrift/pull/2942/files#r1510420807

Sure, you want me to include the change you made as well with regards to xdebug?

@sveneld
Copy link
Contributor

sveneld commented Mar 4, 2024

@thomasbruggink can you help with updating php part of container.
Here it is some discussion
https://github.com/apache/thrift/pull/2942/files#r1510420807

Sure, you want me to include the change you made as well with regards to xdebug?

Yes, please, include my updates in Dockerfile for php.

Windows:

thrift$ docker build -t thrift build/docker/ubuntu-jammy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe tables on line 155 and 174 should be also updated with correct versions of core tools and compiler/languages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated 👍
would be nice to automate this in the future somehow

@Jens-G
Copy link
Member

Jens-G commented Mar 4, 2024

build@6a87fe62b977:/thrift/src$ ./bootstrap.sh && ./configure && make dist
make distclean... ok
Configuring for:
PHP Api Version:         20190902
Zend Module Api No:      20190902
Zend Extension Api No:   320190902
sh: 1: cannot create autom4te.cache/output.0t: Permission denied
autom4te: /usr/bin/m4 failed with exit status: 2
libtoolize:   error: './aclocal/libtool.m4' is newer: use '--force' to overwrite
libtoolize:   error: './aclocal/ltversion.m4' is newer: use '--force' to overwrite
autoheader: cannot rename /tmp/ahgpEtwo/config.hin as config.hin: Permission denied

That's it. But its much quicker now, at least :-D

@thomasbruggink
Copy link
Contributor Author

build@6a87fe62b977:/thrift/src$ ./bootstrap.sh && ./configure && make dist
make distclean... ok
Configuring for:
PHP Api Version:         20190902
Zend Module Api No:      20190902
Zend Extension Api No:   320190902
sh: 1: cannot create autom4te.cache/output.0t: Permission denied
autom4te: /usr/bin/m4 failed with exit status: 2
libtoolize:   error: './aclocal/libtool.m4' is newer: use '--force' to overwrite
libtoolize:   error: './aclocal/ltversion.m4' is newer: use '--force' to overwrite
autoheader: cannot rename /tmp/ahgpEtwo/config.hin as config.hin: Permission denied

That's it. But its much quicker now, at least :-D

My guess here is that the previous container running as root left some files behind with user and group 1. The current docker file changes the user it runs under to use the same user id and group id as the host system has. This way the repository wont be polluted with files a non root user cannot delete.
Can you try removing the files (autom4te.cache/) manually (probably requires root). Or sudo git clean -xfd outside the container to reset the entire repository 🙇

I still get a build error running make dist but I'm not sure if this is because of me missing files so I'm curious about your input there.

/thrift/src/compiler/cpp/thrift --gen cpp ../../../test/AnnotationTest.thrift
/bin/bash: /thrift/src/compiler/cpp/thrift: No such file or directory

I have the idea make dist is supposed to build the thrift compiler as well but this seems not to happen 🤔

@Jens-G
Copy link
Member

Jens-G commented Mar 5, 2024

I have the idea make dist is supposed to build the thrift compiler as well but this seems not to happen 🤔

Apache releases are source code releases.

@Jens-G
Copy link
Member

Jens-G commented Mar 5, 2024

My guess here is that the previous container running as root left some files behind [...]

That was my first thought too, so I removed the whole folder. Same result. Now I basically cloned from scratch and currently the the third attempt to build it runs (the two before stopped with "not enough space", so I also cleaned all docker build caches etc. ). Lets see what that brings ...

EDIT:

$ ./bootstrap.sh 
make distclean... ok
Configuring for:
PHP Api Version:         20190902
Zend Module Api No:      20190902
Zend Extension Api No:   320190902
mkdir: cannot create directory 'build': Permission denied
cp: target '/thrift/src/lib/php/src/ext/thrift_protocol/build' is not a directory
cp: cannot create regular file '/thrift/src/lib/php/src/ext/thrift_protocol/run-tests.php': Permission denied
/usr/bin/phpize: 153: cannot create configure.ac: Permission denied
chmod: cannot access '/thrift/src/lib/php/src/ext/thrift_protocol/build/shtool': No such file or directory
shtool at '/thrift/src/lib/php/src/ext/thrift_protocol/build/shtool' does not exist or is not executable.
Make sure that the file exists and is executable and then rerun this script.

autoscan: cannot open > autoscan.log: Permission denied

@thomasbruggink
Copy link
Contributor Author

My guess here is that the previous container running as root left some files behind [...]

That was my first thought too, so I removed the whole folder. Same result. Now I basically cloned from scratch and currently the the third attempt to build it runs (the two before stopped with "not enough space", so I also cleaned all docker build caches etc. ). Lets see what that brings ...

Hmm I cant reproduce this 🤔 I fixed the setup to allow root builds again, this would at least confirm the image has the correct tools installed.
Build as:

docker build --build-arg uid=0 --build-arg user=root --build-arg gid=0 --build-arg group=root -t thrift/thrift-build:ubuntu-focal build/docker/ubuntu-focal

Can you share a bit about your setup? Are you on a Linux distro or wsl on windows? Running docker rootless? Maybe I can repro this in a VM.

I'm testing under a pretty standard Ubuntu 22.04 installation (not a VM).

@sveneld sveneld mentioned this pull request Mar 8, 2024
5 tasks
@Jens-G
Copy link
Member

Jens-G commented Mar 8, 2024

Can you share a bit about your setup? Are you on a Linux distro or wsl on windows?
Running docker rootless? Maybe I can repro this in a VM.

Mint LDME 6 (the Debian based flavour) and Docker Desktop.

@sveneld
Copy link
Contributor

sveneld commented Mar 15, 2024

@ulidtko maybe you can help to figure the problem?

@thomasbruggink
Copy link
Contributor Author

thomasbruggink commented Mar 15, 2024

Ok I figured it out, the difference seems to be in using docker desktop which even on Linux runs another Linux VM inside.
https://docs.docker.com/desktop/faqs/linuxfaqs/#how-do-i-enable-file-sharing
File sharing under virtioFS seems to by default map all file permissions in the directory from the UID on the host to UID 0 (root) in the container.

So instead the container on Linux with docker desktop should be build like:

docker build -t thrift build/docker/ubuntu-jammy

and can then be started as:

docker run -v $(pwd):/thrift/src -it thrift /bin/bash

inside the container all files will be root but on the host they will automatically be set to whatever the host runs.

I still need to confirm if Mac behaves the same. confirmed that docker-desktop behaves the same under windows, mac and linux. So the current readme should be correct 🙇

@Jens-G could you try and see if the above instructions allow you to ./bootstrap.sh && ./configure && make dist correctly?

@Jens-G
Copy link
Member

Jens-G commented Mar 16, 2024

make  distdir-am
make[6]: Entering directory '/thrift/src/lib/cpp'
 (cd test && make  top_distdir=../../../thrift-0.21.0 distdir=../../../thrift-0.21.0/lib/cpp/test \
     am__remove_distdir=: am__skip_length_check=: am__skip_mode_fix=: distdir)
make[7]: Entering directory '/thrift/src/lib/cpp/test'
/thrift/src/compiler/cpp/thrift --gen cpp ../../../test/AnnotationTest.thrift
/bin/bash: line 1: /thrift/src/compiler/cpp/thrift: No such file or directory
make[7]: *** [Makefile:1880: gen-cpp/AnnotationTest_types.h] Error 127
make[7]: Leaving directory '/thrift/src/lib/cpp/test'
make[6]: *** [Makefile:1685: distdir-am] Error 1
make[6]: Leaving directory '/thrift/src/lib/cpp'
make[5]: *** [Makefile:1681: distdir] Error 2
make[5]: Leaving directory '/thrift/src/lib/cpp'
make[4]: *** [Makefile:672: distdir-am] Error 1
make[4]: Leaving directory '/thrift/src/lib'
make[3]: *** [Makefile:668: distdir] Error 2
make[3]: Leaving directory '/thrift/src/lib'
make[2]: *** [Makefile:791: distdir-am] Error 1
make[2]: Leaving directory '/thrift/src'
make[1]: *** [Makefile:785: distdir] Error 2
make[1]: Leaving directory '/thrift/src'
make: *** [Makefile:894: dist] Error 2

Better, still not perfect. Not sure where this comes from ... mmmh.

This PR submits fixes to the focal and jammy docker images.

* Bionic support was dropped becaused dotnet 8 no longer supports bionic
(Ubuntu 18.04). Moved to `old/` like other unmaintained images.
* Focal/Jammy used the wrong apt location for dotnet, endpoint was 18.04
  instead of 20.04/22.04
* Jammy cannot build Erlang OPT 23 since it depends on OpenSSL 1.1
  which was dropped in favor of 3.0. Using Erlang OPT 25 fixes the
  problem since it depends on OpenSSL 3.0
* Jammy was installing JDK 11 but lib/java requires Java 17

All containers used the `root` used to volume map the local files into
the running container. This creates a hard to maintain working directory
on Linux and MacOS since files form the local user and root user are
mixed.
To solve this the new docker files can be build using the UID and GID of
the host so the files dont mix. The script uses UID and GID 1000 since
these are the default ids for most Linux distros.

Change the travis yml to build with 20.04 instead of 18.04. Removed all
traces of 18.04 but it cant be tested locally.

Updated the README to reflect the new `build/docker/` directory.
@thomasbruggink thomasbruggink force-pushed the fix-docker-build-files branch from cb78349 to 3b387ed Compare March 17, 2024 01:06
@thomasbruggink
Copy link
Contributor Author

thomasbruggink commented Mar 17, 2024

make  distdir-am
make[6]: Entering directory '/thrift/src/lib/cpp'
 (cd test && make  top_distdir=../../../thrift-0.21.0 distdir=../../../thrift-0.21.0/lib/cpp/test \
     am__remove_distdir=: am__skip_length_check=: am__skip_mode_fix=: distdir)
make[7]: Entering directory '/thrift/src/lib/cpp/test'
/thrift/src/compiler/cpp/thrift --gen cpp ../../../test/AnnotationTest.thrift
/bin/bash: line 1: /thrift/src/compiler/cpp/thrift: No such file or directory
make[7]: *** [Makefile:1880: gen-cpp/AnnotationTest_types.h] Error 127
make[7]: Leaving directory '/thrift/src/lib/cpp/test'
make[6]: *** [Makefile:1685: distdir-am] Error 1
make[6]: Leaving directory '/thrift/src/lib/cpp'
make[5]: *** [Makefile:1681: distdir] Error 2
make[5]: Leaving directory '/thrift/src/lib/cpp'
make[4]: *** [Makefile:672: distdir-am] Error 1
make[4]: Leaving directory '/thrift/src/lib'
make[3]: *** [Makefile:668: distdir] Error 2
make[3]: Leaving directory '/thrift/src/lib'
make[2]: *** [Makefile:791: distdir-am] Error 1
make[2]: Leaving directory '/thrift/src'
make[1]: *** [Makefile:785: distdir] Error 2
make[1]: Leaving directory '/thrift/src'
make: *** [Makefile:894: dist] Error 2

Better, still not perfect. Not sure where this comes from ... mmmh.

It seems from local testing that the thrift compiler is required to run make dist correctly. Perhaps the original container did not configure all targets correctly so the dist job for all languages never actually ran?
For example the go library requires compiler/cpp/thrift to exist:

 (cd go && make  top_distdir=../../thrift-0.21.0 distdir=../../thrift-0.21.0/test/go \
     am__remove_distdir=: am__skip_length_check=: am__skip_mode_fix=: distdir)
make[5]: Entering directory '/thrift/src/test/go'
Makefile:664: warning: overriding recipe for target 'check'
Makefile:512: warning: ignoring old recipe for target 'check'
grep -v list.*map.*list.*map ../../test/ThriftTest.thrift > ThriftTest.thrift
mkdir -p src/gen
/thrift/src/compiler/cpp/thrift -out src/gen --gen go:thrift_import=github.com/apache/thrift/lib/go/thrift,package_prefix=github.com/apache/thrift/test/go/src/gen/ ThriftTest.thrift
[WARNING:/thrift/src/test/go/ThriftTest.thrift:43] No generator named 'noexist' could be found!
[WARNING:/thrift/src/test/go/ThriftTest.thrift:45] cpp generator does not accept 'noexist' as sub-namespace!
/thrift/src/compiler/cpp/thrift -out src/gen --gen go:thrift_import=github.com/apache/thrift/lib/go/thrift,package_prefix=github.com/apache/thrift/test/go/src/gen/ ../StressTest.thrift
[WARNING:/thrift/src/test/StressTest.thrift:31] Consider using the more efficient "binary" type instead of "list<byte>".
[WARNING:/thrift/src/test/StressTest.thrift:31] Consider using the more efficient "binary" type instead of "list<byte>".
/thrift/src/compiler/cpp/thrift -out src/gen --gen go:thrift_import=github.com/apache/thrift/lib/go/thrift,package_prefix=github.com/apache/thrift/test/go/src/gen/ ../ConstantsDemo.thrift
touch gopath

So once I ran pushd compiler/cpp/ && make -j $(nproc) && popd && make dist it would run until the PHP job which I fixed here: 3b387ed

Then the job runs till completion and a thrift-0.21.0.tar.gz is created.

I think there are 2 options, 1 is to build the thrift compiler first or 2 is to remove the dependencies on running the thrift compiler. The latter probably requires a bunch more commits.

@Jens-G
Copy link
Member

Jens-G commented Mar 18, 2024

To my understanding make dist should be self-sufficient, i.e. not require the compiler being built before in a separate step. That's also the way in which it behaves right now, I just double-checked it.

@Jens-G
Copy link
Member

Jens-G commented Mar 18, 2024

Perhaps the original container did not configure all targets correctly so the dist job for all languages never actually ran?

Yes and no. It's obviously some automake change. I compared the generated Makefiles and it became rather obvious, because now

distdir: $(BUILT_SOURCES)
	$(MAKE) $(AM_MAKEFLAGS) distdir-am

distdir-am: $(DISTFILES)

but before simply

distdir: $(DISTFILES)

Note the lack of the $(BUILT_SOURCES) dependency.


There's a new option no-dist-built-sources that basically reverts the change as an opt-in feature.

@thomasbruggink
Copy link
Contributor Author

Note the lack of the $(BUILT_SOURCES) dependency.

There's a new option no-dist-built-sources that basically reverts the change as an opt-in feature.

Thanks for the hint 👍 I have no experience with automake so I've tried to figure out where to apply your solution. Updating AM_INIT_AUTOMAKE with the no-dist-built-sources has the problem that it wont work on automake under version 1.16.5 which is not installable through a package manager on ubuntu 20.04.
However even on 1.16.5 it seems this option just doesnt work? I could find others that seem to have run into the same problem.

So the latest commit manually overrides the distdir option but removes the dependecy on BUILT_SOURCES. This works on both 20.04 and 22.04 for me.

Please let me know if this also succeeds for you 🙇 if you happen to know a better solutions then I can redo the last commit.

@Jens-G
Copy link
Member

Jens-G commented Mar 21, 2024

There's a new option no-dist-built-sources that basically reverts the change as an opt-in feature.

However even on 1.16.5 it seems this option just doesnt work? I could find others that seem to have run into the same problem.

Interesting. Yes I did notice, indeed. Obviously I am the first one then that brought the issue up at the right place: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=69908

I'll try your solution tomorrow, thanks for fixing it.

@Jens-G
Copy link
Member

Jens-G commented Mar 24, 2024

I'll try your solution tomorrow, thanks for fixing it.

seems to work now

@Jens-G Jens-G self-requested a review March 30, 2024 21:23
Copy link
Member

@Jens-G Jens-G left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Jens-G
Copy link
Member

Jens-G commented Apr 1, 2024

@fishy @sveneld All fine with this?

@sveneld
Copy link
Contributor

sveneld commented Apr 2, 2024

LGTM

@Jens-G Jens-G merged commit 63f0458 into apache:master Apr 2, 2024
18 of 20 checks passed
@thomasbruggink thomasbruggink deleted the fix-docker-build-files branch July 27, 2024 10:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build and general CI cmake, automake and build system changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants