alpine 3.13, armv7 network-access seems to be broken #135

Closed
jaedle opened this issue Jan 16, 2021 · 26 comments

Comments

@jaedle

jaedle commented Jan 16, 2021

Hey!

Some of my nightly armv7 alpine builds suddenly started failing. It looks like there is a problem installing packages through apk on alpine 3.13 on armv7l.

Unfortunately this means that the latest tag is currently broken for armv7l.

Example commands and output:

> docker container run --rm -it alpine:3.13 sh

/ # cat /etc/alpine-release
3.13.0
/ # uname -a
Linux 424524b1584e 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019 armv7l Linux
/ # apk add --no-cache curl
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/main/armv7/APKINDEX.tar.gz
1996002192:error:0D0D90AD:asn1 encoding routines:ASN1_TIME_adj:error getting time:crypto/asn1/a_time.c:330:
1996002192:error:0D0D90AD:asn1 encoding routines:ASN1_TIME_adj:error getting time:crypto/asn1/a_time.c:330:
1996002192:error:0D0D90AD:asn1 encoding routines:ASN1_TIME_adj:error getting time:crypto/asn1/a_time.c:330:
1996002192:error:0D0D90AD:asn1 encoding routines:ASN1_TIME_adj:error getting time:crypto/asn1/a_time.c:330:
1996002192:error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1913:
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.13/main: Permission denied
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/community/armv7/APKINDEX.tar.gz
1996002192:error:0D0D90AD:asn1 encoding routines:ASN1_TIME_adj:error getting time:crypto/asn1/a_time.c:330:
1996002192:error:0D0D90AD:asn1 encoding routines:ASN1_TIME_adj:error getting time:crypto/asn1/a_time.c:330:
1996002192:error:0D0D90AD:asn1 encoding routines:ASN1_TIME_adj:error getting time:crypto/asn1/a_time.c:330:
1996002192:error:0D0D90AD:asn1 encoding routines:ASN1_TIME_adj:error getting time:crypto/asn1/a_time.c:330:
1996002192:error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1913:
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.13/community: Permission denied
ERROR: unable to select packages:
  curl (no such package):
    required by: world[curl]

Same thing is working perfectly fine on alpine 3.12:

> docker container run --rm -it alpine:3.12 sh

/ # cat /etc/alpine-release
3.12.3
/ # uname -a
Linux 2406cc5a46e9 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019 armv7l Linux
/ # apk add --no-cache curl
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/armv7/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/armv7/APKINDEX.tar.gz
(1/4) Installing ca-certificates (20191127-r4)
(2/4) Installing nghttp2-libs (1.41.0-r0)
(3/4) Installing libcurl (7.69.1-r3)
(4/4) Installing curl (7.69.1-r3)
Executing busybox-1.31.1-r19.trigger
Executing ca-certificates-20191127-r4.trigger
OK: 5 MiB in 18 packages

I started digging deeper. It looks like network access inside the Docker container is broken.

docker container run --rm -it alpine:3.13 sh
/ # ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
ping: clock_gettime(MONOTONIC) failed
/ # nslookup www.google.com
nslookup: clock_gettime(MONOTONIC) failed

When running the container privileged the problems are gone:

docker container run --rm -it --privileged alpine:3.13 sh
/ # nslookup www.google.de
Server:		8.8.8.8
Address:	8.8.8.8:53

Non-authoritative answer:
Name:	www.google.de
Address: 172.217.19.67

Non-authoritative answer:
Name:	www.google.de
Address: 2a00:1450:4005:80b::2003
@jaedle jaedle changed the title apk install fails on alpine:3.13 armv7l network access on alpine 3.13 on armv7 seems to be broken Jan 16, 2021
@jaedle jaedle changed the title network access on alpine 3.13 on armv7 seems to be broken alpine 3.13, armv7 network seems to be broken Jan 16, 2021
@jaedle jaedle changed the title alpine 3.13, armv7 network seems to be broken alpine 3.13, armv7 network-access seems to be broken Jan 16, 2021
@orbsmiv

orbsmiv commented Jan 17, 2021

I'm seeing similar issues:

docker run -it --rm alpine:3.13 ash
/ # apk update
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/main/armv7/APKINDEX.tar.gz
ERROR: https://dl-cdn.alpinelinux.org/alpine/v3.13/main: temporary error (try again later)
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.13/main: No such file or directory
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/community/armv7/APKINDEX.tar.gz
ERROR: https://dl-cdn.alpinelinux.org/alpine/v3.13/community: temporary error (try again later)
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.13/community: No such file or directory
2 errors; 14 distinct packages available
/ # wget https://dl-cdn.alpinelinux.org/alpine/v3.13/main/armv7/APKINDEX.tar.gz
Connecting to dl-cdn.alpinelinux.org (151.101.2.133:443)
ssl_client: dl-cdn.alpinelinux.org: certificate verification failed: format error in certificate's notBefore field
wget: error getting response: Connection reset by peer
/ # date
Sun Jan  0 00:100:4174038  1900

I've checked on three Raspberry Pis and they're all presenting the same issue and showing the date Sun Jan 0 00:100:4174038 1900. Rolling back to alpine:3.12 resolves the issue.

(Similar discussion here: https://gitlab.alpinelinux.org/alpine/aports/-/issues/12091)

@jipp

jipp commented Jan 17, 2021

Hi

Based on the post above (https://gitlab.alpinelinux.org/alpine/aports/-/issues/12091), it looks like another workaround would be to run the latest alpine:3.13.0 without any seccomp security profile, as described here: https://docs.docker.com/engine/security/seccomp/

docker run -it --rm --security-opt seccomp=unconfined alpine:3.13 ping www.google.de
Unable to find image 'alpine:3.13' locally
3.13: Pulling from library/alpine
Digest: sha256:d9a7354e3845ea8466bb00b22224d9116b183e594527fb5b6c3d30bc01a20378
Status: Downloaded newer image for alpine:3.13
PING www.google.de (172.217.21.195): 56 data bytes
64 bytes from 172.217.21.195: seq=0 ttl=115 time=15.151 ms
64 bytes from 172.217.21.195: seq=1 ttl=115 time=41.073 ms
64 bytes from 172.217.21.195: seq=2 ttl=115 time=14.698 ms
^C
--- www.google.de ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 14.698/23.640/41.073 ms

cheers
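
To confirm that the host's seccomp filtering is what breaks the clock, one could run the same probe under the default profile and under `seccomp=unconfined` and compare the results. This is only a minimal sketch: the `run_probe` helper and the `DOCKER` override variable are hypothetical names introduced here for illustration (Docker itself does not read `DOCKER`), and printing the year with `date +%Y` is just one convenient probe.

```shell
#!/bin/sh
# DOCKER may be overridden (e.g. DOCKER=echo) for a dry run that only prints
# the arguments. This variable is an assumption of the sketch, not a Docker feature.
DOCKER=${DOCKER:-docker}

# run_probe [docker-run-options...]
# Runs `date +%Y` inside alpine:3.13, passing any extra options to `docker run`.
run_probe() {
    "$DOCKER" run --rm "$@" alpine:3.13 date +%Y
}

# Only attempt the comparison when the chosen binary is actually available.
if command -v "$DOCKER" >/dev/null 2>&1; then
    echo "default seccomp profile: $(run_probe 2>&1)"
    echo "seccomp=unconfined:      $(run_probe --security-opt seccomp=unconfined 2>&1)"
fi
```

If the two lines differ, the seccomp profile on the host is the likely culprit. Note that `seccomp=unconfined` removes a layer of isolation, so it is better suited to diagnosis than to permanent use.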

@jaedle
Author

jaedle commented Jan 17, 2021

Thanks, @jipp

I'm running quite a lot of containers. Rather than upgrading and modifying security policies, I downgraded everything to 3.12 and will wait for a fix.

@wader

wader commented Jan 17, 2021

Update: This was probably something local, it works fine again after reboot and on another machine.

I've run into something that sounds similar to this when resolving codeload.github.com.

$ docker run --rm -ti alpine:3.13.0
/ # ping codeload.github.com.
ping: bad address 'codeload.github.com.'
/ #
$ docker run --rm -ti alpine:3.12.3
/ # ping codeload.github.com.
PING codeload.github.com. (140.82.121.10): 56 data bytes
64 bytes from 140.82.121.10: seq=0 ttl=37 time=29.817 ms
64 bytes from 140.82.121.10: seq=1 ttl=37 time=30.078 ms
^C

Querying 8.8.8.8 and my router (OpenWrt) works, but going through Docker's DNS server does not:

$ docker run --rm -ti alpine:3.13.0
/ # apk add bind-tools
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/main/x86_64/APKINDEX.tar.gz
...
/ # host codeload.github.com 8.8.8.8
Using domain server:
Name: 8.8.8.8
Address: 8.8.8.8#53
Aliases:

codeload.github.com has address 140.82.121.10
/ # host codeload.github.com 192.168.1.1
Using domain server:
Name: 192.168.1.1
Address: 192.168.1.1#53
Aliases:

codeload.github.com has address 140.82.121.9
/ # host codeload.github.com
codeload.github.com has address 140.82.121.9
Host codeload.github.com not found: 3(NXDOMAIN)
Host codeload.github.com not found: 3(NXDOMAIN)
/ # cat /etc/resolv.conf
# This file is fetched from the host via vpnkit-bridge
nameserver 192.168.65.1

@joepagan

Found this to be an issue downstream and created an issue on a php image repo.

tl;dr - it looks like the issue is ipv4/ipv6 related. Not sure if it's just curl or ipv6 being blocked across the entire container.

You can currently make successful requests by using the --ipv4 flag

@markfqs

markfqs commented Jan 18, 2021

I experienced this exact same issue when 3.12.0 was released, but only on ARMv7.

Seems the root cause is related to clock/time not being properly set:

$ docker run -it alpine:3.13.0
/ # apk update
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/main/armv7/APKINDEX.tar.gz
1995559824:error:0D0D90AD:asn1 encoding routines:ASN1_TIME_adj:error getting time:crypto/asn1/a_time.c:330:
1995559824:error:0D0D90AD:asn1 encoding routines:ASN1_TIME_adj:error getting time:crypto/asn1/a_time.c:330:
1995559824:error:0D0D90AD:asn1 encoding routines:ASN1_TIME_adj:error getting time:crypto/asn1/a_time.c:330:
1995559824:error:0D0D90AD:asn1 encoding routines:ASN1_TIME_adj:error getting time:crypto/asn1/a_time.c:330:
1995559824:error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1913:
ERROR: https://dl-cdn.alpinelinux.org/alpine/v3.13/main: Permission denied
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.13/main: No such file or directory
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/community/armv7/APKINDEX.tar.gz
ERROR: https://dl-cdn.alpinelinux.org/alpine/v3.13/community: temporary error (try again later)
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.13/community: No such file or directory
2 errors; 14 distinct packages available
/ # date
Sun Jan 0 00:100:4174038 1900
/ # sleep 60 && date
Sun Jan 0 00:100:4174038 1900
/ #
/ #
/ # ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
ping: clock_gettime(MONOTONIC) failed
/ #

Note that because the time is not correct, TLS/SSL certificates cannot be validated. The clock is not only wrong; it does not even seem to be "ticking".
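
The two symptoms described here (an implausible date and a clock that does not advance) can be checked with a small script run inside the container. This is a minimal sketch, not an official diagnostic: the helper names `is_uint`, `year_is_sane` and `clock_ticks` are hypothetical, the 2021 threshold is an arbitrary choice based on the date of this thread, and it assumes `date +%s` is supported (it is in GNU coreutils and BusyBox).

```shell
#!/bin/sh
# is_uint VALUE: succeeds only if VALUE is a non-empty string of digits.
is_uint() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# year_is_sane YEAR MIN: succeeds if YEAR is numeric and not earlier than MIN.
year_is_sane() {
    is_uint "$1" && is_uint "$2" && [ "$1" -ge "$2" ]
}

# clock_ticks T0 T1: succeeds if the second epoch reading is later than the first.
clock_ticks() {
    is_uint "$1" && is_uint "$2" && [ "$2" -gt "$1" ]
}

# Take two readings one second apart and report the result.
year=$(date +%Y)
t0=$(date +%s)
sleep 1
t1=$(date +%s)

if year_is_sane "$year" 2021 && clock_ticks "$t0" "$t1"; then
    echo "clock looks sane"
else
    echo "clock looks broken (year=$year, t0=$t0, t1=$t1)"
fi
```

Garbled output such as `00:100:4174038` is rejected by `is_uint`, so a malformed reading is reported as broken rather than causing a shell error.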

@joepagan

Looks like the issue for me was that I needed to update to Docker for Mac 3.1.0.

@tianon
Contributor

tianon commented Jan 18, 2021

You need to update libseccomp on your host to 2.4.2 or newer and Docker to 19.03.9 or newer (see moby/moby#40734).
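
A rough way to check a host against these two minimums is sketched below. It assumes a Debian-based host where the library is packaged as `libseccomp2` and a `sort` that supports `-V` (GNU coreutils does); the `version_ge` and `check_min` helpers are hypothetical names introduced for this example.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on `sort -V` (version sort); the smaller of the two must be B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME HAVE WANT: print whether HAVE meets the minimum WANT.
check_min() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: ok (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
    fi
}

# Assumption: Debian-style packaging. Strip the package revision (e.g. "-1+rpi1").
if command -v dpkg-query >/dev/null 2>&1; then
    seccomp_ver=$(dpkg-query -W -f='${Version}' libseccomp2 2>/dev/null | sed 's/[-+].*//')
    if [ -n "$seccomp_ver" ]; then
        check_min libseccomp "$seccomp_ver" 2.4.2
    fi
fi

# Ask the Docker daemon for its version; skipped if docker is missing or not running.
if command -v docker >/dev/null 2>&1; then
    docker_ver=$(docker version --format '{{.Server.Version}}' 2>/dev/null | sed 's/[-+].*//')
    if [ -n "$docker_ver" ]; then
        check_min docker "$docker_ver" 19.03.9
    fi
fi
```

On hosts that are not Debian-based, the libseccomp version would have to be obtained from that distribution's package manager instead.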

@TBK

TBK commented Jan 19, 2021

Please read https://wiki.alpinelinux.org/wiki/Release_Notes_for_Alpine_3.13.0#time64_requirements

@jipp

jipp commented Jan 19, 2021

Following the wiki I got it running.

Nevertheless, this means that on the default 32-bit Raspbian (armv7) this step is mandatory before Alpine 3.13.0 images can be used.
The same goes for images that use Alpine 3.13.0 as a base.

@jaedle
Author

jaedle commented Jan 19, 2021

@TBK thanks.

I'm running DietPi, which is still based on stretch. AFAIK there is no way to install seccomp with that version.

It feels like getting this fixed without a hacky solution requires a distro change. 😞

@jaedle
Author

jaedle commented Jan 19, 2021

As this is not a bug, but a known problem with old versions that has possible workarounds, I will close this issue.

Thanks for your help!

@peimansh

peimansh commented Feb 14, 2021

Why is this closed ?!
People are still having issues with network access in 3.13 !

@dennis-wielepsky

> Why is this closed ?!
> People are still having issues with network access in 3.13 !

This is a problem with the library versions on the Docker host.

Please read https://wiki.alpinelinux.org/wiki/Release_Notes_for_Alpine_3.13.0#time64_requirements

@RouxAntoine

RouxAntoine commented Jun 19, 2021

Hello,
I set up libseccomp2 version 2.5.1-1 (armhf) on my two Raspberry Pis (a Pi 2, armv7l, and a Pi 1 B+, armv6l) a few months ago and it worked well (thanks @nickwest and @mysystem32).

Today I reformatted one of the two (the Raspberry Pi 1 B+, armv6l) because strange kernel panics were preventing it from rebooting. I installed Raspbian buster 10 and Docker again. Containers start, but name resolution fails again: ping inside a container gives the error ping: clock_gettime(MONOTONIC) failed and date prints Sun Jan 0 00:100:33238 1900. I remembered this issue and installed libseccomp2 again on the fresh setup, and now:

docker rm <container> fails with this error:

Error response from daemon: cannot stop container: nginxfront: Cannot kill container a072dd2619cf29d2340febe4175edb57e6194333ad5485b9255395e914a71c85: unknown error after kill: runc did not terminate successfully: : unknown

and the apt CLI stopped working, crashing with signal 4 (illegal instruction):

sudo apt update
Reading package lists... Done
E: Method http has died unexpectedly!
E: Sub-process http received signal 4.
E: Method /usr/lib/apt/methods/http did not start correctly
E: Method https has died unexpectedly!
E: Sub-process https received signal 4.

~/ $ /usr/lib/apt/methods/http
Illegal instruction

So my questions: do you think something has changed in libseccomp2 or in the Docker DNS resolution problem, so that the workaround no longer works? Do you think my hypothesis that the apt crash is related to libseccomp2 is right?

The apt version is 1.8.2.3 (armhf). I'm available to provide additional information.

Thanks in advance.

Edit: I found this; apparently I am not alone with the kernel panic issue: https://gitlab.alpinelinux.org/alpine/aports/-/issues/12091#note_147886 :)

Edit 2: running wget http://raspbian.raspberrypi.org/raspbian/pool/main/libs/libseccomp/libseccomp2_2.5.1-1+rpi1_armhf.deb and then dpkg -i libseccomp2_2.5.1-1+rpi1_armhf.deb seems to have fixed both my original DNS issue and the illegal instruction in apt.

@ozbillwang

ozbillwang commented Jun 30, 2021

I changed the base image to alpine:3.12 and upgraded libseccomp-dev to a later version (2.5.1-1), but I still see the build issue for the arm platform on an Ubuntu amd64 box.


Since I run the builds on cloud pipelines such as GitHub Actions and Travis CI, I can't change anything on the build agents.

What can I do to fix this issue from my end?

====
By the way, if you need to upgrade to the latest libseccomp on Ubuntu, these are the right commands:

sudo apt-get update
sudo apt-get install -y libseccomp-dev

@jaedle
Author

jaedle commented Jul 2, 2021

@ozbillwang I had the same problems. I pinned the last working version and started to migrate my images to debian.

@ozbillwang

ic. So the problem is only in Ubuntu. Thanks for the update.

@jaedle
Author

jaedle commented Jul 2, 2021

@ozbillwang Sorry for being unclear.

I had the same problems on different host systems (Debian Arm, Ubuntu x86, CI-providers).
Although I really liked Alpine, I stopped using it entirely because of this breaking change.

I am pretty sure you can update your Ubuntu to a compatible libseccomp version, so it's not an Ubuntu-specific problem.

From now on, I base my own Docker images on Debian instead of Alpine.

@RouxAntoine

RouxAntoine commented Jul 2, 2021

Do you think this is related to the Alpine OS inside the container? In my case it is the host's libseccomp version and the host's apt that stopped working. Maybe other in-container OSes don't use libseccomp?

@ozbillwang

ozbillwang commented Jul 3, 2021

@jaedle

OK, so you changed the FROM base image in the Dockerfile from Alpine to Debian, while the build server is still Ubuntu or Debian.

Thanks for the clarification
