Unstable get cloud metadata from "magic" IP #1541

Open
lmq1999 opened this issue Sep 13, 2024 · 2 comments
Labels
kind/bug Something isn't working platform/openstack

Comments

@lmq1999

lmq1999 commented Sep 13, 2024

Description

Fetching metadata via curl http://169.254.169.254/openstack is unreliable: some requests succeed and others time out.

Impact

Without cloud metadata, I can't run some services or the Kubernetes CSI.

Environment and steps to reproduce

  1. Set-up: Flatcar image: flatcar_production_openstack_image.img
pool-g4dzrku5-sj3dtqihuu6cjof6-node-hshs3fbx ~ # cat /etc/os-release
NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=3975.2.0
VERSION_ID=3975.2.0
BUILD_ID=2024-08-05-2103
SYSEXT_LEVEL=1.0
PRETTY_NAME="Flatcar Container Linux by Kinvolk 3975.2.0 (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar.org/"
BUG_REPORT_URL="https://issues.flatcar.org"
FLATCAR_BOARD="amd64-usr"
CPE_NAME="cpe:2.3:o:flatcar-linux:flatcar_linux:3975.2.0:*:*:*:*:*:*:*"
  2. Task:
    Set up network config for the VPN interface and the loopback interface
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # cat /etc/systemd/network
network/       networkd.conf  
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # cat /etc/systemd/network/
.keep_sys-apps_systemd-0  kengine.network           lo.network                
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # cat /etc/systemd/network/kengine.network 
[Match]
Name=kengine

[Link]
Unmanaged=yes
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # cat /etc/systemd/network/lo.network      
[Match]
Name=lo

[Network]
Address=127.0.0.1/8
Address=10.93.0.1/32
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # 
  3. Action(s):
    Restart networkd and try to get metadata
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # systemctl restart systemd-networkd
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # curl http://169.254.169.254/openstack
2012-08-10
2013-04-04
2013-10-17
2015-10-15
2016-06-30
2016-10-06
2017-02-22
2018-08-27
latest
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # curl http://169.254.169.254/openstack
2012-08-10
2013-04-04
2013-10-17
2015-10-15
2016-06-30
2016-10-06
2017-02-22
2018-08-27
latest
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # curl http://169.254.169.254/openstack
curl: (28) Failed to connect to 169.254.169.254 port 80 after 135318 ms: Couldn't connect to server
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # curl http://169.254.169.254/openstack
curl: (28) Failed to connect to 169.254.169.254 port 80 after 134717 ms: Couldn't connect to server

After restarting networkd again:

pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # curl http://169.254.169.254/openstack
curl: (28) Failed to connect to 169.254.169.254 port 80 after 135318 ms: Couldn't connect to server
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # curl http://169.254.169.254/openstack
curl: (28) Failed to connect to 169.254.169.254 port 80 after 134717 ms: Couldn't connect to server
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # curl http://169.254.169.254/openstack
curl: (28) Failed to connect to 169.254.169.254 port 80 after 134778 ms: Couldn't connect to server
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # systemctl restart systemd-networkd
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # curl http://169.254.169.254/openstack
2012-08-10
2013-04-04
2013-10-17
2015-10-15
2016-06-30
2016-10-06
2017-02-22
2018-08-27
latest
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # curl http://169.254.169.254/openstack
2012-08-10
2013-04-04
2013-10-17
2015-10-15
2016-06-30
2016-10-06
2017-02-22
2018-08-27
latest
pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # curl http://169.254.169.254/openstack
2012-08-10
2013-04-04
2013-10-17
2015-10-15
2016-06-30
2016-10-06
2017-02-22
2018-08-27

  4. Error:
    curl: (28) Failed to connect to 169.254.169.254 port 80 after 134778 ms: Couldn't connect to server

Expected behavior

The metadata endpoint should respond every time, like this:

curl http://169.254.169.254/openstack
2012-08-10
2013-04-04
2013-10-17
2015-10-15
2016-06-30
2016-10-06
2017-02-22
2018-08-27
latest

Additional information

ip route

pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # ip r
default via 103.107.182.1 dev eth0 proto dhcp src 103.107.182.231 metric 1024 
10.20.4.0/24 dev eth1 proto kernel scope link src 10.20.4.231 metric 1024 
10.20.4.3 dev eth1 proto dhcp scope link src 10.20.4.231 metric 1024 
10.200.9.0/24 via 10.20.4.129 dev eth1 proto kernel 
10.200.16.0/24 via 10.20.4.137 dev eth1 proto kernel 
10.200.71.0/24 via 10.200.71.46 dev cilium_host proto kernel src 10.200.71.46 
10.200.71.46 dev cilium_host proto kernel scope link 
10.200.75.0/24 via 10.20.4.77 dev eth1 proto kernel 
10.200.79.0/24 via 10.20.4.136 dev eth1 proto kernel 
10.200.85.0/24 via 10.20.4.186 dev eth1 proto kernel 
10.200.86.0/24 via 10.20.4.8 dev eth1 proto kernel 
10.200.88.0/24 via 10.20.4.124 dev eth1 proto kernel 
10.200.90.0/24 via 10.20.4.81 dev eth1 proto kernel 
103.107.182.0/24 dev eth0 proto kernel scope link src 103.107.182.231 metric 1024 
103.107.182.1 dev eth0 proto dhcp scope link src 103.107.182.231 metric 1024 
103.107.182.7 dev eth0 proto dhcp scope link src 103.107.182.231 metric 1024 
169.254.169.254 via 103.107.182.7 dev eth0 proto dhcp src 103.107.182.231 metric 1024 
169.254.169.254 via 10.20.4.3 dev eth1 proto dhcp src 10.20.4.231 metric 1024 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
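
One detail in the table above is that 169.254.169.254 is listed twice, once via eth0 and once via eth1, both with metric 1024. Whether that is related to the timeouts is not established in this thread. The following is a minimal Python sketch, under the assumption that the input is plain `ip route` text as pasted above, that lists destinations appearing more than once with the same metric. The function name `duplicate_routes` is hypothetical.

```python
from collections import defaultdict


def duplicate_routes(ip_route_output):
    """Group `ip route` lines by (destination, metric).

    Returns a dict mapping (destination, metric) to the list of matching
    lines, keeping only groups that contain more than one route.
    """
    groups = defaultdict(list)
    for line in ip_route_output.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        dest = fields[0]  # first token is the destination (or "default")
        # The metric, when present, is the token right after the word "metric".
        idx = fields.index("metric") if "metric" in fields else -1
        metric = fields[idx + 1] if 0 <= idx < len(fields) - 1 else None
        groups[(dest, metric)].append(" ".join(fields))
    return {key: lines for key, lines in groups.items() if len(lines) > 1}
```

Feeding it the `ip r` output above should report the ("169.254.169.254", "1024") pair with its two gateway lines; it only flags the overlap and does not say which route the kernel actually uses.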

networkctl list

pool-x45g4eed-x9ipxe2qz7c3offo-node-oes7cdpq /home/core # networkctl list  
IDX LINK        TYPE     OPERATIONAL SETUP     
  1 lo          loopback routable    configured
  2 eth0        ether    routable    configured
  3 eth1        ether    routable    configured
  4 docker0     bridge   no-carrier  unmanaged
  7 cilium_net  ether    degraded    unmanaged
  8 cilium_host ether    routable    unmanaged
 10 lxc_health  ether    degraded    unmanaged

7 links listed.

Bonus: I also use the Docker OpenVPN config from this issue: #1515

@tormath1
Contributor

Hello, and thanks for the report. We will need some additional information (could you use a gist for the requested logs, to avoid overloading this issue?):

  • journalctl --boot -u systemd-networkd
  • dmesg
  • curl -vvvvv http://169.254.169.254/openstack
  • Do you see the same effect with other IPs / URLs or only the OpenStack one?
  • What is your OpenStack deployment? (managed by a cloud provider, devstack environment, full openstack deployment?)
  • If you have another OS available can you try to reproduce?

Can you confirm that you can ping the metadata server from both interfaces (eth0 and eth1), and that you don't have any security group preventing access to resources on port 80?
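
The per-interface check asked for above can also be done at the TCP level with a short timeout, which avoids waiting over two minutes per attempt as in the curl runs. This is a rough Python sketch, not a tool used in this thread: the function name `can_connect` is hypothetical, and the interface pinning relies on the Linux-only `SO_BINDTODEVICE` socket option, which normally needs root.

```python
import socket


def can_connect(host, port, source_ip=None, device=None, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds in time.

    source_ip: optional local address to bind to before connecting.
    device:    optional interface name to pin the socket to (Linux only,
               usually requires root); errors from this step propagate.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        if device is not None:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE,
                            device.encode())
        if source_ip is not None:
            sock.bind((source_ip, 0))  # port 0 lets the kernel pick one
        try:
            sock.connect((host, port))
            return True
        except OSError:  # covers timeouts, refusals, unreachable networks
            return False


# Example calls, using the addresses shown in the `ip r` output of the report:
# can_connect("169.254.169.254", 80, source_ip="103.107.182.231", device="eth0")
# can_connect("169.254.169.254", 80, source_ip="10.20.4.231", device="eth1")
```

Binding only the source address does not necessarily change which route the kernel picks, which is why the sketch also offers the device option.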

@lmq1999
Author

lmq1999 commented Sep 13, 2024

journalctl: https://gist.github.com/lmq1999/6b36af7053f026988fbf59e08e3c2510
dmesg: https://gist.github.com/lmq1999/7dbaa7b0827e59143d868b4e8fd0ddee
curl:

pool-x45g4eed-x9ipxe2qz7c3offo-node-msmsjtx5 /home/core # curl -vvvvv http://169.254.169.254/openstack
*   Trying 169.254.169.254:80...
* connect to 169.254.169.254 port 80 from 103.148.57.65 port 38364 failed: Connection timed out
* Failed to connect to 169.254.169.254 port 80 after 134043 ms: Couldn't connect to server
* Closing connection
curl: (28) Failed to connect to 169.254.169.254 port 80 after 134043 ms: Couldn't connect to server

Do you see the same effect with other IPs / URLs or only the OpenStack one? => Only the OpenStack metadata IP
What is your OpenStack deployment? => Managed by a cloud provider (Bizflycloud). I already contacted the admins, but they found no clue
If you have another OS available can you try to reproduce? => I don't have this problem with Ubuntu 20

Yes, I can confirm that I can reach the metadata server from both interfaces.

I think the problem is somewhere in the Docker OpenVPN setup I mentioned above:


client
dev kengine
dev-type tap
reneg-sec 0
proto tcp-client
remote 123.31.11.151 10018
resolv-retry infinite
nobind
<ca>
-----BEGIN CERTIFICATE-----
-----END CERTIFICATE-----

</ca>
<key>
-----BEGIN PRIVATE KEY-----
-----END PRIVATE KEY-----
</key>
<cert>
-----BEGIN CERTIFICATE-----
-----END CERTIFICATE-----

</cert>
<tls-auth>
#
# 2048 bit OpenVPN static key
#
-----BEGIN OpenVPN Static key V1-----
-----END OpenVPN Static key V1-----

</tls-auth>
remote-cert-tls server
key-direction 1
script-security 3
keepalive 10 60
persist-key
persist-tun
comp-lzo
verb 3

route-nopull
pull-filter ignore "route-gateway"

If I don't add these config lines:

route-nopull
pull-filter ignore "route-gateway"

it adds the wrong gateway route and this node is unable to access the internet (but can get the metadata).
But if I add these 2 lines, I can access the internet but not the metadata while the container is running.

docker vpn command: openvpn --cd /vpn --config /vpn/kengine.conf --script-security 2 --redirect-gateway def1
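
Since the behaviour changes when the VPN container starts, one way to narrow it down is to save the `ip route` output before and after starting the container and compare the two. Below is a small Python sketch of that comparison; the function name `route_diff` is hypothetical, and it assumes both snapshots are plain `ip route` text.

```python
def route_diff(before, after):
    """Compare two `ip route` snapshots.

    Returns (added, removed): sorted lists of route lines that appear only
    in `after` and only in `before`, respectively. Whitespace is normalised
    so trailing spaces in pasted output do not cause false differences.
    """
    def normalise(text):
        return {" ".join(line.split()) for line in text.splitlines() if line.strip()}

    before_set = normalise(before)
    after_set = normalise(after)
    return sorted(after_set - before_set), sorted(before_set - after_set)
```

Any line for 169.254.169.254 or the default route that shows up in either list would be a candidate to look at; `ip route get 169.254.169.254` additionally shows which route the kernel selects at that moment.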

Projects
Status: 📝 Needs Triage

2 participants