How to auto update through a HTTP proxy? #379

Open · basvdlei opened this issue Feb 7, 2020 · 16 comments

Comments

@basvdlei commented Feb 7, 2020

What is the canonical way to configure Zincati and rpm-ostree (and maybe other components) to use an HTTP proxy, so that auto-updates work?

And is there a list of the domains that need to be reachable for auto-updates to work?

Scenario:
FCOS nodes are running inside a secure zone without routes to public internet services. The only way to access public resources is through an HTTP proxy with domain whitelisting.

@cgwalters (Member)

coreos/rpm-ostree#208 (comment)
And doing the same for Zincati should work, I think; if it does, we can add this to the docs.

See also how in OpenShift 4 we have a high-level proxy setting; the MCO wires it up globally for the operating system components.

@dustymabe (Member)

Yeah, I had linked @basvdlei to coreos/rpm-ostree#762 (comment) (should we close that issue?).

I asked @basvdlei to open an issue here because it either works and we need to add docs for it, or it doesn't work and we need to make it work and then add docs 😄

Thanks @basvdlei

@basvdlei (Author) commented Feb 7, 2020

Some quick lab testing shows that rpm-ostree honors the HTTP_* environment variables but Zincati does not.

FCOS Config:

variant: fcos
version: 1.0.0
storage:
  files:
    - path: /etc/proxy.env
      mode: 0644
      contents:
        inline: |
          ALL_PROXY="http://192.168.121.1:3128"
          HTTP_PROXY="http://192.168.121.1:3128"
          HTTPS_PROXY="http://192.168.121.1:3128"
          NO_PROXY="localhost,127.0.0.1,example.com"
systemd:
  units:
    - name: "rpm-ostreed.service"
      dropins:
        - name: "99-proxy.conf"
          contents: |
            [Service]
            EnvironmentFile=/etc/proxy.env
    - name: "zincati.service"
      dropins:
        - name: "99-proxy.conf"
          contents: |
            [Service]
            EnvironmentFile=/etc/proxy.env

rpm-ostree:

[core@master-01 ~]$ sudo rpm-ostree upgrade
1 metadata, 0 content objects fetched; 569 B transferred in 1 seconds
No upgrade available.

[core@master-01 ~]$ journalctl --lines 5 -u rpm-ostreed
-- Logs begin at Fri 2020-02-07 22:38:35 UTC, end at Fri 2020-02-07 23:13:19 UTC. --
Feb 07 23:12:47 master-01 rpm-ostree[18838]: Initiated txn Upgrade for client(id:cli dbus:1.368 unit:session-3.scope uid:0): /org/projectatomic/rpmostree1/fedora_coreos
Feb 07 23:12:49 master-01 rpm-ostree[18838]: libostree pull from 'fedora' for fedora/x86_64/coreos/stable complete
                                                        security: GPG: commit http: TLS
                                                        non-delta: meta: 1 content: 0
                                                        transfer: secs: 1 size: 569 bytes
Feb 07 23:12:49 master-01 rpm-ostree[18838]: Txn Upgrade on /org/projectatomic/rpmostree1/fedora_coreos successful
Feb 07 23:12:49 master-01 rpm-ostree[18838]: client(id:cli dbus:1.368 unit:session-3.scope uid:0) vanished; remaining=0
Feb 07 23:12:49 master-01 rpm-ostree[18838]: In idle state; will auto-exit in 63 seconds

Zincati:

[core@master-01 ~]$ sudo systemctl restart zincati.service 
[core@master-01 ~]$ journalctl --lines 5 -u zincati.service
-- Logs begin at Fri 2020-02-07 22:38:35 UTC, end at Fri 2020-02-07 23:15:37 UTC. --
Feb 07 23:15:35 master-01 zincati[22175]: [INFO ] starting update agent (zincati 0.0.6)
Feb 07 23:15:36 master-01 zincati[22175]: [INFO ] Cincinnati service: https://updates.coreos.stg.fedoraproject.org
Feb 07 23:15:36 master-01 zincati[22175]: [INFO ] agent running on node '1f857c0a091e4fa4bd18014e9f6965aa', in update group 'default'
Feb 07 23:15:36 master-01 zincati[22175]: [INFO ] initialization complete, auto-updates logic enabled
Feb 07 23:15:37 master-01 zincati[22175]: [ERROR] failed to check Cincinnati for updates: client-side error: https://updates.coreos.stg.fedoraproject.org/v1/graph?os_version=31.20200118.3.0&group=default&node_uuid=1f857c0a091e>
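
One way to double-check which proxy variables actually reached each daemon is to inspect the live environment of the running processes (a generic sketch; values from EnvironmentFile= only show up once the unit has been restarted with the drop-in in place):

# Dump and filter the environment of the rpm-ostreed and zincati daemons
sudo cat /proc/$(systemctl show -p MainPID --value rpm-ostreed.service)/environ | tr '\0' '\n' | grep -i proxy
sudo cat /proc/$(systemctl show -p MainPID --value zincati.service)/environ | tr '\0' '\n' | grep -i proxy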

@jamescassell (Collaborator)

Please use the lower case versions for proxy vars. Curl ignores the upper case HTTP_PROXY env var.
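
A quick way to see the difference from a shell (a sketch with a made-up proxy address):

# curl honors the lower-case variable and sends the request through the proxy...
http_proxy=http://192.168.121.1:3128 curl -sI http://example.com
# ...but deliberately ignores the upper-case HTTP_PROXY (to avoid clashing with the CGI
# convention of exposing request headers as HTTP_* variables), so this request goes direct:
HTTP_PROXY=http://192.168.121.1:3128 curl -sI http://example.com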

@basvdlei (Author) commented Feb 8, 2020

Ah, the dubious http_proxy casing standard :-) On Container Linux everything was Go-based, which looked at the upper-case variables first, so I got away with keeping the exported variables upper case.

It looks like the reqwest crate used by Zincati reads both the upper- and lower-case proxy vars, but it only does so by default since version 0.10. Zincati still uses 0.9.24, but coreos/zincati#196 should fix that.

@basvdlei (Author) commented Feb 8, 2020

In addition, it seems the reqwest crate does not support the NO_PROXY variable: seanmonstar/reqwest#705. This will be problematic when using, for example, the FleetLock strategy, where the Airlock endpoint might not be reachable through the proxy.

@dustymabe (Member)

Thanks @basvdlei for helping investigate!

@lucab (Contributor) commented Feb 10, 2020

It looks like this ticket contains a few sub-tasks:

@basvdlei (Author)

Thanks for the follow-up!

FWIW, I'm personally not a fan of using DefaultEnvironment= to set the proxy. As shown in this ticket, not all libraries interpret HTTP_PROXY and NO_PROXY in exactly the same way. DefaultEnvironment= configures the variables for all services on the node, and from a least-privilege point of view I would like to grant proxy access only to the services that require outbound connections.

For example, in our case we do not want docker.service to know or do anything about the internet.
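
For contrast, the global approach I'd rather avoid would be a manager-level drop-in along these lines (just a sketch; the file name is arbitrary), which makes every unit on the host inherit the proxy variables:

# /etc/systemd/system.conf.d/10-proxy.conf
[Manager]
DefaultEnvironment="https_proxy=http://192.168.121.1:3128" "HTTPS_PROXY=http://192.168.121.1:3128" "no_proxy=localhost,127.0.0.1,example.com"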

@dustymabe (Member)

Zincati 0.0.9 just landed in the latest FCOS testing release (31.20200323.2.0). @lucab, what is left to do for this issue?

@fifofonix

I can confirm that I have been able to install a cluster of FCOS servers using the latest testing image, both on VMware Fusion and on vSphere behind a corporate proxy, and that in TRACE mode (-vvv) the Zincati agent successfully reports details of the OS graph (nodes/edges etc.). Prior to this update it was timing out. I haven't actually seen an OS auto-update event processed yet (obviously), but hopefully we will see that happen with the next version of FCOS.

@lucab (Contributor) commented Mar 31, 2020

I'd stick with @fifofonix's suggestion and wait to observe a successful auto-update through the proxy, then document the setup. Once we have docs, we can close this.
I've moved the proxy-exclusion RFE to coreos/zincati#254.

@fifofonix once it's confirmed working, can you please post your fcct snippet to https://github.com/coreos/fedora-coreos-docs so that it can be properly documented?

@fifofonix commented Apr 2, 2020

Good news. This morning I found that my cluster had successfully upgraded through a corporate proxy as expected, so we can definitely close this.

Note: I'm using Grafana/Prometheus with node_exporter for operational visibility, and node_uname_info for the upgraded nodes now shows 5.5.10-200.fc31.x86_64. I don't love the fact that this isn't easy to tie to the FCOS release number, but it didn't tie for CL either.

As requested, the fcct snippets I've used:

storage:
  files:
    - path: /etc/corp-proxy.env
      mode: 0644
      contents:
        inline: |
          all_proxy="http://corp-proxy:8000"
          http_proxy="http://corp-proxy:8000"
          HTTP_PROXY="http://corp-proxy:8000"
          https_proxy="http://corp-proxy:8000"
          HTTPS_PROXY="http://corp-proxy:8000"
          no_proxy="127.0.0.1,0.0.0.0,localhost"
    - path: /etc/zincati/config.d/51-rollout-wariness.toml
      mode: 0644
      contents:
        inline: |
          [identity]
          # 0.001 meaning we are not wary at all and so update us now...
          rollout_wariness = 0.001
          [updates]
          strategy= "immediate"
systemd:
  units:
    - name: zincati.service
      enabled: true
      dropins:
        - name: 99-http-proxy.conf
          contents: |
            [Service]
            # Next line can be used to increase verbosity of logging -vv or -vvv for TRACE logging...
            Environment=ZINCATI_VERBOSITY="-v"
            # Set env variables with file.  Could set explicitly instead.  https_proxy is key.
            EnvironmentFile=/etc/corp-proxy.env
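
(For reference, snippets like these get merged into a full FCC and transpiled to Ignition with fcct; roughly along these lines, assuming the containerized fcct of that era and its --pretty/--strict flags:)

podman run -i --rm quay.io/coreos/fcct:release --pretty --strict < config.fcc > config.ign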

@lucab (Contributor) commented Apr 2, 2020

@fifofonix thanks for the feedback! I'll try to run with your snippet and assemble a reasonable docs page; I'll ping you once it's there.
If you are already running Prometheus, you may want to grab Zincati metrics too. Those contain deployment details, including the OS version.
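
A quick way to peek at those metrics locally (a sketch, assuming Zincati's default Unix-socket metrics endpoint at /run/zincati/public/metrics.promsock):

sudo curl --unix-socket /run/zincati/public/metrics.promsock http://localhost/metrics | grep '^zincati'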

@basvdlei (Author) commented Apr 6, 2020

I've also been able to update from 31.20200323.2.0 to 31.20200323.2.1 in my lab setup, using an HTTP proxy.

Initially it didn't work and Zincati was logging 503s:

zincati[914]: [INFO ] starting update agent (zincati 0.0.9)
zincati[914]: [INFO ] Cincinnati service: https://updates.coreos.stg.fedoraproject.org
zincati[914]: [INFO ] agent running on node '1642f11dfbc04e028c562f64fa50340b', in update group 'default'
zincati[914]: [INFO ] initialization complete, auto-updates logic enabled
zincati[914]: [ERROR] failed to check Cincinnati for updates: server-side error, code 503: (unknown/generic server error)
zincati[914]: [ERROR] failed to check Cincinnati for updates: server-side error, code 503: (unknown/generic server error)

Subsequent retries with more verbose logging did work, so I guess this was an issue with the Cincinnati servers at that time (2020-04-06 08:38:01 UTC).

Checking the proxy's access logs shows the following CONNECT requests:

TCP_TUNNEL/200 11301 CONNECT updates.coreos.stg.fedoraproject.org:443 - HIER_DIRECT/209.132.181.5 -
TCP_TUNNEL/200 11301 CONNECT updates.coreos.stg.fedoraproject.org:443 - HIER_DIRECT/209.132.181.5 -
TCP_TUNNEL/200 11301 CONNECT updates.coreos.stg.fedoraproject.org:443 - HIER_DIRECT/209.132.181.5 -
TCP_TUNNEL/200 5379 CONNECT d2uk5hbyrobdzx.cloudfront.net:443 - HIER_DIRECT/54.192.86.92 -
TCP_TUNNEL/200 37570 CONNECT ostree.fedoraproject.org:443 - HIER_DIRECT/67.219.144.68 -
TCP_TUNNEL/200 21856 CONNECT d2uk5hbyrobdzx.cloudfront.net:443 - HIER_DIRECT/54.192.86.92 -
TCP_TUNNEL/200 5433 CONNECT ostree.fedoraproject.org:443 - HIER_DIRECT/67.219.144.68 -
TCP_TUNNEL/200 5378 CONNECT d2uk5hbyrobdzx.cloudfront.net:443 - HIER_DIRECT/54.192.86.92 -
TCP_TUNNEL/200 37570 CONNECT ostree.fedoraproject.org:443 - HIER_DIRECT/67.219.144.68 -
TCP_TUNNEL/200 76226745 CONNECT d2uk5hbyrobdzx.cloudfront.net:443 - HIER_DIRECT/54.192.86.92 -
TCP_TUNNEL/200 10151 CONNECT d2uk5hbyrobdzx.cloudfront.net:443 - HIER_DIRECT/54.192.86.92 -
TCP_TUNNEL/200 5433 CONNECT ostree.fedoraproject.org:443 - HIER_DIRECT/67.219.144.68 -
TCP_TUNNEL/200 11301 CONNECT updates.coreos.stg.fedoraproject.org:443 - HIER_DIRECT/209.132.181.5 -
TCP_TUNNEL/200 11301 CONNECT updates.coreos.stg.fedoraproject.org:443 - HIER_DIRECT/209.132.181.5 -

Am I correct to assume that whitelisting the following domains pretty much guarantees updates will work?

  • *.fedoraproject.org
  • *.cloudfront.net

@lucab (Contributor) commented Apr 6, 2020

@basvdlei the 503 is likely a transient hiccup in a CDN/proxy (you can check the metrics if that happens too often; my canaries also saw several this morning).

The domains you see above are:

  • updates.coreos.stg.fedoraproject.org and updates.coreos.fedoraproject.org - Cincinnati updates metadata
  • ostree.fedoraproject.org - OSTree repository
  • *.cloudfront.net - CDN for OSTree repository

There are likely a few more domains related to RPM packages (content and metadata). You may be able to catch them by observing the traffic of rpm-ostree update and rpm-ostree install <somepackage>.
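
If the proxy is Squid (which the TCP_TUNNEL/HIER_DIRECT access-log lines above suggest), a minimal whitelist could look roughly like this (a sketch; the ACL name is made up):

# squid.conf: only allow traffic to the FCOS update and OSTree domains
acl fcos_updates dstdomain .fedoraproject.org .cloudfront.net
http_access allow fcos_updates
http_access deny all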
