reverseproxy: Always return new upstreams (fix #5736) #5752
I believe there's at least a few private-source implementations of dynamic upstreams already. It's a common need to do custom tenancy-based upstream routing. That said, did we mark it experimental? Might be a good idea to do it sooner rather than later if we need to make a change. Can we do it in a backwards compatible way, i.e. a new interface and try asserting both for now (prefer the new one) and remove the old one later? |
It's not experimental AFAIK :( It's OK though, I think I've decided to leave the API as-is. It's just unfortunate for modules that wish to cache their upstreams; they can still cache the dial addresses, but the actual upstreams need to be allocated each time. |
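A rough sketch of what that could look like for a module author: cache only the dial addresses, and build new upstream values on every call. The `Upstream` struct, `cachingSource`, and `SetAddrs` below are simplified, hypothetical stand-ins for illustration, not Caddy's actual types or API.

```go
package main

import (
	"fmt"
	"sync"
)

// Upstream is a simplified, hypothetical stand-in for the proxy's upstream type.
type Upstream struct {
	Dial string // address to dial, e.g. "10.0.0.1:8080"
	Host string // per-request state the handler fills in later (simplified)
}

// String formats an upstream for logging.
func (u *Upstream) String() string {
	return fmt.Sprintf("%s (host=%q)", u.Dial, u.Host)
}

// cachingSource (hypothetical) caches only dial addresses, never *Upstream values.
type cachingSource struct {
	mu    sync.RWMutex
	addrs []string
}

// SetAddrs replaces the cached addresses, e.g. after a lookup refresh.
func (s *cachingSource) SetAddrs(addrs []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.addrs = append([]string(nil), addrs...) // copy so callers can't alias the cache
}

// GetUpstreams allocates a fresh *Upstream per cached address on every call,
// so no two callers ever share a pointer.
func (s *cachingSource) GetUpstreams() []*Upstream {
	s.mu.RLock()
	defer s.mu.RUnlock()
	ups := make([]*Upstream, 0, len(s.addrs))
	for _, a := range s.addrs {
		ups = append(ups, &Upstream{Dial: a})
	}
	return ups
}

func main() {
	src := &cachingSource{}
	src.SetAddrs([]string{"10.0.0.1:8080", "10.0.0.2:8080"})

	a := src.GetUpstreams()
	b := src.GetUpstreams()
	a[0].Host = "set by request A"

	fmt.Println(a[0]) // carries request A's state
	fmt.Println(b[0]) // untouched: same address, separate value
}
```

The cost is one small allocation per address per call; the expensive part (discovering the addresses) stays cached.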
Can |
Possibly! I'd need to fiddle with it. Maybe for a future PR. |
Thanks @mholt for the fix. I haven’t fully tested yet, but initially it looks good. Will report in a few hours. At first glance, it seems that DNS results cannot be cached between Upstream instances, which is probably a bigger performance bottleneck than the memory allocations themselves. |
@kkroo Awesome, thanks.
I believe the results of DNS lookups for both A and SRV upstreams are still cached, we just replace the "container" we return them in each time. |
Thanks for the clarification. There is still a write/write race condition on reverseproxy.go:1080 from different routines but it seems the rest of the races went away. |
@kkroo Dang. Ok, so that's this line, right? `h.HealthChecks.Passive.logger = h.logger.Named("health_checker.passive")` Or am I reading the wrong commit again? Can you give me the exact commit and full stack trace of that race? |
Yes, exactly. Same stack trace as in the original issue. EDIT: I cherry-picked the commit here on top of the PR I had open (unrelated), so line 1080 corresponds to I will check out this PR directly and let you know if anything is somehow different. |
@kkroo Ok, I think I fixed that one (see the newest commit). For some reason the passive health check logger was being set during provisionUpstream instead of the handler's Provision method. So I just moved that line. My tests pass -- can you verify that this fixes it for you? FYI, I am hoping to release v2.7.4 today with a fix for HTTP/3 and this also included. If we hear back from you before then, awesome -- otherwise, I may just merge this since the fix was so simple and I'm fairly confident in it. But will love to get your experience afterward even if that's the case. Thank you again!! |
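To illustrate the shape of that fix, here is a minimal sketch under simplified assumptions: the shared field is written once during provisioning, and the per-request path only reads it. All types here (`handler`, `passiveChecks`, `upstream`) and the `serve` helper are hypothetical, and a string stands in for the real logger.

```go
package main

import (
	"fmt"
	"sync"
)

// passiveChecks is a hypothetical stand-in for the passive health check config.
type passiveChecks struct {
	loggerName string // stands in for a real logger value
}

// handler is a hypothetical, simplified reverse proxy handler.
type handler struct {
	name    string
	passive *passiveChecks
}

// upstream is a hypothetical per-request upstream.
type upstream struct {
	dial       string
	loggerName string
}

// Provision runs once, before any request is served, so writing shared
// handler state here cannot race with request goroutines.
func (h *handler) Provision() {
	if h.passive != nil {
		h.passive.loggerName = fmt.Sprintf("%s.health_checker.passive", h.name)
	}
}

// provisionUpstream runs per request. It only reads shared handler state
// and writes only to the upstream it was handed.
func (h *handler) provisionUpstream(u *upstream) {
	if h.passive != nil {
		u.loggerName = h.passive.loggerName
	}
}

// serve simulates n concurrent requests, each with its own fresh upstream.
func serve(h *handler, n int) []*upstream {
	ups := make([]*upstream, n)
	var wg sync.WaitGroup
	for i := range ups {
		ups[i] = &upstream{dial: fmt.Sprintf("10.0.0.%d:8080", i+1)}
		wg.Add(1)
		go func(u *upstream) {
			defer wg.Done()
			h.provisionUpstream(u)
		}(ups[i])
	}
	wg.Wait()
	return ups
}

func main() {
	h := &handler{name: "reverse_proxy", passive: &passiveChecks{}}
	h.Provision()
	for _, u := range serve(h, 4) {
		fmt.Println(u.dial, u.loggerName)
	}
}
```

If the assignment in `Provision` were moved into `provisionUpstream`, every request goroutine would write the same field, which is the write/write pattern the race detector flags.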
LGTM
* templates: Fix httpInclude (fix caddyserver#5698) Allowable during feature freeze because this is a simple, non-invasive bug fix only. * ci: Use gofumpt to format code (caddyserver#5707) * go.mod: Upgrade golang.org/x/net to 0.14.0 (caddyserver#5718) * ci: Add riscv64 (64-bit RISC-V) to goreleaser (caddyserver#5720) This will add 64-bit RISC-V Linux prebuilts for Caddy. * ci: Update to Go 1.21 (caddyserver#5719) * ci: Update to Go 1.21 * Bump quic-go to v0.37.4 * Check EnableFullDuplex err * Linter bug suppression See timakin/bodyclose#52 --------- Co-authored-by: Francis Lavoie <lavofr@gmail.com> * fileserver: Don't repeat error for invalid method inside error context (caddyserver#5705) * caddytls: Update docs for on-demand config * Fix tests I thought Go ordered JSON objects when marshaling, but I guess not. * cmd: Require config for caddy validate (fix caddyserver#5612) (caddyserver#5614) * Require config for caddy validate - fixes caddyserver#5612 Signed-off-by: Pistasj <hi@pistasjis.net> * Try making adjacent Caddyfile check its own function Signed-off-by: Pistasj <hi@pistasjis.net> * add Francis' suggestion Co-authored-by: Francis Lavoie <lavofr@gmail.com> * Refactor * Fix borked commit, sigh --------- Signed-off-by: Pistasj <hi@pistasjis.net> Co-authored-by: Francis Lavoie <lavofr@gmail.com> Co-authored-by: Matthew Holt <mholt@users.noreply.github.com> * fileserver: Slightly more fitting icons * ci: use gci linter (caddyserver#5708) * use gofmput to format code * use gci to format imports * reconfigure gci * linter autofixes * rearrange imports a little * export GOOS=windows golangci-lint run ./... 
--fix * reverseproxy: Always return new upstreams (fix caddyserver#5736) (caddyserver#5752) * reverseproxy: Always return new upstreams (fix caddyserver#5736) * Fix healthcheck logger race * go.mod: Upgrade CertMagic and quic-go * fix package typo (caddyserver#5764) Signed-off-by: guoguangwu <guoguangwu@magic-shield.com> * fileserver: docs: clarify the ability to produce JSON array with `browse` (caddyserver#5751) * caddyfile: Loosen heredoc parsing (caddyserver#5761) * httpcaddyfile: Stricter errors for site and upstream address schemes (caddyserver#5757) Co-authored-by: Mohammed Al Sahaf <msaa1990@gmail.com> Co-authored-by: Francis Lavoie <lavofr@gmail.com> * update quic-go to v0.37.6 (caddyserver#5767) * caddyfile: Adjust error formatting (caddyserver#5765) * replacer: change timezone to UTC for "time.now.http" placeholders (caddyserver#5774) * chore: Appease gosec linter (caddyserver#5777) These happen to be harmless memory aliasing but I guess the linter can't know that and we can't really prove it in general. * go.mod: Update quic-go to v0.38.0 (caddyserver#5772) * go.mod: Update quic-go to v0.38.0 * run "go mod tidy" --------- Co-authored-by: Matt Holt <mholt@users.noreply.github.com> * caddyfile: Fix case where heredoc marker is empty after newline (caddyserver#5769) Fixes `panic: runtime error: slice bounds out of range [:3] with capacity 2` Co-authored-by: Matt Holt <mholt@users.noreply.github.com> * ci: ensure short-sha is exported correctly on all platforms (caddyserver#5781) * fileserver: Export BrowseTemplate This allows programs embedding Caddy to customize the browse template. 
* logging: Clone array on log filters, prevent side-effects (caddyserver#5786) Fixes https://caddy.community/t/is-caddy-mutating-header-content-from-logging-settings/20947 * logging: query filter for array of strings (caddyserver#5779) Co-authored-by: Matt Holt <mholt@users.noreply.github.com> Co-authored-by: Francis Lavoie <lavofr@gmail.com> * ci: Run govulncheck (caddyserver#5790) * feat(ci): check vuln Go mods in CI * fix(ci): correct directive for govulncheck * refactor(ci): move govulncheck to lint.yml * refactor(lint): move govulncheck to different job * cmd: Prevent overwriting existing env vars with `--envfile` (caddyserver#5803) Co-authored-by: Francis Lavoie <lavofr@gmail.com> * httpcaddyfile: fix placeholder shorthands in named routes (caddyserver#5791) Co-authored-by: Francis Lavoie <lavofr@gmail.com> * reverseproxy: fix nil pointer dereference in AUpstreams.GetUpstreams (caddyserver#5811) fix a nil pointer dereference in AUpstreams.GetUpstreams when AUpstreams.Versions is not set (fixes caddyserver#5809) Signed-off-by: Pascal Vorwerk <info@fossores.de> * fileserver: browse template SVG icons and UI tweaks (caddyserver#5812) * fileserver browse.html UI tweaks: folder-symlink icon, search fileserver browse.html UI tweaks: folder-symlink icon, search - ui - add folder-symlink SVG icon - search: use `<input type="search">` instead of `text` - fix npe with `sizebar.style.width` = null in grid mode * tabify whitespace Co-authored-by: Francis Lavoie <lavofr@gmail.com> --------- Co-authored-by: Francis Lavoie <lavofr@gmail.com> * caddyhttp: Use LimitedReader for HTTPRedirectListener * build(deps): bump actions/checkout from 3 to 4 (caddyserver#5846) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * build(deps): bump goreleaser/goreleaser-action from 4 to 5 (caddyserver#5847) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * fix: caddytest.AssertResponseCode error message (caddyserver#5853) 
* reverseproxy: Allow fallthrough for response handlers without routes (caddyserver#5780) * templates: Add dummy `RemoteAddr` to `httpInclude` request, proxy compatibility (caddyserver#5845) * Enhancement: Allow X-Forwarded-For Header in httpInclude Virtual Requests The goal of this enhancement is to modify the funcHTTPInclude function in the Caddy codebase to include the X-Forwarded-For header in the virtual request. This change will enable reverse proxies to set the X-Forwarded-For header, ensuring that the client's IP address is correctly provided to the target endpoint. This modification is essential for applications that depend on the X-Forwarded-For header for various functionalities, such as authentication, logging, or content customization. * Updated tplcontext.go - set `virtReq.RemoteAddr = "127.0.0.1"` i have made the suggested changes * Apply suggestions from code review * Update modules/caddyhttp/templates/tplcontext.go --------- Co-authored-by: Francis Lavoie <lavofr@gmail.com> * go.mod: Upgrade dependencies incl. x/net/http Possibly important for the HTTP/2 Rapid Reset issue. 
* fileserver: Add command shortcuts `-l` and `-a` (caddyserver#5854) * encode: Add `application/wasm*` to the default content types (caddyserver#5869) * httpcaddyfile: Enable TLS for catch-all site if `tls` directive is specified (caddyserver#5808) * reverseproxy: Fix retries on "upstreams unavailable" error (caddyserver#5841) * reverseproxy: fix parsing Caddyfile fails for unlimited request/response buffers (caddyserver#5828) * cmd: Fix exiting with custom status code, add `caddy -v` (caddyserver#5874) * Simplify variables for commands * Add --envfile support for adapt command * Carry custom status code for commands to os.Exit() * cmd: add `-v` and `--version` to root caddy command * Add `--envfile` to `caddy environ`, extract flag parsing to func --------- Co-authored-by: Mohammed Al Sahaf <msaa1990@gmail.com> * httpcaddyfile: Sort TLS SNI matcher for deterministic JSON output (caddyserver#5860) * httpcaddyfile: Sort TLS SNI matcher, for deterministic adapt output * Update caddyconfig/httpcaddyfile/httptype.go --------- Co-authored-by: Matt Holt <mholt@users.noreply.github.com> * reverseproxy: Replace health header placeholders (caddyserver#5861) * reverseproxy: Add logging for dynamic A upstreams (caddyserver#5857) * reverseproxy: Fix `least_conn` policy regression (caddyserver#5862) * reverseproxy: Add more debug logs (caddyserver#5793) * reverseproxy: Add more debug logs This makes debug logging very noisy when reverse proxying, but I guess that's the point. This has shown to be useful in troubleshooting infrastructure issues. 
* Update modules/caddyhttp/reverseproxy/streaming.go Co-authored-by: Francis Lavoie <lavofr@gmail.com> * Update modules/caddyhttp/reverseproxy/streaming.go Co-authored-by: Francis Lavoie <lavofr@gmail.com> * Add opt-in `trace_logs` option * Rename to VerboseLogs --------- Co-authored-by: Francis Lavoie <lavofr@gmail.com> * tls: Add X25519Kyber768Draft00 PQ "curve" behind build tag (caddyserver#5852) … when compiled with cfgo (https://github.com/cloudflare/go). * fileserver: Set canonical URL on browse template (caddyserver#5867) * Browse.html: Add canonical URL and home-link When contents are equal, but maybe just a sort order is different, it is good to add `<link rel="canonical" href="base-path/" />`. This helps search engines propeely index the page. I also added a link to the home page with the name of `{{.Host}}` just above the bread crumbs to make the page clearer. https://paste.tnonline.net/files/28Wun5CQZiqA_Screenshot_20231007_134435_Opera.png * Update browse.html * ci: Force the Go version for govulncheck (caddyserver#5879) * admin: Respond with 4xx on non-existing config path (caddyserver#5870) Co-authored-by: Matt Holt <mholt@users.noreply.github.com> * caddyfile: Fix variadic placeholder false positive when token contains `:` (caddyserver#5883) * cmd: upgrade: resolve symlink of the executable (caddyserver#5891) * httpcaddyfile: Fix TLS automation policy merging with get_certificate (caddyserver#5896) * templates: Clarify `include` args docs, add `.ClientIP` (caddyserver#5898) * core: quic listener will manage the underlying socket by itself (caddyserver#5749) * core: quic listener will manage the underlying socket by itself. 
* format code * rename sharedQUICTLSConfig to sharedQUICState, and it will now manage the number of active requests * add comment * strict unwrap type * fix unwrap * remove comment * cmd: Add newline character to version string in CLI output (caddyserver#5895) * caddyhttp: Use sync.Pool to reduce lengthReader allocations (caddyserver#5848) * Use sync.Pool to reduce lengthReader allocations Signed-off-by: Harish Shan <140232061+perhapsmaple@users.noreply.github.com> * Add defer putLengthReader to prevent leak Signed-off-by: Harish Shan <140232061+perhapsmaple@users.noreply.github.com> * Cleanup in putLengthReader Co-authored-by: Francis Lavoie <lavofr@gmail.com> --------- Signed-off-by: Harish Shan <140232061+perhapsmaple@users.noreply.github.com> Co-authored-by: Francis Lavoie <lavofr@gmail.com> * core: Apply SO_REUSEPORT to UDP sockets (caddyserver#5725) * core: Apply SO_REUSEPORT to UDP sockets For some reason, 10 months ago when I implemented SO_REUSEPORT for TCP, I didn't realize, or forgot, that it can be used for UDP too. It is a much better solution than using deadline hacks to reuse a socket, at least for TCP. Then mholt/caddy-l4#132 was posted, in which we see that UDP servers never actually stopped when the L4 app was stopped. I verified this using this command: $ nc -u 127.0.0.1 55353 combined with POSTing configs to the /load admin endpoint (which alternated between an echo server and a proxy server so I could tell which config was being used). I refactored the code to use SO_REUSEPORT for UDP, but of course we still need graceful reloads on all platforms, not just Unix, so I also implemented a deadline hack similar to what we used for TCP before. That implementation for TCP was not perfect, possibly having a logical (not data) race condition; but for UDP so far it seems to be working. Verified the same way I verified that SO_REUSEPORT works. I think this code is slightly cleaner and I'm fairly confident this code is effective. 
* Check error * Fix return * Fix var name * implement Unwrap interface and clean up * move unix packet conn to platform specific file * implement Unwrap for unix packet conn * Move sharedPacketConn into proper file * Fix Windows * move sharedPacketConn and fakeClosePacketConn to proper file --------- Co-authored-by: Weidi Deng <weidi_deng@icloud.com> * httpcaddyfile: Remove port from logger names (caddyserver#5881) Co-authored-by: Matt Holt <mholt@users.noreply.github.com> * templates: Delete headers on `httpError` to reset to clean slate (caddyserver#5905) * go.mod: CVE-2023-45142 Update opentelemetry (caddyserver#5908) * go.mod: Upgrade quic-go to v0.39.1 * caddyhttp: Adjust `scheme` placeholder docs (caddyserver#5910) * Upgrade acmeserver to github.com/go-chi/chi/v5 (caddyserver#5913) This commit upgrades the router used in the acmeserver to github.com/go-chi/chi/v5. In the latest release of step-ca, the router used by certificates was upgraded to that version. Fixes caddyserver#5911 Signed-off-by: Mariano Cano <mariano.cano@gmail.com> * test: acmeserver: add smoke test for the ACME server directory (caddyserver#5914) * chore: Fix usage pool comment (caddyserver#5916) * update quic-go to v0.39.3 (caddyserver#5918) * go.mod: update quic-go version to v0.40.0 (caddyserver#5922) * Revert "caddyhttp: Use sync.Pool to reduce lengthReader allocations (caddyserver#5848)" (caddyserver#5924) * fileserver: Add .m4v for browse template icon * httpredirectlistener: Only set read limit for when request is HTTP (caddyserver#5917) * chore: Bump otel to v1.21.0. 
(caddyserver#5949) Signed-off-by: Dan Lorenc <dlorenc@chainguard.dev> * panic when reading from backend failed to propagate stream error (caddyserver#5952) * http2 uses new round-robin scheduler (caddyserver#5946) * templates: Offically make templates extensible (caddyserver#5939) * templates: Offically make templates extensible This supercedes caddyserver#4757 (and caddyserver#4568) by making template extensions configurable. The previous implementation was never documented AFAIK and had only 1 consumer, which I'll notify as a courtesy. * templates: Add 'maybe' function for optional components * Try to fix lint error * tls: accept placeholders in string values of certificate loaders (caddyserver#5963) * tls: loader: accept placeholders in string values * appease the linter * caddytls: Context to DecisionFunc (caddyserver#5923) See caddyserver/certmagic#255 * caddytls: Sync distributed storage cleaning (caddyserver#5940) * caddytls: Log out remote addr to detect abuse * caddytls: Sync distributed storage cleaning * Handle errors * Update certmagic to fix tiny bug * Split off port when logging remote IP * Upgrade CertMagic * chore: cross-build for AIX (caddyserver#5971) * core: Always make AppDataDir for InstanceID (caddyserver#5976) * cmd: Preserve LastModified date when exporting storage (caddyserver#5968) * proxyprotocol: use github.com/pires/go-proxyproto (caddyserver#5915) * proxyprotocol: use github.com/pires/go-proxyproto * Fix typo: r/generelly/generally Co-authored-by: Francis Lavoie <lavofr@gmail.com> * add config options for `Deny` CIDR and fallback policy * use `netip` package & trust unix sockets --------- Co-authored-by: Francis Lavoie <lavofr@gmail.com> * caddyhttp: Add `uuid` to access logs when used (caddyserver#5859) * fileserver: New --precompressed flag (caddyserver#5880) exposes the file_server precompressed functionality to be used with the file-server command Co-authored-by: Matt Holt <mholt@users.noreply.github.com> * fileserver: Enable 
compression for command by default (caddyserver#5855) * feat: enable compression for file-server * refactor * const * Update help text * Update modules/caddyhttp/fileserver/command.go --------- Co-authored-by: Francis Lavoie <lavofr@gmail.com> Co-authored-by: Matt Holt <mholt@users.noreply.github.com> * go.mod: Updated quic-go to v0.40.1 (caddyserver#5983) * metrics: Record request metrics on HTTP errors (caddyserver#5979) * httpcaddyfile: Sort skip_hosts for deterministic JSON (caddyserver#5990) * httpcaddyfile: Sort skip_hosts for deterministic JSON * Update caddyconfig/httpcaddyfile/httptype.go Co-authored-by: Mohammed Al Sahaf <msaa1990@gmail.com> * Fix test * Bah --------- Co-authored-by: Mohammed Al Sahaf <msaa1990@gmail.com> * logging: Add `zap.Option` support (caddyserver#5944) * cmd: use automaxprocs for better perf in containers (caddyserver#5711) * feat: use automaxprocs for better perf in containers * better logs * cs * build(deps): bump golang.org/x/crypto from 0.16.0 to 0.17.0 (caddyserver#5994) Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.16.0 to 0.17.0. - [Commits](golang/crypto@v0.16.0...v0.17.0) --- updated-dependencies: - dependency-name: golang.org/x/crypto dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --------- Signed-off-by: Pistasj <hi@pistasjis.net> Signed-off-by: guoguangwu <guoguangwu@magic-shield.com> Signed-off-by: Pascal Vorwerk <info@fossores.de> Signed-off-by: Harish Shan <140232061+perhapsmaple@users.noreply.github.com> Signed-off-by: Mariano Cano <mariano.cano@gmail.com> Signed-off-by: Dan Lorenc <dlorenc@chainguard.dev> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Matthew Holt <mholt@users.noreply.github.com> Co-authored-by: Jacob Gadikian <jacobgadikian@gmail.com> Co-authored-by: Shyim <github@shyim.de> Co-authored-by: Aaron Dewes <aaron@runcitadel.space> Co-authored-by: Francis Lavoie <lavofr@gmail.com> Co-authored-by: pistasjis <57069715+pistasjis@users.noreply.github.com> Co-authored-by: guangwu <guoguangwu@magic-shield.com> Co-authored-by: Mohammed Al Sahaf <msaa1990@gmail.com> Co-authored-by: Karun Agarwal <113603846+singhalkarun@users.noreply.github.com> Co-authored-by: Marten Seemann <martenseemann@gmail.com> Co-authored-by: WeidiDeng <weidi_deng@icloud.com> Co-authored-by: Paul Jeannot <paul.jeannot95@gmail.com> Co-authored-by: Đỗ Trọng Hải <41283691+hainenber@users.noreply.github.com> Co-authored-by: Evan Van Dam <evandam92@gmail.com> Co-authored-by: Pascal Vorwerk <info@fossores.de> Co-authored-by: glowinthedark <48893368+glowinthedark@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Kévin Dunglas <kevin@dunglas.fr> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: Thanmay Nath <110758050+ThanmayNath@users.noreply.github.com> Co-authored-by: Christoph <github@yozora.eu> Co-authored-by: Fred Cox <mcfedr@gmail.com> Co-authored-by: Bas Westerbaan <bas@westerbaan.name> Co-authored-by: Forza <68693597+Forza-tng@users.noreply.github.com> Co-authored-by: Norman Soetbeer 
<norman.soetbeer@gmail.com> Co-authored-by: Harish Shan <140232061+perhapsmaple@users.noreply.github.com> Co-authored-by: Ethan Brown (Domino) <111539728+ddl-ebrown@users.noreply.github.com> Co-authored-by: Mariano Cano <mariano.cano@gmail.com> Co-authored-by: dlorenc <lorenc.d@gmail.com> Co-authored-by: Andreas Kohn <andreas.kohn@gmail.com> Co-authored-by: Benjamin Marwell <bmarwell@apache.org> Co-authored-by: Aziz Rmadi <46684200+armadi1809@users.noreply.github.com> Co-authored-by: Jens-Uwe Mager <jum@anubis.han.de> Co-authored-by: David DeMoss <ddemoss222@gmail.com> Co-authored-by: Tim Geoghegan <timgeog@gmail.com>
Should fix #5736.
I don't love this, because it involves allocating new `Upstream`s for every request, but there's no other elegant, safe way to do this AFAIK.

One potential solution could be to have the `GetUpstreams()` method return `[]Upstream` instead of `[]*Upstream` -- this would be a breaking change but AFAIK nobody else is implementing these modules (yet?). It would also be a correctness bugfix, since the current API returns the same upstream values but we change the Host value. (Even if synchronized, it'd logically be wrong.)

@kkroo Would you be able to give this a try and confirm whether it resolves the races for you, at least? I might still implement the fix differently and ask you to test one more time, but if we verify as we go I can be more confident of the fix. Thank you!!
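For anyone following along, the aliasing problem can be reproduced without any concurrency at all. This is a minimal sketch with hypothetical, simplified types (`Upstream` with a string `Host`, a `sharedSource` that hands out cached pointers, and a `handle` helper standing in for per-request work); it is not the actual Caddy code.

```go
package main

import "fmt"

// Upstream is a simplified, hypothetical stand-in for the proxy's upstream type.
type Upstream struct {
	Dial string
	Host string // per-request state, simplified to a string here
}

// sharedSource mimics the problematic pattern: it hands out the same
// pointers on every call.
type sharedSource struct {
	cached []*Upstream
}

// GetUpstreams returns the cached pointers directly, so all callers alias them.
func (s *sharedSource) GetUpstreams() []*Upstream {
	return s.cached
}

// handle stands in for per-request work that fills in Host on the chosen upstream.
func handle(ups []*Upstream, requestID string) *Upstream {
	u := ups[0]
	u.Host = fmt.Sprintf("host-state-for-%s", requestID)
	return u
}

func main() {
	src := &sharedSource{cached: []*Upstream{{Dial: "10.0.0.1:8080"}}}

	a := handle(src.GetUpstreams(), "A")
	b := handle(src.GetUpstreams(), "B")

	fmt.Println(a == b) // true: both requests got the same pointer
	fmt.Println(a.Host) // host-state-for-B: request A now sees B's state
}
```

Adding a mutex around the write would silence the race detector but leave request A reading request B's state, which is why synchronization alone would still be logically wrong.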
Update: after thinking on this some more, I guess this is probably the best I can think of for right now. Using `Upstream` isn't really viable for a number of complex reasons, and while the allocations around `Upstream` are unfortunate, they only happen when using dynamic upstreams, which is rare.