Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'Decoder reached invalid state' in underlying Reqwest client #144

Closed
jnicholls opened this issue Feb 24, 2020 · 11 comments
Closed

'Decoder reached invalid state' in underlying Reqwest client #144

jnicholls opened this issue Feb 24, 2020 · 11 comments

Comments

@jnicholls
Copy link

I posted this issue in the async-compression project as well. I'm seeing a random panic in the async GzipDecoder stream in reqwest. I am uncertain if this is because of the way kube-rs is using reqwest or if it's because of reqwest, or if it's because of the async-compression crate. Having read the code where the panic occurs, it looks like the only way this could occur is if poll_next is called simultaneously from separate threads, i.e. it is not thread-safe. Anyone else seeing this with v0.25.0?

thread 'main' panicked at 'Decoder reached invalid state', /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/async-compression-0.2.0/src/stream/generic/decoder.rs:129:35
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /Users/runner/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
   1: backtrace::backtrace::trace_unsynchronized
             at /Users/runner/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:84
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:61
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1025
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1426
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:65
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:50
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:193
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:210
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:471
  11: bytes::abort
  12: <async_compression::stream::generic::decoder::Decoder<S,D> as futures_core::stream::Stream>::poll_next
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/reqwest-0.10.1/<::std::macros::panic macros>:3
  13: <async_compression::stream::GzipDecoder<S> as futures_core::stream::Stream>::poll_next
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/async-compression-0.2.0/src/stream/macros/decoder.rs:68
  14: <reqwest::async_impl::decoder::imp::Decoder as futures_core::stream::Stream>::poll_next
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/reqwest-0.10.1/src/async_impl/decoder.rs:176
  15: futures_util::stream::stream::StreamExt::poll_next_unpin
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/futures-util-0.3.4/src/stream/stream/mod.rs:1184
  16: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/futures-util-0.3.4/src/stream/stream/next.rs:35
  17: std::future::poll_with_tls_context
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/future.rs:99
  18: reqwest::async_impl::response::Response::chunk::{{closure}}
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/reqwest-0.10.1/src/async_impl/response.rs:288
  19: <std::future::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/future.rs:43
  20: std::future::poll_with_tls_context
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/future.rs:99
  21: kube::client::APIClient::request_events::{{closure}}::{{closure}}::{{closure}}
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/kube-0.25.0/src/client/mod.rs:180
  22: <std::future::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/future.rs:43
  23: <futures_util::stream::unfold::Unfold<T,F,Fut> as futures_core::stream::Stream>::poll_next
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/futures-util-0.3.4/src/stream/unfold.rs:111
  24: <futures_util::stream::stream::map::Map<St,F> as futures_core::stream::Stream>::poll_next
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/futures-util-0.3.4/src/stream/stream/map.rs:92
  25: <futures_util::stream::stream::flatten::Flatten<St> as futures_core::stream::Stream>::poll_next
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/futures-util-0.3.4/src/stream/stream/flatten.rs:96
  26: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/futures-core-0.3.4/src/stream.rs:121
  27: futures_util::stream::stream::StreamExt::poll_next_unpin
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/futures-util-0.3.4/src/stream/stream/mod.rs:1184
  28: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/futures-util-0.3.4/src/stream/stream/next.rs:35
  29: std::future::poll_with_tls_context
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/future.rs:99
  30: kube::api::reflector::Reflector<K>::single_watch::{{closure}}
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/kube-0.25.0/src/api/reflector.rs:214
  31: <std::future::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/future.rs:43
  32: std::future::poll_with_tls_context
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/future.rs:99
  33: kube::api::reflector::Reflector<K>::poll::{{closure}}
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/kube-0.25.0/src/api/reflector.rs:120
  34: <std::future::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/future.rs:43
  35: std::future::poll_with_tls_context
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/future.rs:99
  36: k8s_aws_nlb_proxy_protocol_operator::poll_services::{{closure}}
             at src/main.rs:67
  37: <std::future::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/future.rs:43
  38: tokio::task::core::Core<T>::poll
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.11/src/task/core.rs:128
  39: tokio::task::harness::Harness<T,S>::poll::{{closure}}::{{closure}}
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.11/src/task/harness.rs:120
  40: core::ops::function::FnOnce::call_once
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libcore/ops/function.rs:232
  41: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/panic.rs:318
  42: std::panicking::try::do_call
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/panicking.rs:292
  43: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:78
  44: std::panicking::try
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/panicking.rs:270
  45: std::panic::catch_unwind
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/panic.rs:394
  46: tokio::task::harness::Harness<T,S>::poll::{{closure}}
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.11/src/task/harness.rs:101
  47: tokio::loom::std::causal_cell::CausalCell<T>::with_mut
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.11/src/loom/std/causal_cell.rs:41
  48: tokio::task::harness::Harness<T,S>::poll
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.11/src/task/harness.rs:100
  49: tokio::task::raw::poll
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.11/src/task/raw.rs:162
  50: tokio::task::raw::RawTask::poll
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.11/src/task/raw.rs:113
  51: time::format::parse_items::try_parse_fmt_string
  52: core::ptr::swap_nonoverlapping_bytes
  53: tokio::runtime::basic_scheduler::BasicScheduler<P>::block_on
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.11/src/runtime/basic_scheduler.rs:142
  54: tokio::runtime::Runtime::block_on::{{closure}}
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.11/src/runtime/mod.rs:411
  55: tokio::runtime::context::enter
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.11/src/runtime/context.rs:72
  56: tokio::runtime::handle::Handle::enter
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.11/src/runtime/handle.rs:34
  57: tokio::runtime::Runtime::block_on
             at /Users/jarred.nicholls/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.11/src/runtime/mod.rs:408
  58: k8s_aws_nlb_proxy_protocol_operator::main
             at src/main.rs:23
  59: std::rt::lang_start::{{closure}}
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/rt.rs:67
  60: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:52
  61: std::panicking::try::do_call
             at src/libstd/panicking.rs:292
  62: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:78
  63: std::panicking::try
             at src/libstd/panicking.rs:270
  64: std::panic::catch_unwind
             at src/libstd/panic.rs:394
  65: std::rt::lang_start_internal
             at src/libstd/rt.rs:51
  66: std::rt::lang_start
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/rt.rs:67
  67: k8s_aws_nlb_proxy_protocol_operator::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
@jnicholls
Copy link
Author

According to Nemo157's comment Nullus157/async-compression#79 (comment) he believes the issue is in this crate, particularly in this loop: https://github.com/clux/kube-rs/blob/6fc2fc70903f46c69d8239ade8b3266b3a66ef95/src/client/mod.rs#L165

@jnicholls
Copy link
Author

There may be an improvement in the Stream implementation for async-compression that @Nemo157 will implement and publish for preparation of a 0.10.2+ release of reqwest. There may be a premature panic occurring before a final Poll::Ready(None) poll_next result, that he will address. Stay tuned.

Nullus157/async-compression#79

@clux
Copy link
Member

clux commented Feb 24, 2020

Oh, thank you. That is interesting. Regardless of async-compression's thing, it's possible that a less brittle way of unfolding herein is using try_unfold. I'll have a bit of a go at that in the mean time. If you find the upcoming fix improves situation that will still be good to know though!

@jnicholls
Copy link
Author

Yeah, the only downside is waiting for async-compression 0.3.1 release, and then having that be picked up by reqwest 0.10.3+ (0.10.2 has already been released a few days ago). I suppose it is possible to solve this within kube-rs knowing the behavior of the underlying Stream that would be forward and backward compatible.

@clux
Copy link
Member

clux commented Feb 24, 2020

From reading his comment, it sounded like we could pin async-compression as long as we had 0.10.2. But if not, then yeah, that could be a long time to wait.

@jnicholls
Copy link
Author

That's true! reqwest 0.10.2 uses async-compression "0.3.0" which would pick up version 0.3.1. Ok, well that's great news.

@clux
Copy link
Member

clux commented Feb 24, 2020

Did some preliminary work in #145 - it was rather involved, and still haven't understood all of it, so going to leave it for a bit. Have a busy week unfortunately.

@clux
Copy link
Member

clux commented Feb 24, 2020

Actually did some more stuff just to see how it works in a branch. Feel free to try it out. I am seeing a lot more Reflector restarts when getting EOFs from Reqwest, might be other errors now that we're propagating on.

@jnicholls
Copy link
Author

I'll let you continue digging into this if you feel it is worth while. async-compression put out the 0.3.1 build and the problem seems resolved on the surface (no panics). If you uncover additional issues even after building with 0.3.1 (i.e. do a cargo update) that are bubbling up, then perhaps a separate issue is warranted for those?

Thanks for your work!

@clux
Copy link
Member

clux commented Feb 25, 2020

Yeah, will make a release late tonight. Have fixed another issue in #146

@clux
Copy link
Member

clux commented Feb 25, 2020

Released the bump in 0.26.0 via #145.

@clux clux closed this as completed Feb 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants