dispatcher: Run zero-delay timeout timers on the next iteration of the event loop #11823

antoniovicente · 2020-06-30T19:13:34Z

Commit Message: dispatcher: Run 0-delay timeout timers on the next iteration of the event loop instead of the current iteration.
Additional Description: Processing 0-delay timers in the same loop they are generated can result in long timer callback chains which could starve other operations in the event loop or even result in infinite processing loops. Cases that required same-iteration scheduling behavior for 0-delay timers were refactored to use SchedulableCallback::scheduleCallbackCurrentIteration in #11663, so behavior changes due to this change should be relatively minor.
Risk Level: high, changes to event loop scheduling behavior of timers
Testing: unit
Docs Changes: n/a
Release Notes: n/a
Runtime guard: envoy.reloadable_features.activate_timers_next_event_loop

…d of the current iteration. Guarded by runtime feature "envoy.reloadable_features.activate_timers_next_event_loop" Signed-off-by: Antonio Vicente <avd@google.com>

mattklein123

Thanks this is awesome (especially the comments about libevent). The prod changes LGTM. @jmarantz can you have review the simtime changes and tests?

Signed-off-by: Antonio Vicente <avd@google.com>

jmarantz · 2020-07-02T19:01:21Z

merge master when you get a chance; I will assume the CI failures are due to flakes outside this PR.

jmarantz

awesome work. left some discussion topics, unless you think this effort closes the topic.

source/common/event/libevent_scheduler.h

jmarantz · 2020-07-02T21:45:02Z

/wait

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

antoniovicente

merge master when you get a chance; I will assume the CI failures are due to flakes outside this PR.

Done.

source/common/event/libevent_scheduler.h

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

Signed-off-by: Antonio Vicente <avd@google.com>

antoniovicente · 2020-07-17T16:39:54Z

/wait

@jmarantz PTAL when you have a chance. Thanks.

jmarantz · 2020-07-18T15:40:24Z

/azp run

azure-pipelines · 2020-07-18T15:40:32Z

Azure Pipelines successfully started running 1 pipeline(s).

jmarantz

Basically looks great, modulo a tiny nit and a question about windows vs non-windows.

test/test_common/simulated_time_system_test.cc

jmarantz · 2020-07-18T15:44:28Z

/wait

jmarantz · 2020-07-20T12:54:50Z

Maybe play 'master merge roulette' to see if you get lucky? IDK. FWIW the tsan failure was a segv in ServerTest.LogToFile:

2020-07-18T16:31:54.3830193Z [ RUN      ] IpVersions/ServerInstanceImplTest.LogToFile/IPv4
2020-07-18T16:31:54.3830955Z external/bazel_tools/tools/test/collect_coverage.sh: line 131: 16607 Segmentation fault      (core dumped) "$@"

Super annoying. No idea why that would segv, and there was no stack trace.

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

jmarantz · 2020-07-23T17:47:00Z

In your arm64-release test this failure happens and looks plausibly related

2020-07-23T16:55:24.5113964Z [ RUN      ] TimeSystemType/GuardDogDeathTest.MultiKillThresholdDeathTest/1
2020-07-23T16:55:24.5114490Z TestRandomGenerator running with seed 656857712
2020-07-23T16:55:24.5115292Z -- Test timed out at 2020-07-23 16:55:24 UTC --

antoniovicente · 2020-07-23T19:05:13Z

Maybe play 'master merge roulette' to see if you get lucky? IDK. FWIW the tsan failure was a segv in ServerTest.LogToFile:
2020-07-18T16:31:54.3830193Z [ RUN      ] IpVersions/ServerInstanceImplTest.LogToFile/IPv4
2020-07-18T16:31:54.3830955Z external/bazel_tools/tools/test/collect_coverage.sh: line 131: 16607 Segmentation fault      (core dumped) "$@"
Super annoying. No idea why that would segv, and there was no stack trace.
cc @lizan

This segfault on IpVersions/ServerInstanceImplTest.LogToFile/IPv4 failure has happened at least twice in CI on this PR so it may be a real issue. I don't know how to repo, not having a crash stack makes debugging difficult.

This is a determistic failure. I was able to repo locally in coverage mode with asan. Seems related to use of runtime features on timers created too early in the process. It is triggered by use of "-l trace" when running in coverage mode.

[ RUN ] IpVersions/ServerInstanceImplTest.LogToFile/IPv4
source/common/common/logger_delegates.cc:21:14: runtime error: member call on misaligned address 0xbebebebebebebebe for type 'Envoy::AccessLog::AccessLogFile', which requires 8 byte alignment
0xbebebebebebebebe: note: pointer points here

#0 0x2121692f in Envoy::Logger::FileSinkDelegate::log(std::__1::basic_string_view<char, std::1::char_traits >) ??:?
#1 0x261ccabc in Envoy::Logger::DelegatingLogSink::log(spdlog::details::log_msg const&) ??:?
#2 0x16b75c4a in spdlog::logger::sink_it(spdlog::details::log_msg const&) ??:?
#3 0x16dd6079 in spdlog::logger::log_it(spdlog::details::log_msg const&, bool, bool) ??:?
#4 0x17d916a2 in void spdlog::logger::log<std::__1::basic_string_view<char, std::__1::char_traits > >(spdlog::source_loc, spdlog::level::level_enum, fmt::v6::basic_string_view, std::__1::basic_string_view<char, std::__1::char_traits > const&) ??:?
#5 0x21057a03 in Envoy::Runtime::runtimeFeatureEnabled(std::__1::basic_string_view<char, std::__1::char_traits >) ??:?
#6 0x1fe22052 in Envoy::Event::TimerImpl::TimerImpl(Envoy::CSmartPtr<event_base, &event_base_free>&, std::__1::function<void ()>, Envoy::Event::Dispatcher&) ??:?
#7 0x1fe1bc8a in std::__1::__unique_ifEnvoy::Event::TimerImpl::__unique_single std::__1::make_unique<Envoy::Event::TimerImpl, Envoy::CSmartPtr<event_base, &event_base_free>&, std::__1::function<void ()> const&, Envoy::Event::Dispatcher&>(Envoy::CSmartPtr<event_base, &event_base_free>&, std::__1::function<void ()> const&, Envoy::Event::Dispatcher&) libevent_scheduler.cc:?
#8 0x1fe175f8 in Envoy::Event::LibeventScheduler::createTimer(std::__1::function<void ()> const&, Envoy::Event::Dispatcher&) ??:?
#9 0x1fe155c3 in Envoy::Event::(anonymous namespace)::RealScheduler::createTimer(std::__1::function<void ()> const&, Envoy::Event::Dispatcher&) real_time_system.cc:?
#10 0x1f5486c9 in Envoy::Event::DispatcherImpl::createTimerInternal(std::__1::function<void ()>) ??:?
#11 0x1f5483ab in Envoy::Event::DispatcherImpl::createTimer(std::__1::function<void ()>) ??:?
#12 0x18f389bf in Envoy::AccessLog::AccessLogFileImpl::AccessLogFileImpl(std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >&&, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> >, Envoy::Thread::ThreadFactory&) ??:?
#13 0x18f62a33 in std::__1::__compressed_pair_elem<Envoy::AccessLog::AccessLogFileImpl, 1, false>::__compressed_pair_elem<std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >&&, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&, Envoy::Thread::ThreadFactory&, 0ul, 1ul, 2ul, 3ul, 4ul, 5ul>(std::__1::piecewise_construct_t, std::__1::tuple<std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >&&, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&, Envoy::Thread::ThreadFactory&>, std::__1::__tuple_indices<0ul, 1ul, 2ul, 3ul, 4ul, 5ul>) ??:?
#14 0x18f60852 in std::__1::__compressed_pair<std::__1::allocatorEnvoy::AccessLog::AccessLogFileImpl, Envoy::AccessLog::AccessLogFileImpl>::__compressed_pair<std::__1::allocatorEnvoy::AccessLog::AccessLogFileImpl&, std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >&&, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&, Envoy::Thread::ThreadFactory&>(std::__1::piecewise_construct_t, std::__1::tuple<std::__1::allocatorEnvoy::AccessLog::AccessLogFileImpl&>, std::__1::tuple<std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >&&, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&, Envoy::Thread::ThreadFactory&>) ??:?
#15 0x18f5f0bc in std::__1::__shared_ptr_emplace<Envoy::AccessLog::AccessLogFileImpl, std::__1::allocatorEnvoy::AccessLog::AccessLogFileImpl >::__shared_ptr_emplace<std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&, Envoy::Thread::ThreadFactory&>(std::__1::allocatorEnvoy::AccessLog::AccessLogFileImpl, std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >&&, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&, Envoy::Thread::ThreadFactory&) ??:?
#16 0x18f4f0b5 in ZNSt3__111make_sharedIN5Envoy9AccessLog17AccessLogFileImplEJNS_10unique_ptrINS1_10Filesystem4FileENS_14default_deleteIS6_EEEERNS1_5Event10DispatcherERNS1_6Thread13BasicLockableERNS1_18AccessLogFileStatsERKNS_6chrono8durationIxNS_5ratioILl1ELl1000EEEEERNSD_13ThreadFactoryEEEENS_9enable_ifIXntsr8is_arrayIT_EE5valueENS_10shared_ptrISS_EEE4typeEDpOT0 access_log_manager_impl.cc:?
#17 0x18f37fdc in Envoy::AccessLog::AccessLogManagerImpl::createAccessLog(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&) ??:?
#18 0x212165af in Envoy::Logger::FileSinkDelegate::FileSinkDelegate(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, Envoy::AccessLog::AccessLogManager&, std::__1::shared_ptrEnvoy::Logger::DelegatingLogSink) ??:?
#19 0x1859c8e8 in std::__1::__unique_ifEnvoy::Logger::FileSinkDelegate::__unique_single std::__1::make_unique<Envoy::Logger::FileSinkDelegate, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, Envoy::AccessLog::AccessLogManagerImpl&, std::__1::shared_ptrEnvoy::Logger::DelegatingLogSink >(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, Envoy::AccessLog::AccessLogManagerImpl&, std::__1::shared_ptrEnvoy::Logger::DelegatingLogSink&&) server.cc:?
#20 0x184c18d5 in Envoy::Server::InstanceImpl::InstanceImpl(Envoy::Init::Manager&, Envoy::Server::Options const&, Envoy::Event::TimeSystem&, std::__1::shared_ptr<Envoy::Network::Address::Instance const>, Envoy::ListenerHooks&, Envoy::Server::HotRestart&, Envoy::Stats::StoreRoot&, Envoy::Thread::BasicLockable&, Envoy::Server::ComponentFactory&, std::__1::unique_ptr<Envoy::Random::RandomGenerator, std::__1::default_deleteEnvoy::Random::RandomGenerator >&&, Envoy::ThreadLocal::Instance&, Envoy::Thread::ThreadFactory&, Envoy::Filesystem::Instance&, std::__1::unique_ptr<Envoy::ProcessContext, std::__1::default_deleteEnvoy::ProcessContext >) ??:?
#21 0x16f32984 in std::__1::__unique_ifEnvoy::Server::InstanceImpl::__unique_single std::__1::make_unique<Envoy::Server::InstanceImpl, Envoy::Init::Manager&, testing::NiceMockEnvoy::Server::MockOptions&, Envoy::Event::GlobalTimeSystem&, std::__1::shared_ptrEnvoy::Network::Address::Ipv4Instance, Envoy::DefaultListenerHooks&, testing::NiceMockEnvoy::Server::MockHotRestart&, Envoy::Stats::TestIsolatedStoreImpl&, Envoy::Thread::MutexBasicLockable&, Envoy::Server::TestComponentFactory&, std::__1::unique_ptr<testing::NiceMockEnvoy::Random::MockRandomGenerator, std::__1::default_delete<testing::NiceMockEnvoy::Random::MockRandomGenerator > >, Envoy::ThreadLocal::InstanceImpl&, Envoy::Thread::ThreadFactory&, Envoy::Filesystem::Instance&, std::__1::unique_ptr<Envoy::ProcessContextImpl, std::__1::default_deleteEnvoy::ProcessContextImpl > >(Envoy::Init::Manager&, testing::NiceMockEnvoy::Server::MockOptions&, Envoy::Event::GlobalTimeSystem&, std::__1::shared_ptrEnvoy::Network::Address::Ipv4Instance&&, Envoy::DefaultListenerHooks&, testing::NiceMockEnvoy::Server::MockHotRestart&, Envoy::Stats::TestIsolatedStoreImpl&, Envoy::Thread::MutexBasicLockable&, Envoy::Server::TestComponentFactory&, std::__1::unique_ptr<testing::NiceMockEnvoy::Random::MockRandomGenerator, std::__1::default_delete<testing::NiceMockEnvoy::Random::MockRandomGenerator > >&&, Envoy::ThreadLocal::InstanceImpl&, Envoy::Thread::ThreadFactory&, Envoy::Filesystem::Instance&, std::__1::unique_ptr<Envoy::ProcessContextImpl, std::__1::default_deleteEnvoy::ProcessContextImpl >&&) server_test.cc:?
#22 0x169feb65 in Envoy::Server::(anonymous namespace)::ServerInstanceImplTestBase::initialize(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, bool) server_test.cc:?
#23 0x16a40152 in Envoy::Server::(anonymous namespace)::ServerInstanceImplTestBase::initialize(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&) server_test.cc:?
#24 0x16ae6dd7 in Envoy::Server::(anonymous namespace)::ServerInstanceImplTest_LogToFile_Test::TestBody() server_test.cc:?
#25 0x264f4ee0 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::)(), char const) ??:?
#26 0x264a3dc7 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::)(), char const) ??:?
#27 0x2645c33f in testing::Test::Run() ??:?
#28 0x2645e7df in testing::TestInfo::Run() ??:?
#29 0x264602c0 in testing::TestSuite::Run() ??:?
#30 0x26489a44 in testing::internal::UnitTestImpl::RunAllTests() ??:?
#31 0x2650ba60 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::)(), char const) ??:?
#32 0x264ad6ac in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::)(), char const) ??:?
#33 0x2648846b in testing::UnitTest::Run() ??:?
#34 0x1f2ceff5 in RUN_ALL_TESTS() ??:?
#35 0x1f2cccf9 in Envoy::TestRunner::RunTests(int, char**) ??:?
#36 0x1f2c5de8 in main ??:?
#37 0x7fe619924e0a in __libc_start_main /build/glibc-M65Gwz/glibc-2.30/csu/../csu/libc-start.c:308
#38 0x16905fa9 in _start ??:?

antoniovicente · 2020-07-24T23:14:00Z

Maybe play 'master merge roulette' to see if you get lucky? IDK. FWIW the tsan failure was a segv in ServerTest.LogToFile:
2020-07-18T16:31:54.3830193Z [ RUN      ] IpVersions/ServerInstanceImplTest.LogToFile/IPv4
2020-07-18T16:31:54.3830955Z external/bazel_tools/tools/test/collect_coverage.sh: line 131: 16607 Segmentation fault      (core dumped) "$@"
Super annoying. No idea why that would segv, and there was no stack trace.
cc @lizan
This segfault on IpVersions/ServerInstanceImplTest.LogToFile/IPv4 failure has happened at least twice in CI on this PR so it may be a real issue. I don't know how to repo, not having a crash stack makes debugging difficult.
This is a determistic failure. I was able to repo locally in coverage mode with asan. Seems related to use of runtime features on timers created too early in the process. It is triggered by use of "-l trace" when running in coverage mode.

[ RUN ] IpVersions/ServerInstanceImplTest.LogToFile/IPv4
source/common/common/logger_delegates.cc:21:14: runtime error: member call on misaligned address 0xbebebebebebebebe for type 'Envoy::AccessLog::AccessLogFile', which requires 8 byte alignment
0xbebebebebebebebe: note: pointer points here

#0 0x2121692f in Envoy::Logger::FileSinkDelegate::log(std::__1::basic_string_view<char, std::1::char_traits >) ??:? #1 0x261ccabc in Envoy::Logger::DelegatingLogSink::log(spdlog::details::log_msg const&) ??:? #2 0x16b75c4a in spdlog::logger::sink_it(spdlog::details::log_msg const&) ??:? #3 0x16dd6079 in spdlog::logger::log_it(spdlog::details::log_msg const&, bool, bool) ??:?
#4 0x17d916a2 in void spdlog::logger::log<std::__1::basic_string_view<char, std::__1::char_traits > >(spdlog::source_loc, spdlog::level::level_enum, fmt::v6::basic_string_view, std::__1::basic_string_view<char, std::__1::char_traits > const&) ??:?
#5 0x21057a03 in Envoy::Runtime::runtimeFeatureEnabled(std::__1::basic_string_view<char, std::__1::char_traits >) ??:?
#6 0x1fe22052 in Envoy::Event::TimerImpl::TimerImpl(Envoy::CSmartPtr<event_base, &event_base_free>&, std::__1::function<void ()>, Envoy::Event::Dispatcher&) ??:?
#7 0x1fe1bc8a in std::__1::__unique_ifEnvoy::Event::TimerImpl::__unique_single std::__1::make_unique<Envoy::Event::TimerImpl, Envoy::CSmartPtr<event_base, &event_base_free>&, std::__1::function<void ()> const&, Envoy::Event::Dispatcher&>(Envoy::CSmartPtr<event_base, &event_base_free>&, std::__1::function<void ()> const&, Envoy::Event::Dispatcher&) libevent_scheduler.cc:?
#8 0x1fe175f8 in Envoy::Event::LibeventScheduler::createTimer(std::__1::function<void ()> const&, Envoy::Event::Dispatcher&) ??:?
#9 0x1fe155c3 in Envoy::Event::(anonymous namespace)::RealScheduler::createTimer(std::__1::function<void ()> const&, Envoy::Event::Dispatcher&) real_time_system.cc:?
#10 0x1f5486c9 in Envoy::Event::DispatcherImpl::createTimerInternal(std::__1::function<void ()>) ??:?
#11 0x1f5483ab in Envoy::Event::DispatcherImpl::createTimer(std::__1::function<void ()>) ??:?
#12 0x18f389bf in Envoy::AccessLog::AccessLogFileImpl::AccessLogFileImpl(std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >&&, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> >, Envoy::Thread::ThreadFactory&) ??:?
#13 0x18f62a33 in std::__1::__compressed_pair_elem<Envoy::AccessLog::AccessLogFileImpl, 1, false>::__compressed_pair_elem<std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >&&, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&, Envoy::Thread::ThreadFactory&, 0ul, 1ul, 2ul, 3ul, 4ul, 5ul>(std::__1::piecewise_construct_t, std::__1::tuple<std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >&&, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&, Envoy::Thread::ThreadFactory&>, std::__1::__tuple_indices<0ul, 1ul, 2ul, 3ul, 4ul, 5ul>) ??:?
#14 0x18f60852 in std::__1::__compressed_pair<std::__1::allocatorEnvoy::AccessLog::AccessLogFileImpl, Envoy::AccessLog::AccessLogFileImpl>::__compressed_pair<std::__1::allocatorEnvoy::AccessLog::AccessLogFileImpl&, std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >&&, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&, Envoy::Thread::ThreadFactory&>(std::__1::piecewise_construct_t, std::__1::tuplestd::__1::allocatorEnvoy::AccessLog::AccessLogFileImpl&, std::__1::tuple<std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >&&, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&, Envoy::Thread::ThreadFactory&>) ??:?
#15 0x18f5f0bc in std::__1::__shared_ptr_emplace<Envoy::AccessLog::AccessLogFileImpl, std::__1::allocatorEnvoy::AccessLog::AccessLogFileImpl >::__shared_ptr_emplace<std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&, Envoy::Thread::ThreadFactory&>(std::__1::allocatorEnvoy::AccessLog::AccessLogFileImpl, std::__1::unique_ptr<Envoy::Filesystem::File, std::__1::default_deleteEnvoy::Filesystem::File >&&, Envoy::Event::Dispatcher&, Envoy::Thread::BasicLockable&, Envoy::AccessLogFileStats&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&, Envoy::Thread::ThreadFactory&) ??:?
#16 0x18f4f0b5 in ZNSt3__111make_sharedIN5Envoy9AccessLog17AccessLogFileImplEJNS_10unique_ptrINS1_10Filesystem4FileENS_14default_deleteIS6_EEEERNS1_5Event10DispatcherERNS1_6Thread13BasicLockableERNS1_18AccessLogFileStatsERKNS_6chrono8durationIxNS_5ratioILl1ELl1000EEEEERNSD_13ThreadFactoryEEEENS_9enable_ifIXntsr8is_arrayIT_EE5valueENS_10shared_ptrISS_EEE4typeEDpOT0 access_log_manager_impl.cc:?
#17 0x18f37fdc in Envoy::AccessLog::AccessLogManagerImpl::createAccessLog(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&) ??:?
#18 0x212165af in Envoy::Logger::FileSinkDelegate::FileSinkDelegate(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, Envoy::AccessLog::AccessLogManager&, std::__1::shared_ptrEnvoy::Logger::DelegatingLogSink) ??:?
#19 0x1859c8e8 in std::__1::__unique_ifEnvoy::Logger::FileSinkDelegate::__unique_single std::__1::make_unique<Envoy::Logger::FileSinkDelegate, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, Envoy::AccessLog::AccessLogManagerImpl&, std::__1::shared_ptrEnvoy::Logger::DelegatingLogSink >(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, Envoy::AccessLog::AccessLogManagerImpl&, std::__1::shared_ptrEnvoy::Logger::DelegatingLogSink&&) server.cc:?
#20 0x184c18d5 in Envoy::Server::InstanceImpl::InstanceImpl(Envoy::Init::Manager&, Envoy::Server::Options const&, Envoy::Event::TimeSystem&, std::__1::shared_ptr<Envoy::Network::Address::Instance const>, Envoy::ListenerHooks&, Envoy::Server::HotRestart&, Envoy::Stats::StoreRoot&, Envoy::Thread::BasicLockable&, Envoy::Server::ComponentFactory&, std::__1::unique_ptr<Envoy::Random::RandomGenerator, std::__1::default_deleteEnvoy::Random::RandomGenerator >&&, Envoy::ThreadLocal::Instance&, Envoy::Thread::ThreadFactory&, Envoy::Filesystem::Instance&, std::__1::unique_ptr<Envoy::ProcessContext, std::__1::default_deleteEnvoy::ProcessContext >) ??:?
#21 0x16f32984 in std::__1::__unique_ifEnvoy::Server::InstanceImpl::__unique_single std::__1::make_unique<Envoy::Server::InstanceImpl, Envoy::Init::Manager&, testing::NiceMockEnvoy::Server::MockOptions&, Envoy::Event::GlobalTimeSystem&, std::__1::shared_ptrEnvoy::Network::Address::Ipv4Instance, Envoy::DefaultListenerHooks&, testing::NiceMockEnvoy::Server::MockHotRestart&, Envoy::Stats::TestIsolatedStoreImpl&, Envoy::Thread::MutexBasicLockable&, Envoy::Server::TestComponentFactory&, std::__1::unique_ptr<testing::NiceMockEnvoy::Random::MockRandomGenerator, std::__1::default_delete<testing::NiceMockEnvoy::Random::MockRandomGenerator > >, Envoy::ThreadLocal::InstanceImpl&, Envoy::Thread::ThreadFactory&, Envoy::Filesystem::Instance&, std::__1::unique_ptr<Envoy::ProcessContextImpl, std::__1::default_deleteEnvoy::ProcessContextImpl > >(Envoy::Init::Manager&, testing::NiceMockEnvoy::Server::MockOptions&, Envoy::Event::GlobalTimeSystem&, std::__1::shared_ptrEnvoy::Network::Address::Ipv4Instance&&, Envoy::DefaultListenerHooks&, testing::NiceMockEnvoy::Server::MockHotRestart&, Envoy::Stats::TestIsolatedStoreImpl&, Envoy::Thread::MutexBasicLockable&, Envoy::Server::TestComponentFactory&, std::__1::unique_ptr<testing::NiceMockEnvoy::Random::MockRandomGenerator, std::__1::default_delete<testing::NiceMockEnvoy::Random::MockRandomGenerator > >&&, Envoy::ThreadLocal::InstanceImpl&, Envoy::Thread::ThreadFactory&, Envoy::Filesystem::Instance&, std::__1::unique_ptr<Envoy::ProcessContextImpl, std::__1::default_deleteEnvoy::ProcessContextImpl >&&) server_test.cc:?
#22 0x169feb65 in Envoy::Server::(anonymous namespace)::ServerInstanceImplTestBase::initialize(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, bool) server_test.cc:?
#23 0x16a40152 in Envoy::Server::(anonymous namespace)::ServerInstanceImplTestBase::initialize(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&) server_test.cc:?
#24 0x16ae6dd7 in Envoy::Server::(anonymous namespace)::ServerInstanceImplTest_LogToFile_Test::TestBody() server_test.cc:?
#25 0x264f4ee0 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::)(), char const) ??:?
#26 0x264a3dc7 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::)(), char const) ??:?
#27 0x2645c33f in testing::Test::Run() ??:?
#28 0x2645e7df in testing::TestInfo::Run() ??:?
#29 0x264602c0 in testing::TestSuite::Run() ??:?
#30 0x26489a44 in testing::internal::UnitTestImpl::RunAllTests() ??:?
#31 0x2650ba60 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::)(), char const) ??:?
#32 0x264ad6ac in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::)(), char const) ??:?
#33 0x2648846b in testing::UnitTest::Run() ??:?
#34 0x1f2ceff5 in RUN_ALL_TESTS() ??:?
#35 0x1f2cccf9 in Envoy::TestRunner::RunTests(int, char**) ??:?
#36 0x1f2c5de8 in main ??:?
#37 0x7fe619924e0a in __libc_start_main /build/glibc-M65Gwz/glibc-2.30/csu/../csu/libc-start.c:308
#38 0x16905fa9 in _start ??:?

I think that the two possible ways to fix this issue with use of runtime features too early in the process are:

Remove the log in Envoy::Runtime::runtimeFeatureEnabled Runtime::LoaderSingleton::getExisting() retuns nullptr
Do not add runtime feature guards to this change.

I'm open to suggetions.

lizan · 2020-07-25T01:04:06Z

Remove the log in Envoy::Runtime::runtimeFeatureEnabled Runtime::LoaderSingleton::getExisting() retuns nullptr

Can we do this conditionally? i.e. check if logger is created or not before logging there.

antoniovicente · 2020-07-27T14:49:50Z

Remove the log in Envoy::Runtime::runtimeFeatureEnabled Runtime::LoaderSingleton::getExisting() retuns nullptr

Can we do this conditionally? i.e. check if logger is created or not before logging there.

Logging happens inside the call to Runtime::runtimeFeatureEnabled. I have the impression that some loggers are created but loggers backed by files end up creating timers during the creation process and run into problems.

One possible option would be to check for Runtime::LoaderSingleton::getExisting() before calling Runtime::runtimeFeatureEnabled in the timer class, but that doesn't feel great either.

Signed-off-by: Antonio Vicente <avd@google.com>

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

Signed-off-by: Antonio Vicente <avd@google.com>

Simulated time relinquishes the lock while activating timers. That allows other threads like the watchdog to insert timers that appear to be in the past. Invariant check makes //test/server:server_test fail about 0.5% of the time under ASAN. Signed-off-by: Antonio Vicente <avd@google.com>

Signed-off-by: Antonio Vicente <avd@google.com>

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

Signed-off-by: Antonio Vicente <avd@google.com>

antoniovicente · 2020-08-05T02:24:18Z

Latest asan failure due to timeout in //test/common/network:udp_listener_impl_test is not reproducible over 2500 local runs.
forcing rerun of CI.

mattklein123 · 2020-08-05T21:22:14Z

@antoniovicente sorry for the churn. Can you merge main one more time? I think tidy should be fixed now.

/wait

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

mattklein123 · 2020-08-06T17:00:55Z

Known flake. I will look at that today. @jmarantz are you OK with this change? We can force merge.

Run 0ms timeout timers on the next iteration of the event loop instea…

d970d84

…d of the current iteration. Guarded by runtime feature "envoy.reloadable_features.activate_timers_next_event_loop" Signed-off-by: Antonio Vicente <avd@google.com>

antoniovicente requested review from alyssawilk and mattklein123 as code owners June 30, 2020 19:13

antoniovicente changed the title ~~dispatcher: Run 0ms timeout timers on the next iteration of the event loop~~ dispatcher: Run zero-delay timeout timers on the next iteration of the event loop Jun 30, 2020

mattklein123 self-assigned this Jun 30, 2020

mattklein123 reviewed Jun 30, 2020

View reviewed changes

mattklein123 assigned jmarantz Jun 30, 2020

Attempt to fix windows tests.

bdccb5b

Signed-off-by: Antonio Vicente <avd@google.com>

jmarantz previously approved these changes Jul 2, 2020

View reviewed changes

source/common/event/libevent_scheduler.h Show resolved Hide resolved

source/common/event/libevent_scheduler.h Show resolved Hide resolved

source/common/event/libevent_scheduler.h Show resolved Hide resolved

repokitteh-read-only bot added the waiting label Jul 2, 2020

Merge remote-tracking branch 'upstream/master' into activate_0ms_time…

39abbba

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

antoniovicente commented Jul 6, 2020

View reviewed changes

source/common/event/libevent_scheduler.h Show resolved Hide resolved

source/common/event/libevent_scheduler.h Show resolved Hide resolved

source/common/event/libevent_scheduler.h Show resolved Hide resolved

antoniovicente dismissed jmarantz’s stale review via 39abbba July 6, 2020 17:52

repokitteh-read-only bot removed the waiting label Jul 6, 2020

antoniovicente added 2 commits July 7, 2020 18:43

Merge remote-tracking branch 'upstream/master' into activate_0ms_time…

24d2d71

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

Merge remote-tracking branch 'upstream/master' into activate_0ms_time…

0728c73

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

mattklein123 added the waiting label Jul 15, 2020

antoniovicente added 2 commits July 17, 2020 12:27

Merge remote-tracking branch 'upstream/master' into activate_0ms_time…

ae438b8

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

fix format and test

d915c0f

Signed-off-by: Antonio Vicente <avd@google.com>

repokitteh-read-only bot removed the waiting label Jul 17, 2020

jmarantz reviewed Jul 18, 2020

View reviewed changes

test/test_common/simulated_time_system_test.cc Show resolved Hide resolved

test/test_common/simulated_time_system_test.cc Outdated Show resolved Hide resolved

repokitteh-read-only bot added the waiting label Jul 18, 2020

Merge remote-tracking branch 'upstream/master' into activate_0ms_time…

4e47f7c

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

jmarantz previously approved these changes Jul 23, 2020

View reviewed changes

mattklein123 added the waiting label Jul 24, 2020

Only read runtime feature if singleton exists

dbe5e90

Signed-off-by: Antonio Vicente <avd@google.com>

antoniovicente dismissed jmarantz’s stale review via dbe5e90 July 31, 2020 03:26

repokitteh-read-only bot removed the waiting label Jul 31, 2020

antoniovicente mentioned this pull request Aug 3, 2020

event: scaled range bucket timer #12414

Closed

antoniovicente added 8 commits August 3, 2020 15:57

Merge remote-tracking branch 'upstream/master' into activate_0ms_time…

965b647

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

release the simulated time lock before interacting with libevent

1c46b17

Signed-off-by: Antonio Vicente <avd@google.com>

Kick CI

01cb897

Signed-off-by: Antonio Vicente <avd@google.com>

Kick CI

73abb34

Signed-off-by: Antonio Vicente <avd@google.com>

Merge remote-tracking branch 'upstream/master' into activate_0ms_time…

4b40801

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

Use advanceLibeventTime instead of absl::Sleep in new test cases.

e8a4cb9

Signed-off-by: Antonio Vicente <avd@google.com>

Kick CI

55da797

Signed-off-by: Antonio Vicente <avd@google.com>

repokitteh-read-only bot added the waiting label Aug 5, 2020

Merge remote-tracking branch 'upstream/master' into activate_0ms_time…

88fd249

…rs_next_time Signed-off-by: Antonio Vicente <avd@google.com>

repokitteh-read-only bot removed the waiting label Aug 6, 2020

jmarantz approved these changes Aug 6, 2020

View reviewed changes

mattklein123 merged commit 432ee80 into envoyproxy:master Aug 6, 2020

KBaichoo mentioned this pull request Aug 19, 2020

GuardDogActionsTest flake (ARM optimized) #12638

Closed

antoniovicente mentioned this pull request Sep 18, 2020

event: scaled timers using dynamic queues #13129

Merged

alyssawilk mentioned this pull request Apr 15, 2021

envoy.reloadable_features.activate_timers_next_event_loop deprecation #16003

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dispatcher: Run zero-delay timeout timers on the next iteration of the event loop #11823

dispatcher: Run zero-delay timeout timers on the next iteration of the event loop #11823

antoniovicente commented Jun 30, 2020 •

edited

Loading

mattklein123 left a comment

jmarantz commented Jul 2, 2020

jmarantz left a comment

jmarantz commented Jul 2, 2020

antoniovicente left a comment

antoniovicente commented Jul 17, 2020

jmarantz commented Jul 18, 2020

azure-pipelines bot commented Jul 18, 2020

jmarantz left a comment

jmarantz commented Jul 18, 2020

jmarantz commented Jul 20, 2020

jmarantz commented Jul 23, 2020 •

edited

Loading

antoniovicente commented Jul 23, 2020 •

edited

Loading

antoniovicente commented Jul 24, 2020

lizan commented Jul 25, 2020

antoniovicente commented Jul 27, 2020

antoniovicente commented Aug 5, 2020

mattklein123 commented Aug 5, 2020

mattklein123 commented Aug 6, 2020

dispatcher: Run zero-delay timeout timers on the next iteration of the event loop #11823

dispatcher: Run zero-delay timeout timers on the next iteration of the event loop #11823

Conversation

antoniovicente commented Jun 30, 2020 • edited Loading

mattklein123 left a comment

Choose a reason for hiding this comment

jmarantz commented Jul 2, 2020

jmarantz left a comment

Choose a reason for hiding this comment

jmarantz commented Jul 2, 2020

antoniovicente left a comment

Choose a reason for hiding this comment

antoniovicente commented Jul 17, 2020

jmarantz commented Jul 18, 2020

azure-pipelines bot commented Jul 18, 2020

jmarantz left a comment

Choose a reason for hiding this comment

jmarantz commented Jul 18, 2020

jmarantz commented Jul 20, 2020

jmarantz commented Jul 23, 2020 • edited Loading

antoniovicente commented Jul 23, 2020 • edited Loading

antoniovicente commented Jul 24, 2020

lizan commented Jul 25, 2020

antoniovicente commented Jul 27, 2020

antoniovicente commented Aug 5, 2020

mattklein123 commented Aug 5, 2020

mattklein123 commented Aug 6, 2020

antoniovicente commented Jun 30, 2020 •

edited

Loading

jmarantz commented Jul 23, 2020 •

edited

Loading

antoniovicente commented Jul 23, 2020 •

edited

Loading