
Unknown error on BRPOPLPUSH command #2599

Closed
dg-korolev opened this issue Feb 15, 2024 · 16 comments
Labels
bug Something isn't working

Comments

@dg-korolev

dg-korolev commented Feb 15, 2024

Describe the bug
After replacing KeyDB with Dragonfly for use with Bull, we encountered the problem that Dragonfly crashed with the error:

dragonfly-1  | *** Check failure stack trace: ***
dragonfly-1  |     @     0x55fcbd81e9f3  google::LogMessage::SendToLog()
dragonfly-1  |     @     0x55fcbd8171b7  google::LogMessage::Flush()
dragonfly-1  |     @     0x55fcbd818b3f  google::LogMessageFatal::~LogMessageFatal()
dragonfly-1  |     @     0x55fcbd5edc8c  util::fb2::detail::FiberInterface::PullMyselfFromRemoteReadyQueue()
dragonfly-1  |     @     0x55fcbd5e6074  util::fb2::EventCount::wait_until()
dragonfly-1  |     @     0x55fcbd11ef14  dfly::Transaction::WaitOnWatch()
dragonfly-1  |     @     0x55fcbd0be494  dfly::(anonymous namespace)::BPopPusher::Run()
dragonfly-1  |     @     0x55fcbd0be8d6  dfly::(anonymous namespace)::BRPopLPush()
dragonfly-1  |     @     0x55fcbd10a3bf  dfly::CommandId::Invoke()
dragonfly-1  |     @     0x55fcbcf1afec  dfly::Service::InvokeCmd()
dragonfly-1  |     @     0x55fcbcf1fbb1  dfly::Service::DispatchCommand()
dragonfly-1  |     @     0x55fcbd1f178d  facade::Connection::DispatchCommand()
dragonfly-1  |     @     0x55fcbd1f19b4  facade::Connection::ParseRedis()
dragonfly-1  |     @     0x55fcbd1f3ddf  facade::Connection::IoLoop()
dragonfly-1  |     @     0x55fcbd1f42d3  facade::Connection::ConnectionFlow()
dragonfly-1  |     @     0x55fcbd1f52b6  facade::Connection::HandleRequests()
dragonfly-1  |     @     0x55fcbd5f92cd  util::ListenerInterface::RunSingleConnection()
dragonfly-1  |     @     0x55fcbd5f9717  _ZN5boost7context6detail11fiber_entryINS1_12fiber_recordINS0_5fiberENS0_21basic_fixedsize_stackINS0_12stack_traitsEEEZN4util3fb26detail15WorkerFiberImplIZNS8_17ListenerInterface13RunAcceptLoopEvEUlvE0_JEEC4IS7_EESt17basic_string_viewIcSt11char_traitsIcEERKNS0_12preallocatedEOT_OSD_EUlOS4_E_EEEEvNS1_10transfer_tE
dragonfly-1  |     @     0x55fcbd63bf2f  make_fcontext
dragonfly-1  | *** SIGABRT received at time=1707932504 on cpu 2 ***
dragonfly-1  | PC: @     0x7ff5b84bc00b  (unknown)  raise
dragonfly-1  | [failure_signal_handler.cc : 345] RAW: Signal 11 raised at PC=0x7ff5b849b941 while already in AbslFailureSignalHandler()
dragonfly-1  | *** SIGSEGV received at time=1707932504 on cpu 2 ***
dragonfly-1  | PC: @     0x7ff5b849b941  (unknown)  abort

This seems to be related to the BRPOPLPUSH command.
We encountered this error a couple of times a day.

Environment (please complete the following information):

  • OS: Debian 11 64bit
  • Kernel: 5.10.0-28-amd64 #1 SMP Debian 5.10.209-2 (2024-01-31) x86_64 GNU/Linux
  • Containerized:
    • Docker version 25.0.3, build 4debf41
    • Docker Compose version v2.24.5
    • docker-compose.yml:
version: '3.8'
services:
  dragonfly:
    restart: always
    image: 'ghcr.io/dragonflydb/dragonfly:v1.14.3-ubuntu'
    ulimits:
      memlock: -1
    ports:
      - "6379:6379"
    command: ['--default_lua_flags=allow-undeclared-keys', '--snapshot_cron=*/1 * * * *', '--conn_io_threads=4', '--conn_use_incoming_cpu=true']
    volumes:
      - dragonflydata:/data
    logging:
      driver: 'json-file'
      options:
          max-file: '2'
          max-size: 75m
volumes:
  dragonflydata:
  • Dragonfly Version: 1.14.3
  • Bull: 4.12.2
@dg-korolev dg-korolev added the bug Something isn't working label Feb 15, 2024
@romange
Collaborator

romange commented Feb 15, 2024

@dg-korolev Thank you for reporting this issue. Can you please provide the complete INFO log file for this crash?
(I suspect that this issue has already been fixed on master, but I just want to be sure.)

@dg-korolev
Author

@romange

dragonfly-1  | W20240214 17:41:00.342994     8 transaction.cc:58] TxQueue is too long. Tx count:97, armed:70, runnable:0, total locks: 45, contended locks: 45
dragonfly-1  | max contention score: 40960, lock: bull:suppliersChstoreQueue:wait, poll_executions:285272119 continuation_tx: RPOPLPUSH@144765476/2:7 (3482) 
dragonfly-1  | W20240214 17:41:00.348347     9 transaction.cc:58] TxQueue is too long. Tx count:97, armed:74, runnable:0, total locks: 38, contended locks: 38
dragonfly-1  | max contention score: 40960, lock: bull:categoriesChstoreQueue:active, poll_executions:266010538
dragonfly-1  | W20240214 17:41:08.364198    10 transaction.cc:58] TxQueue is too long. Tx count:97, armed:89, runnable:0, total locks: 49, contended locks: 47
dragonfly-1  | max contention score: 40960, lock: bull:suppliersChstoreQueue:active, poll_executions:286705670 continuation_tx: BRPOPLPUSH@144819184/2 (19818) 
dragonfly-1  | W20240214 17:41:10.404824     9 transaction.cc:58] TxQueue is too long. Tx count:97, armed:71, runnable:0, total locks: 38, contended locks: 38
dragonfly-1  | max contention score: 40960, lock: bull:brandsChstoreQueue:active, poll_executions:266114190 continuation_tx: HGETALL@144832080/1:11 (29404) 
dragonfly-1  | W20240214 17:41:10.405685    11 transaction.cc:58] TxQueue is too long. Tx count:97, armed:90, runnable:0, total locks: 46, contended locks: 46
dragonfly-1  | max contention score: 40960, lock: bull:brandsChstoreQueue:wait, poll_executions:280049284 continuation_tx: EXISTS@144832087/1:3 (25079) 
dragonfly-1  | W20240214 17:41:10.408584     8 transaction.cc:58] TxQueue is too long. Tx count:97, armed:76, runnable:0, total locks: 45, contended locks: 45
dragonfly-1  | max contention score: 40960, lock: bull:categoryPagesChstoreQueue:wait, poll_executions:285383820 continuation_tx: HSET@144832129/1:10 (60844) 
dragonfly-1  | W20240214 17:41:18.405524    10 transaction.cc:58] TxQueue is too long. Tx count:97, armed:87, runnable:0, total locks: 49, contended locks: 47
dragonfly-1  | max contention score: 40960, lock: bull:categoryPagesChstoreFailedQueue:active, poll_executions:286822921 continuation_tx: BRPOPLPUSH@144887341/2 (55992) 
dragonfly-1  | W20240214 17:41:20.462534     8 transaction.cc:58] TxQueue is too long. Tx count:97, armed:87, runnable:0, total locks: 45, contended locks: 45
dragonfly-1  | max contention score: 40960, lock: bull:categoryPagesChstoreQueue:wait, poll_executions:285493206
dragonfly-1  | W20240214 17:41:21.364646    11 transaction.cc:58] TxQueue is too long. Tx count:97, armed:80, runnable:0, total locks: 46, contended locks: 46
dragonfly-1  | max contention score: 40960, lock: bull:brandsChstoreQueue:wait, poll_executions:280173487 continuation_tx: SREM@144905785/1:4 (43283) 
dragonfly-1  | W20240214 17:41:21.376291     9 transaction.cc:58] TxQueue is too long. Tx count:97, armed:73, runnable:0, total locks: 38, contended locks: 38
dragonfly-1  | max contention score: 40704, lock: bull:brandsChstoreQueue:active, poll_executions:266226827
dragonfly-1  | W20240214 17:41:28.283685    10 transaction.cc:58] TxQueue is too long. Tx count:97, armed:68, runnable:0, total locks: 49, contended locks: 47
dragonfly-1  | max contention score: 40960, lock: bull:categoriesChstoreQueue:wait, poll_executions:286930644 continuation_tx: INCR@144947623/1:442 (17171) 
dragonfly-1  | W20240214 17:41:30.005920     8 transaction.cc:58] TxQueue is too long. Tx count:105, armed:74, runnable:0, total locks: 45, contended locks: 45
dragonfly-1  | max contention score: 40960, lock: bull:categoryPagesChstoreQueue:wait, poll_executions:285589463 continuation_tx: LPUSH@144954418/1:2400 (17271) 
dragonfly-1  | W20240214 17:41:31.147203    11 transaction.cc:58] TxQueue is too long. Tx count:97, armed:73, runnable:0, total locks: 45, contended locks: 45
dragonfly-1  | max contention score: 40960, lock: bull:categoryPagesChstoreFailedQueue:wait, poll_executions:280268870 continuation_tx: EVALSHA@144959834/0:1251 (17248) 
dragonfly-1  | W20240214 17:41:31.170527     9 transaction.cc:58] TxQueue is too long. Tx count:97, armed:68, runnable:0, total locks: 37, contended locks: 37
dragonfly-1  | max contention score: 40960, lock: bull:categoriesChstoreQueue:active, poll_executions:266327438 continuation_tx: LPUSH@144959834/1:1775 (17248)  armed
dragonfly-1  | W20240214 17:41:38.329814    10 transaction.cc:58] TxQueue is too long. Tx count:97, armed:75, runnable:0, total locks: 49, contended locks: 47
dragonfly-1  | max contention score: 40960, lock: bull:categoriesChstoreQueue:wait, poll_executions:287024067 continuation_tx: BRPOPLPUSH@144996110/2 (54095) 
dragonfly-1  | W20240214 17:41:40.390175     8 transaction.cc:58] TxQueue is too long. Tx count:97, armed:67, runnable:1, total locks: 47, contended locks: 45
dragonfly-1  | max contention score: 40960, lock: bull:brandsPagesChstoreQueue:active, poll_executions:285684801
dragonfly-1  | W20240214 17:41:41.407651     9 transaction.cc:58] TxQueue is too long. Tx count:97, armed:77, runnable:0, total locks: 37, contended locks: 37
dragonfly-1  | max contention score: 40704, lock: bull:brandsChstoreQueue:active, poll_executions:266425882 continuation_tx: ZREM@145012965/1:9 (49176)  armed
dragonfly-1  | W20240214 17:41:41.410605    11 transaction.cc:58] TxQueue is too long. Tx count:97, armed:74, runnable:0, total locks: 45, contended locks: 45
dragonfly-1  | max contention score: 40704, lock: bull:brandsChstoreQueue:wait, poll_executions:280367270 continuation_tx: LPUSH@145012980/1:4 (41059) 
dragonfly-1  | F20240214 17:41:43.991547    10 fiber_interface.cc:302] Check failed: !IsScheduledRemotely() 
dragonfly-1  | *** Check failure stack trace: ***
dragonfly-1  |     @     0x55fcbd81e9f3  google::LogMessage::SendToLog()
dragonfly-1  |     @     0x55fcbd8171b7  google::LogMessage::Flush()
dragonfly-1  |     @     0x55fcbd818b3f  google::LogMessageFatal::~LogMessageFatal()
dragonfly-1  |     @     0x55fcbd5edc8c  util::fb2::detail::FiberInterface::PullMyselfFromRemoteReadyQueue()
dragonfly-1  |     @     0x55fcbd5e6074  util::fb2::EventCount::wait_until()
dragonfly-1  |     @     0x55fcbd11ef14  dfly::Transaction::WaitOnWatch()
dragonfly-1  |     @     0x55fcbd0be494  dfly::(anonymous namespace)::BPopPusher::Run()
dragonfly-1  |     @     0x55fcbd0be8d6  dfly::(anonymous namespace)::BRPopLPush()
dragonfly-1  |     @     0x55fcbd10a3bf  dfly::CommandId::Invoke()
dragonfly-1  |     @     0x55fcbcf1afec  dfly::Service::InvokeCmd()
dragonfly-1  |     @     0x55fcbcf1fbb1  dfly::Service::DispatchCommand()
dragonfly-1  |     @     0x55fcbd1f178d  facade::Connection::DispatchCommand()
dragonfly-1  |     @     0x55fcbd1f19b4  facade::Connection::ParseRedis()
dragonfly-1  |     @     0x55fcbd1f3ddf  facade::Connection::IoLoop()
dragonfly-1  |     @     0x55fcbd1f42d3  facade::Connection::ConnectionFlow()
dragonfly-1  |     @     0x55fcbd1f52b6  facade::Connection::HandleRequests()
dragonfly-1  |     @     0x55fcbd5f92cd  util::ListenerInterface::RunSingleConnection()
dragonfly-1  |     @     0x55fcbd5f9717  _ZN5boost7context6detail11fiber_entryINS1_12fiber_recordINS0_5fiberENS0_21basic_fixedsize_stackINS0_12stack_traitsEEEZN4util3fb26detail15WorkerFiberImplIZNS8_17ListenerInterface13RunAcceptLoopEvEUlvE0_JEEC4IS7_EESt17basic_string_viewIcSt11char_traitsIcEERKNS0_12preallocatedEOT_OSD_EUlOS4_E_EEEEvNS1_10transfer_tE
dragonfly-1  |     @     0x55fcbd63bf2f  make_fcontext
dragonfly-1  | *** SIGABRT received at time=1707932504 on cpu 2 ***
dragonfly-1  | PC: @     0x7ff5b84bc00b  (unknown)  raise
dragonfly-1  | [failure_signal_handler.cc : 345] RAW: Signal 11 raised at PC=0x7ff5b849b941 while already in AbslFailureSignalHandler()
dragonfly-1  | *** SIGSEGV received at time=1707932504 on cpu 2 ***
dragonfly-1  | PC: @     0x7ff5b849b941  (unknown)  abort

@romange
Collaborator

romange commented Feb 15, 2024

@dg-korolev it's not the full log, but it confirms the bug I suspected.

  1. Can you please send the FIRST 50 lines of the INFO log? Alternatively, send me the output of "redis-cli INFO SERVER"? Both can be taken from a currently running server on the same host.
  2. What is the reason you use Debian 11 with kernel 5.10? Was it a deliberate decision?
  3. How many CPUs do you use for Dragonfly?

@dg-korolev
Author

@romange

  1. INFO SERVER:

Server

redis_version:6.2.11
dragonfly_version:df-v1.14.3
redis_mode:standalone
arch_bits:64
os:Linux 5.10.0-28-amd64 x86_64
multiplexing_api:epoll
tcp_port:6379
thread_count:8
uptime_in_seconds:1668
uptime_in_days:0

Clients

connected_clients:16656
client_read_buffer_bytes:52408832
blocked_clients:0
dispatch_queue_entries:2408

Memory

used_memory:4522561576
used_memory_human:4.21GiB
used_memory_peak:4603879104
used_memory_peak_human:4.29GiB
used_memory_rss:6268502016
used_memory_rss_human:5.84GiB
used_memory_peak_rss:6268502016
comitted_memory:6226051072
maxmemory:71417606963
maxmemory_human:66.51GiB
object_used_memory:4318579342
type_used_memory_STRING:75856496
type_used_memory_LIST:118836672
type_used_memory_SET:2694
type_used_memory_ZSET:24265200
type_used_memory_HASH:4099618280
table_used_memory:264757120
num_buckets:458752
num_entries:4474551
inline_keys:1
listpack_blobs:18446744073709551529
listpack_bytes:18446744073709550868
small_string_bytes:124367216
pipeline_cache_bytes:277009754
dispatch_queue_bytes:232297038
dispatch_queue_subscriber_bytes:0
dispatch_queue_peak_bytes:467505815
client_read_buffer_peak_bytes:52408832
cache_mode:store
maxmemory_policy:noeviction
save_buffer_bytes:28117825

Stats

total_connections_received:19714
total_commands_processed:18014417
instantaneous_ops_per_sec:9191
total_pipelined_commands:1573544
pipelined_latency_usec:3507339569817
total_net_input_bytes:7097805488
total_net_output_bytes:10149873527
instantaneous_input_kbps:-1
instantaneous_output_kbps:-1
rejected_connections:-1
expired_keys:1339
evicted_keys:0
hard_evictions:0
garbage_checked:5
garbage_collected:0
bump_ups:0
stash_unloaded:0
oom_rejections:0
traverse_ttl_sec:0
delete_ttl_sec:0
keyspace_hits:3366322
keyspace_misses:9709701
keyspace_mutations:10052487
total_reads_processed:6496222
total_writes_processed:92800706
defrag_attempt_total:0
defrag_realloc_total:0
defrag_task_invocation_total:0
reply_count:92800706
reply_latency_usec:1347405340
reply_batch_count:9647444
reply_batch_latency_usec:1026755

Replication

role:master
connected_slaves:0
master_replid:4b66c605011f41eb7a85ed69bc49c5335972eb83

Modules

module:name=ReJSON,ver=20000,api=1,filters=0,usedby=[search],using=[],options=[handle-io-errors]
module:name=search,ver=20000,api=1,filters=0,usedby=[],using=[ReJSON],options=[handle-io-errors]

Keyspace

db0:keys=4474551,expires=2050,avg_ttl=-1

Cpu

used_cpu_sys:3048.633788
used_cpu_user:2258.347336
used_cpu_sys_children:0.0
used_cpu_user_children:0.2991
used_cpu_sys_main_thread:474.855243
used_cpu_user_main_thread:350.341076

Cluster

cluster_enabled:0

  2. That was a DevOps decision. Which version do you recommend?

  3. We use 8 CPU cores, but only 3 cores are loaded at a time.

@romange
Collaborator

romange commented Feb 15, 2024

  1. I just kicked off our weekly build: https://github.com/dragonflydb/dragonfly/actions/runs/7919426799. Once it finishes, you will be able to use this image; it should solve the crash problem.
  2. You indeed limit your IO threads to 3, but Dragonfly still uses 8 CPUs for sharding its data.
  3. Having said that, I suspect you have configured your BullMQ settings in a suboptimal way. How many BullMQ queues do you use?

@dg-korolev
Author

dg-korolev commented Feb 15, 2024

  1. 👍
  2. Could you please describe this in more detail? I don't quite understand.
  3. 100-120

@romange
Collaborator

romange commented Feb 15, 2024

  1. Please refer to https://www.dragonflydb.io/docs/integrations/bullmq#advanced--optimized-configurations
    Basically, you will get more performance if you use "{}" hashtags in your queue names and run Dragonfly with the --lock_on_hashtags flag.
  2. If you want to limit Dragonfly's thread count, please use --proactor_threads=3 instead of conn_io_threads.
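To make the hashtag advice above concrete, here is an illustrative sketch of Redis-style hashtag semantics (which, per the Dragonfly BullMQ docs, --lock_on_hashtags follows): keys sharing the same `{...}` substring are locked as one unit, so wrapping a queue name in braces keeps all of that queue's keys (`bull:{name}:wait`, `bull:{name}:active`, ...) under a single lock tag. The `hashtag` and `taggedName` helpers below are hypothetical, not part of Bull or Dragonfly:

```javascript
// Extract the hashtag from a key: the substring between the first '{'
// and the next '}', provided both exist and the tag is non-empty
// (mirroring Redis cluster hashtag rules).
function hashtag(key) {
  const open = key.indexOf('{');
  if (open === -1) return null;
  const close = key.indexOf('}', open + 1);
  if (close === -1 || close === open + 1) return null;
  return key.slice(open + 1, close);
}

// Hypothetical helper: wrap a plain queue name in braces before passing
// it to Bull, e.g. new Queue(taggedName('suppliersChstoreQueue'), redisUrl).
function taggedName(name) {
  return `{${name}}`;
}

// Both keys of the same queue now share one hashtag, so Dragonfly can
// lock them together when started with --lock_on_hashtags:
console.log(hashtag(`bull:${taggedName('suppliersChstoreQueue')}:wait`));   // → suppliersChstoreQueue
console.log(hashtag(`bull:${taggedName('suppliersChstoreQueue')}:active`)); // → suppliersChstoreQueue
```

Without the braces, `bull:name:wait` and `bull:name:active` are distinct lock keys, which matches the contended locks visible in the TxQueue warnings above.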

@dg-korolev
Author

Thanks for the help and the recommendations.
We'll wait for the next release and then check whether it resolves the crash.

@romange
Collaborator

romange commented Feb 18, 2024

@dg-korolev have you used ghcr.io/dragonflydb/dragonfly-weekly:1e06c63727ffe84d2b4f9c36c6d5122082f52ae0-ubuntu ?
Did it help?

@gyfis

gyfis commented Feb 20, 2024

Hi @romange, we've migrated from Redis to Dragonfly on Ubuntu 20.04, using the latest .deb package available for download, and we're seeing the same crash. We had to revert back to Redis since this made Dragonfly unusable (it crashed every <5 minutes).

Here's the crash as well as the `INFO` output:
Feb 20 06:44:44 redis1 dragonfly[829850]: F20240220 06:44:44.012849 829863 fiber_interface.cc:302] Check failed: !IsScheduledRemotely()
Feb 20 06:44:44 redis1 dragonfly[829850]: *** Check failure stack trace: ***
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb1a2a9f3  google::LogMessage::SendToLog()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb1a231b7  google::LogMessage::Flush()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb1a24b3f  google::LogMessageFatal::~LogMessageFatal()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb17f9c8c  util::fb2::detail::FiberInterface::PullMyselfFromRemoteReadyQueue()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb17f2074  util::fb2::EventCount::wait_until()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb132af14  dfly::Transaction::WaitOnWatch()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb121ce89  dfly::container_utils::RunCbOnFirstNonEmptyBlocking()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb12cc225  dfly::ListFamily::BPopGeneric()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb13163bf  dfly::CommandId::Invoke()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb1126fec  dfly::Service::InvokeCmd()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb112bbb1  dfly::Service::DispatchCommand()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb13fd78d  facade::Connection::DispatchCommand()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb13fd9b4  facade::Connection::ParseRedis()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb13ffddf  facade::Connection::IoLoop()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb14002d3  facade::Connection::ConnectionFlow()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb14012b6  facade::Connection::HandleRequests()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb18052cd  util::ListenerInterface::RunSingleConnection()
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb1805717  _ZN5boost7context6detail11fiber_entryINS1_12fiber_recordINS0_5fiberENS0_21basic_fixedsize_stackINS0_12stack_traitsEEEZN4util3fb26detail15WorkerFiberImplIZNS8_17ListenerInterface13RunAcceptLoopEvEUlvE0_JEEC4IS7_EESt17basic_string_viewIcSt11char_traitsIcEERKNS0_12preallocatedEOT_OSD_EUlOS4_E_EEEEvNS1_10transfer_tE
Feb 20 06:44:44 redis1 dragonfly[829850]:     @     0x557eb1847f2f  make_fcontext
Feb 20 06:44:44 redis1 dragonfly[829850]: *** SIGABRT received at time=1708407884 on cpu 9 ***
Feb 20 06:44:44 redis1 dragonfly[829850]: PC: @     0x7fcb9a93500b  (unknown)  raise

Server

redis_version:6.2.11
dragonfly_version:df-v1.14.3
redis_mode:standalone
arch_bits:64
os:Linux 5.4.0-99-generic x86_64
multiplexing_api:epoll
tcp_port:6389
thread_count:64
uptime_in_seconds:61
uptime_in_days:0

Clients

connected_clients:41132
client_read_buffer_bytes:51359744
blocked_clients:6719
dispatch_queue_entries:18

Memory

used_memory:1244046280
used_memory_human:1.16GiB
used_memory_peak:1315183480
used_memory_peak_human:1.22GiB
used_memory_rss:2161352704
used_memory_rss_human:2.01GiB
used_memory_peak_rss:2161352704
comitted_memory:2022965312
maxmemory:53687091200
maxmemory_human:50.00GiB
object_used_memory:1110317808
type_used_memory_STRING:966984216
type_used_memory_LIST:12387320
type_used_memory_SET:61272
type_used_memory_ZSET:93610176
type_used_memory_HASH:37274824
table_used_memory:19548160
num_buckets:28672
num_entries:87475
inline_keys:503
listpack_blobs:0
listpack_bytes:0
small_string_bytes:4117224
pipeline_cache_bytes:49757163
dispatch_queue_bytes:54629
dispatch_queue_subscriber_bytes:0
dispatch_queue_peak_bytes:54629
client_read_buffer_peak_bytes:51359744
cache_mode:store
maxmemory_policy:noeviction

Stats

total_connections_received:105295
total_commands_processed:3694722
instantaneous_ops_per_sec:76771
total_pipelined_commands:659333
pipelined_latency_usec:290571404
total_net_input_bytes:545839689
total_net_output_bytes:401893393
instantaneous_input_kbps:-1
instantaneous_output_kbps:-1
rejected_connections:-1
expired_keys:537
evicted_keys:0
hard_evictions:0
garbage_checked:0
garbage_collected:0
bump_ups:0
stash_unloaded:0
oom_rejections:0
traverse_ttl_sec:58307
delete_ttl_sec:0
keyspace_hits:1188811
keyspace_misses:88283587
keyspace_mutations:2200451
total_reads_processed:2086915
total_writes_processed:1921279
defrag_attempt_total:0
defrag_realloc_total:0
defrag_task_invocation_total:0
reply_count:1921279
reply_latency_usec:15978248
reply_batch_count:1318278
reply_batch_latency_usec:7744

Replication

role:master
connected_slaves:0
master_replid:f1ed4384ada0a16aaebd1427c0b67110da1f47eb

Modules

module:name=ReJSON,ver=20000,api=1,filters=0,usedby=[search],using=[],options=[handle-io-errors]
module:name=search,ver=20000,api=1,filters=0,usedby=[],using=[ReJSON],options=[handle-io-errors]

Keyspace

db0:keys=87475,expires=70794,avg_ttl=-1

Cpu

used_cpu_sys:140.470046
used_cpu_user:91.801995
used_cpu_sys_children:0.0
used_cpu_user_children:0.0
used_cpu_sys_main_thread:3.189919
used_cpu_user_main_thread:4.648046

Cluster

cluster_enabled:0

@romange
Collaborator

romange commented Feb 20, 2024

@gyfis is it possible for you to run the container ghcr.io/dragonflydb/dragonfly-weekly:1e06c63727ffe84d2b4f9c36c6d5122082f52ae0-ubuntu instead of the native binary as a workaround until we release 1.15?

@gyfis

gyfis commented Feb 20, 2024

@romange Thank you for the quick reply! Sorry, that's not something I can attempt at the moment. This migration is quite expensive for us operationally, so we'll wait this out and try again after 1.15 ships.

@romange
Collaborator

romange commented Feb 20, 2024

I thought you had already migrated and needed a solution with Dragonfly ASAP. In that case, I suggest waiting for a proper release. Also, after a short sync within the team, we realized we should actually be able to fix this bug and release 1.14.4 this week. Sorry for all the trouble.

@gyfis

gyfis commented Feb 20, 2024

Appreciate the thoughtfulness! We reverted after seeing the instability. We'll look out for the release, and will ping back if these things re-occur in one of the newer versions 🙏

@dg-korolev
Author

@romange Hi, we ran the new container and have not seen the crash since yesterday morning.

adiholden added a commit that referenced this issue Feb 20, 2024
Signed-off-by: adi_holden <adi@dragonflydb.io>
@romange
Collaborator

romange commented Feb 20, 2024

Fixed in v1.14.4
