
[DocDB] Fatal logs on toggle on/off packed column under workload and nemesis #19447

Open
pilshchikov opened this issue Oct 9, 2023 · 3 comments
Assignees
rthallamko3
Labels
area/docdb (YugabyteDB core features), kind/bug (This issue is a bug), priority/medium (Medium priority issue), qa_stress (Bugs identified via Stress automation)

Comments

@pilshchikov
Contributor

pilshchikov commented Oct 9, 2023

Jira Link: DB-8238

Description

Case:

  1. Run an RF=3, 3-node cluster without packed columns.
  2. Load 100k values into a table.
  3. In the background, run the yb-admin flush_table and compact_table commands every 15 minutes.
  4. In the background, start restarting the master process.
  5. Enable packed columns via YBA with a rolling restart.
  6. Add a node via YBA.
  7. Wait for some time.
  8. Disable packed columns.
  9. Disable packed rows.
  10. Enable packed rows.
  11. In the background, change the nemesis from restarting the master process to restarting the tserver processes.
  12. Disable packed rows.

A FATAL log was found:
Running duration (h:mm:ss): 0:02:09
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
F1007 16:39:07.497722 954679 mvcc.cc:395] T 35ef051e15e047b199c820302dc42998 P daf8ed30f5e54b3a9911994285a554ef: T 35ef051e15e047b199c820302dc42998 P daf8ed30f5e54b3a9911994285a554ef: Recent 32 MVCC operations:
1. Replicated { ht: { physical: 1696696743032008 } op_id: 15.686817 }
2. UpdatePropagatedSafeTimeOnLeader { ht_lease: { time: { physical: 1696696743038463 } lease: { physical: 1696696745036316 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
3. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
4. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
5. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696743038476 } lease: { physical: 1696696745036316 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
6. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696743038476 logical: 1 } lease: { physical: 1696696745036316 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
7. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
8. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696743418532 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
9. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
10. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696743553805 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
11. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
12. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696746602110 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
13. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
14. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696746774078 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
15. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
16. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696747093531 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
17. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
18. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696747200604 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
19. Replicated { ht: { physical: 1696696743035532 } op_id: 15.686818 }
20. LastReplicatedHybridTime { last_replicated: { physical: 1696696743035532 } }
21. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696747484814 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696745038476 } }
22. LastReplicatedHybridTime { last_replicated: { physical: 1696696743035532 } }
23. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696747484820 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696745038476 } }
24. AddLeaderPending { ht: { physical: 1696696747484845 } op_id: 15.686819 }
25. AddLeaderPending { ht: { physical: 1696696747484886 } op_id: 15.686820 }
26. AddLeaderPending { ht: { physical: 1696696747484887 logical: 1 } op_id: 15.686821 }
27. AddLeaderPending { ht: { physical: 1696696747484889 logical: 1 } op_id: 15.686822 }
28. UpdatePropagatedSafeTimeOnLeader { ht_lease: { time: { physical: 1696696747485783 } lease: { physical: 1696696748602112 } } safe_time: { physical: 1696696747484844 logical: 4095 } }
29. Aborted { ht: { physical: 1696696747484889 logical: 1 } op_id: 15.686822 }
30. Aborted { ht: { physical: 1696696747484887 logical: 1 } op_id: 15.686821 }
31. Aborted { ht: { physical: 1696696747484886 } op_id: 15.686820 }
32. Aborted { ht: { physical: 1696696747484845 } op_id: 15.686819 }
New operation's hybrid time too low: { physical: 1696696747406942 }, op id: 17.686821
  max_safe_time_returned_with_lease_={ safe_time: { physical: 1696696747484844 logical: 4095 } source: kNextInQueue }
  !!! ht <= max_safe_time_returned_with_lease_.safe_time=1
  static_cast<int64_t>(ht.ToUint64() - max_safe_time_returned_with_lease_.safe_time.ToUint64())=-319090687
  ht.PhysicalDiff(max_safe_time_returned_with_lease_.safe_time)=-77902
  
  max_safe_time_returned_without_lease_={ safe_time: <min> source: kUnknown }
  ht <= max_safe_time_returned_without_lease_.safe_time=0
  static_cast<int64_t>(ht.ToUint64() - max_safe_time_returned_without_lease_.safe_time.ToUint64())=6949669877378834432
  ht.PhysicalDiff(max_safe_time_returned_without_lease_.safe_time)=1696696747406942
  
  max_safe_time_returned_for_follower_={ safe_time: { physical: 1696696622593121 } source: kLastReplicated }
  ht <= max_safe_time_returned_for_follower_.safe_time=0
  static_cast<int64_t>(ht.ToUint64() - max_safe_time_returned_for_follower_.safe_time.ToUint64())=511237410816
  ht.PhysicalDiff(max_safe_time_returned_for_follower_.safe_time)=124813821
  
  last_replicated_={ physical: 1696696743035532 }
  ht <= last_replicated_=0
  static_cast<int64_t>(ht.ToUint64() - last_replicated_.ToUint64())=17905295360
  ht.PhysicalDiff(last_replicated_)=4371410
  
  last_ht_in_queue=<min>
  ht <= last_ht_in_queue=0
  static_cast<int64_t>(ht.ToUint64() - last_ht_in_queue.ToUint64())=6949669877378834432
  ht.PhysicalDiff(last_ht_in_queue)=1696696747406942
  
  propagated_safe_time_={ physical: 1696696747484844 logical: 4095 }
  !!! ht <= propagated_safe_time_=1
  static_cast<int64_t>(ht.ToUint64() - propagated_safe_time_.ToUint64())=-319090687
  ht.PhysicalDiff(propagated_safe_time_)=-77902
  
  queue_.size()=0
  queue_=[]

All the logs: [image attachment]

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@shamanthchandra-yb

Also seen in a run on 2.25.0.0-b365 with ysql_disable_server_file_access = true. Please find the run report in the JIRA comments.

@Huqicheng
Contributor

This seems to be a Raft issue.

24. AddLeaderPending { ht: { physical: 1696696747484845 } op_id: 15.686819 }
25. AddLeaderPending { ht: { physical: 1696696747484886 } op_id: 15.686820 }
26. AddLeaderPending { ht: { physical: 1696696747484887 logical: 1 } op_id: 15.686821 }
27. AddLeaderPending { ht: { physical: 1696696747484889 logical: 1 } op_id: 15.686822 }
28. UpdatePropagatedSafeTimeOnLeader { ht_lease: { time: { physical: 1696696747485783 } lease: { physical: 1696696748602112 } } safe_time: { physical: 1696696747484844 logical: 4095 } }
29. Aborted { ht: { physical: 1696696747484889 logical: 1 } op_id: 15.686822 }
30. Aborted { ht: { physical: 1696696747484887 logical: 1 } op_id: 15.686821 }
31. Aborted { ht: { physical: 1696696747484886 } op_id: 15.686820 }
32. Aborted { ht: { physical: 1696696747484845 } op_id: 15.686819 }

This peer is the leader of term 15, and its majority-replicated ht_lease appears to be 1696696747485783. However, the leader of term 17 was able to write ops at 1696696747406942, which breaks the leader lease assumption.
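For illustration, here is a minimal standalone sketch (simplified, hypothetical types; not the YugabyteDB implementation) of the invariant that the mvcc.cc FATAL enforces: a new operation's hybrid time must be strictly greater than any safe time already returned under a leader lease. Plugging in the values from the FATAL above reproduces the violation:

#include <cstdint>
#include <iostream>

// Hypothetical, simplified stand-in for a hybrid time (physical microseconds + logical part).
struct SimpleHybridTime {
  int64_t physical_us;
  int64_t logical;
};

bool LessOrEqual(const SimpleHybridTime& a, const SimpleHybridTime& b) {
  return a.physical_us < b.physical_us ||
         (a.physical_us == b.physical_us && a.logical <= b.logical);
}

int main() {
  // Values taken from the FATAL message in the issue description.
  // Safe time already returned by the term-15 leader under its ht lease:
  const SimpleHybridTime max_safe_time_returned_with_lease{1696696747484844, 4095};
  // Hybrid time of the new operation from the term-17 leader (op id 17.686821):
  const SimpleHybridTime new_op_ht{1696696747406942, 0};

  std::cout << "physical diff (us): "
            << new_op_ht.physical_us - max_safe_time_returned_with_lease.physical_us
            << "\n";  // -77902, matching ht.PhysicalDiff(...) in the log

  // Invariant: a new operation's hybrid time must be above every safe time that
  // was already handed out under a lease; otherwise reads done at that safe time
  // could miss the new write. Here the invariant does not hold, so the real code
  // logs "New operation's hybrid time too low" and crashes.
  if (LessOrEqual(new_op_ht, max_safe_time_returned_with_lease)) {
    std::cout << "FATAL: New operation's hybrid time too low\n";
  }
  return 0;
}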

@Huqicheng
Contributor

Huqicheng commented Nov 29, 2024

Root cause:

When a new leader is elected and replicates its NoOp, it also replicates its new leader lease to the followers, even when the old leader's lease has not yet expired. The lease-update path quoted below stores the new majority-replicated lease expirations and, when the reset flags are set, clears the old leader's leases:

  majority_replicated_lease_expiration_ = majority_replicated_data.leader_lease_expiration;
  majority_replicated_ht_lease_expiration_.store(majority_replicated_data.ht_lease_expiration,
                                                 std::memory_order_release);
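  // (Editorial note) When the reset flags checked below are set, the old
  // leader's regular and hybrid-time leases are cleared immediately, without
  // waiting for them to expire.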

  if (flags.Test(SetMajorityReplicatedLeaseExpirationFlag::kResetOldLeaderLease)) {
    LOG_WITH_PREFIX(INFO)
        << "Revoked old leader " << old_leader_lease_.holder_uuid << " lease: "
        << MonoDelta(old_leader_lease_.expiration - CoarseMonoClock::now());
    old_leader_lease_.Reset();
  }

  if (flags.Test(SetMajorityReplicatedLeaseExpirationFlag::kResetOldLeaderHtLease)) {
    LOG_WITH_PREFIX(INFO)
        << "Revoked old leader " << old_leader_ht_lease_.holder_uuid << " ht lease: "
        << HybridTime::FromMicros(old_leader_ht_lease_.expiration);
    old_leader_ht_lease_.Reset();
  }

Assume TS-1 is the leader of term 1, and TS-2 is then elected leader of term 2 by TS-2 and TS-3; TS-2 replicates its NoOp and its own leader lease expiration to TS-3.
This leads to incorrect behaviour when leader lease revocation is enabled:
TS-1 has its own leader lease expiring at T1.
TS-2 has its own leader lease expiring at T2 and still tracks the old leader lease expiring at T1.
TS-3 only tracks the old leader lease expiring at T2.
(where T1 < T2)

If the load balancer has TS-2 transfer leadership to TS-3 at some time earlier than T1, TS-3 can immediately revoke the lease expiring at T2 (the one from TS-2) and start writing new data, even though TS-1's lease at T1 has not expired. This can cause the "hybrid time too low" issue. A timeline sketch follows.
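To make the timeline concrete, here is a minimal runnable sketch (illustrative values and variable names only; not YugabyteDB code), assuming the single old-leader-lease slot described above:

#include <cstdint>
#include <iostream>

int main() {
  // Illustrative lease expirations (arbitrary time units), with T1 < T2.
  const int64_t T1 = 1000;  // TS-1's own leader ht lease expiration (term 1)
  const int64_t T2 = 1500;  // TS-2's own leader ht lease expiration (term 2)

  // What each peer tracks as the "old leader" ht lease once TS-2 (term 2)
  // has replicated its NoOp together with its own lease expiration:
  const int64_t ts2_old_leader_lease = T1;  // TS-2 still remembers TS-1's lease
  const int64_t ts3_old_leader_lease = T2;  // TS-3 only ever learned TS-2's lease

  // The load balancer makes TS-2 transfer leadership to TS-3 before T1.
  const int64_t transfer_time = 800;

  // On the transfer, TS-2 consents to revoking its own lease (T2), so TS-3
  // clears the only old-leader lease it tracks and starts assigning hybrid
  // times immediately -- even though TS-1's lease (T1) is still active and
  // safe times may already have been returned under it.
  const bool ts1_lease_still_active = transfer_time < T1;
  const bool ts3_waits_for_t1 = (ts3_old_leader_lease == T1);  // false here

  if (ts1_lease_still_active && !ts3_waits_for_t1) {
    std::cout << "TS-3 writes before T1=" << T1
              << " (it only knew about, and revoked, T2=" << T2 << ");\n"
              << "a new op can get a hybrid time below a safe time already\n"
              << "returned under the old lease -> 'hybrid time too low'.\n";
  }

  (void)ts2_old_leader_lease;  // tracked correctly on TS-2, but irrelevant once
                               // leadership moves to TS-3
  return 0;
}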

Solution

TS-2 shouldn't replicate its own leader lease until the old leader lease it tracks has expired or has been correctly revoked.
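A minimal sketch of that rule, with hypothetical names and simplified state (not the actual fix): the new leader withholds its own lease until the old leader lease it tracks has expired or been revoked:

#include <cstdint>
#include <iostream>

// Hypothetical, simplified bookkeeping on the new leader (e.g. TS-2); names do
// not correspond to the YugabyteDB sources.
struct LeaseInfo {
  int64_t expiration = 0;  // 0 means "no lease tracked"
  bool revoked = false;
};

struct NewLeaderState {
  LeaseInfo old_leader_ht_lease;  // the previous leader's lease still tracked here

  // Proposed guard: do not start replicating our own leader lease to followers
  // until the old leader's lease has expired or been revoked with its consent.
  bool CanReplicateOwnLease(int64_t now) const {
    return old_leader_ht_lease.expiration == 0 ||
           old_leader_ht_lease.revoked ||
           old_leader_ht_lease.expiration <= now;
  }
};

int main() {
  NewLeaderState ts2;
  ts2.old_leader_ht_lease = {/*expiration=*/1000, /*revoked=*/false};  // TS-1's lease (T1)

  // At now=800 (< T1), TS-2 keeps withholding its own lease, so followers such
  // as TS-3 keep tracking T1 instead of overwriting it with T2.
  std::cout << (ts2.CanReplicateOwnLease(/*now=*/800)
                    ? "replicate own lease"
                    : "hold off: old leader lease still active")
            << "\n";
  return 0;
}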

cc @spolitov @rthallamko3

rthallamko3 assigned rthallamko3 and unassigned Huqicheng on Dec 11, 2024