
[DocDB] Fatal logs on toggle on/off packed column under workload and nemesis #19447

Open
pilshchikov opened this issue Oct 9, 2023 · 3 comments
Assignees
rthallamko3
Labels
area/docdb (YugabyteDB core features), kind/bug (This issue is a bug), priority/medium (Medium priority issue), qa_stress (Bugs identified via Stress automation)

Comments

@pilshchikov
Contributor

pilshchikov commented Oct 9, 2023

Jira Link: DB-8238

Description

Case:

  1. Run an RF=3, 3-node cluster without packed columns.
  2. Load 100k values into a table.
  3. In the background, run the yb-admin flush_table and compact_table commands every 15 minutes.
  4. In the background, start restarting the master process.
  5. Enable packed columns via YBA with a rolling restart.
  6. Add a node via YBA.
  7. Wait for some time.
  8. Disable packed columns.
  9. Disable packed rows.
  10. Enable packed rows.
  11. In the background, change the nemesis from restarting the master process to restarting the tserver processes.
  12. Disable packed rows.

A FATAL log was found:
Running duration (h:mm:ss): 0:02:09
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
F1007 16:39:07.497722 954679 mvcc.cc:395] T 35ef051e15e047b199c820302dc42998 P daf8ed30f5e54b3a9911994285a554ef: T 35ef051e15e047b199c820302dc42998 P daf8ed30f5e54b3a9911994285a554ef: Recent 32 MVCC operations:
1. Replicated { ht: { physical: 1696696743032008 } op_id: 15.686817 }
2. UpdatePropagatedSafeTimeOnLeader { ht_lease: { time: { physical: 1696696743038463 } lease: { physical: 1696696745036316 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
3. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
4. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
5. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696743038476 } lease: { physical: 1696696745036316 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
6. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696743038476 logical: 1 } lease: { physical: 1696696745036316 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
7. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
8. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696743418532 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
9. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
10. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696743553805 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
11. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
12. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696746602110 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
13. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
14. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696746774078 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
15. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
16. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696747093531 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
17. LastReplicatedHybridTime { last_replicated: { physical: 1696696743032008 } }
18. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696747200604 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696743035531 logical: 4095 } }
19. Replicated { ht: { physical: 1696696743035532 } op_id: 15.686818 }
20. LastReplicatedHybridTime { last_replicated: { physical: 1696696743035532 } }
21. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696747484814 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696745038476 } }
22. LastReplicatedHybridTime { last_replicated: { physical: 1696696743035532 } }
23. SafeTime { min_allowed: <min> deadline: +inf ht_lease: { time: { physical: 1696696747484820 } lease: { physical: 1696696745038476 } } safe_time: { physical: 1696696745038476 } }
24. AddLeaderPending { ht: { physical: 1696696747484845 } op_id: 15.686819 }
25. AddLeaderPending { ht: { physical: 1696696747484886 } op_id: 15.686820 }
26. AddLeaderPending { ht: { physical: 1696696747484887 logical: 1 } op_id: 15.686821 }
27. AddLeaderPending { ht: { physical: 1696696747484889 logical: 1 } op_id: 15.686822 }
28. UpdatePropagatedSafeTimeOnLeader { ht_lease: { time: { physical: 1696696747485783 } lease: { physical: 1696696748602112 } } safe_time: { physical: 1696696747484844 logical: 4095 } }
29. Aborted { ht: { physical: 1696696747484889 logical: 1 } op_id: 15.686822 }
30. Aborted { ht: { physical: 1696696747484887 logical: 1 } op_id: 15.686821 }
31. Aborted { ht: { physical: 1696696747484886 } op_id: 15.686820 }
32. Aborted { ht: { physical: 1696696747484845 } op_id: 15.686819 }
New operation's hybrid time too low: { physical: 1696696747406942 }, op id: 17.686821
  max_safe_time_returned_with_lease_={ safe_time: { physical: 1696696747484844 logical: 4095 } source: kNextInQueue }
  !!! ht <= max_safe_time_returned_with_lease_.safe_time=1
  static_cast<int64_t>(ht.ToUint64() - max_safe_time_returned_with_lease_.safe_time.ToUint64())=-319090687
  ht.PhysicalDiff(max_safe_time_returned_with_lease_.safe_time)=-77902
  
  max_safe_time_returned_without_lease_={ safe_time: <min> source: kUnknown }
  ht <= max_safe_time_returned_without_lease_.safe_time=0
  static_cast<int64_t>(ht.ToUint64() - max_safe_time_returned_without_lease_.safe_time.ToUint64())=6949669877378834432
  ht.PhysicalDiff(max_safe_time_returned_without_lease_.safe_time)=1696696747406942
  
  max_safe_time_returned_for_follower_={ safe_time: { physical: 1696696622593121 } source: kLastReplicated }
  ht <= max_safe_time_returned_for_follower_.safe_time=0
  static_cast<int64_t>(ht.ToUint64() - max_safe_time_returned_for_follower_.safe_time.ToUint64())=511237410816
  ht.PhysicalDiff(max_safe_time_returned_for_follower_.safe_time)=124813821
  
  last_replicated_={ physical: 1696696743035532 }
  ht <= last_replicated_=0
  static_cast<int64_t>(ht.ToUint64() - last_replicated_.ToUint64())=17905295360
  ht.PhysicalDiff(last_replicated_)=4371410
  
  last_ht_in_queue=<min>
  ht <= last_ht_in_queue=0
  static_cast<int64_t>(ht.ToUint64() - last_ht_in_queue.ToUint64())=6949669877378834432
  ht.PhysicalDiff(last_ht_in_queue)=1696696747406942
  
  propagated_safe_time_={ physical: 1696696747484844 logical: 4095 }
  !!! ht <= propagated_safe_time_=1
  static_cast<int64_t>(ht.ToUint64() - propagated_safe_time_.ToUint64())=-319090687
  ht.PhysicalDiff(propagated_safe_time_)=-77902
  
  queue_.size()=0
  queue_=[]

All the logs: [image attachment]

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@shamanthchandra-yb

Also seen in a run on 2.25.0.0-b365 with ysql_disable_server_file_access = true. Please find the run report in the JIRA comments.

@Huqicheng
Contributor

This seems to be a Raft issue.

24. AddLeaderPending { ht: { physical: 1696696747484845 } op_id: 15.686819 }
25. AddLeaderPending { ht: { physical: 1696696747484886 } op_id: 15.686820 }
26. AddLeaderPending { ht: { physical: 1696696747484887 logical: 1 } op_id: 15.686821 }
27. AddLeaderPending { ht: { physical: 1696696747484889 logical: 1 } op_id: 15.686822 }
28. UpdatePropagatedSafeTimeOnLeader { ht_lease: { time: { physical: 1696696747485783 } lease: { physical: 1696696748602112 } } safe_time: { physical: 1696696747484844 logical: 4095 } }
29. Aborted { ht: { physical: 1696696747484889 logical: 1 } op_id: 15.686822 }
30. Aborted { ht: { physical: 1696696747484887 logical: 1 } op_id: 15.686821 }
31. Aborted { ht: { physical: 1696696747484886 } op_id: 15.686820 }
32. Aborted { ht: { physical: 1696696747484845 } op_id: 15.686819 }

This peer is the leader of term 15, and its majority-replicated ht_lease appears to be 1696696747485783. However, the leader of term 17 was able to write ops at 1696696747406942, which breaks the leader lease assumption.
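For illustration, here is a minimal standalone sketch (simplified, hypothetical types; not the YugabyteDB implementation) of the invariant that the mvcc.cc FATAL enforces: a new operation's hybrid time must be strictly greater than any safe time already returned under a leader lease. Plugging in the values from the FATAL above reproduces the violation:

#include <cstdint>
#include <iostream>

// Hypothetical, simplified stand-in for a hybrid time (physical microseconds + logical part).
struct SimpleHybridTime {
  int64_t physical_us;
  int64_t logical;
};

bool LessOrEqual(const SimpleHybridTime& a, const SimpleHybridTime& b) {
  return a.physical_us < b.physical_us ||
         (a.physical_us == b.physical_us && a.logical <= b.logical);
}

int main() {
  // Values taken from the FATAL message in the issue description.
  // Safe time already returned by the term-15 leader under its ht lease:
  const SimpleHybridTime max_safe_time_returned_with_lease{1696696747484844, 4095};
  // Hybrid time of the new operation from the term-17 leader (op id 17.686821):
  const SimpleHybridTime new_op_ht{1696696747406942, 0};

  std::cout << "physical diff (us): "
            << new_op_ht.physical_us - max_safe_time_returned_with_lease.physical_us
            << "\n";  // -77902, matching ht.PhysicalDiff(...) in the log

  // Invariant: a new operation's hybrid time must be above every safe time that
  // was already handed out under a lease; otherwise reads done at that safe time
  // could miss the new write. Here the invariant does not hold, so the real code
  // logs "New operation's hybrid time too low" and crashes.
  if (LessOrEqual(new_op_ht, max_safe_time_returned_with_lease)) {
    std::cout << "FATAL: New operation's hybrid time too low\n";
  }
  return 0;
}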

@Huqicheng
Contributor

Huqicheng commented Nov 29, 2024

Root cause:

When a new leader is elected and replicates its NoOp, it also replicates its new leader lease to the followers, even when the old leader's lease has not yet expired. The lease-update path quoted below stores the new majority-replicated lease expirations and, when the reset flags are set, clears the old leader's leases:

  majority_replicated_lease_expiration_ = majority_replicated_data.leader_lease_expiration;
  majority_replicated_ht_lease_expiration_.store(majority_replicated_data.ht_lease_expiration,
                                                 std::memory_order_release);
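  // (Editorial note) When the reset flags checked below are set, the old
  // leader's regular and hybrid-time leases are cleared immediately, without
  // waiting for them to expire.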

  if (flags.Test(SetMajorityReplicatedLeaseExpirationFlag::kResetOldLeaderLease)) {
    LOG_WITH_PREFIX(INFO)
        << "Revoked old leader " << old_leader_lease_.holder_uuid << " lease: "
        << MonoDelta(old_leader_lease_.expiration - CoarseMonoClock::now());
    old_leader_lease_.Reset();
  }

  if (flags.Test(SetMajorityReplicatedLeaseExpirationFlag::kResetOldLeaderHtLease)) {
    LOG_WITH_PREFIX(INFO)
        << "Revoked old leader " << old_leader_ht_lease_.holder_uuid << " ht lease: "
        << HybridTime::FromMicros(old_leader_ht_lease_.expiration);
    old_leader_ht_lease_.Reset();
  }

Assume TS-1 is the leader of term 1, and TS-2 is then elected leader of term 2 by TS-2 and TS-3; TS-2 replicates its NoOp and its own leader lease expiration to TS-3.
This leads to incorrect behaviour when leader lease revocation is enabled:
TS-1 has its own leader lease expiring at T1.
TS-2 has its own leader lease expiring at T2 and still tracks the old leader lease expiring at T1.
TS-3 only tracks the old leader lease expiring at T2.
(where T1 < T2)

If the load balancer has TS-2 transfer leadership to TS-3 at some time earlier than T1, TS-3 can immediately revoke the lease expiring at T2 (the one from TS-2) and start writing new data, even though TS-1's lease at T1 has not expired. This can cause the "hybrid time too low" issue. A timeline sketch follows.
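To make the timeline concrete, here is a minimal runnable sketch (illustrative values and variable names only; not YugabyteDB code), assuming the single old-leader-lease slot described above:

#include <cstdint>
#include <iostream>

int main() {
  // Illustrative lease expirations (arbitrary time units), with T1 < T2.
  const int64_t T1 = 1000;  // TS-1's own leader ht lease expiration (term 1)
  const int64_t T2 = 1500;  // TS-2's own leader ht lease expiration (term 2)

  // What each peer tracks as the "old leader" ht lease once TS-2 (term 2)
  // has replicated its NoOp together with its own lease expiration:
  const int64_t ts2_old_leader_lease = T1;  // TS-2 still remembers TS-1's lease
  const int64_t ts3_old_leader_lease = T2;  // TS-3 only ever learned TS-2's lease

  // The load balancer makes TS-2 transfer leadership to TS-3 before T1.
  const int64_t transfer_time = 800;

  // On the transfer, TS-2 consents to revoking its own lease (T2), so TS-3
  // clears the only old-leader lease it tracks and starts assigning hybrid
  // times immediately -- even though TS-1's lease (T1) is still active and
  // safe times may already have been returned under it.
  const bool ts1_lease_still_active = transfer_time < T1;
  const bool ts3_waits_for_t1 = (ts3_old_leader_lease == T1);  // false here

  if (ts1_lease_still_active && !ts3_waits_for_t1) {
    std::cout << "TS-3 writes before T1=" << T1
              << " (it only knew about, and revoked, T2=" << T2 << ");\n"
              << "a new op can get a hybrid time below a safe time already\n"
              << "returned under the old lease -> 'hybrid time too low'.\n";
  }

  (void)ts2_old_leader_lease;  // tracked correctly on TS-2, but irrelevant once
                               // leadership moves to TS-3
  return 0;
}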

Solution

TS-2 shouldn't replicate its own leader lease until the old leader lease it tracks has expired or has been correctly revoked.
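A minimal sketch of that rule, with hypothetical names and simplified state (not the actual fix): the new leader withholds its own lease until the old leader lease it tracks has expired or been revoked:

#include <cstdint>
#include <iostream>

// Hypothetical, simplified bookkeeping on the new leader (e.g. TS-2); names do
// not correspond to the YugabyteDB sources.
struct LeaseInfo {
  int64_t expiration = 0;  // 0 means "no lease tracked"
  bool revoked = false;
};

struct NewLeaderState {
  LeaseInfo old_leader_ht_lease;  // the previous leader's lease still tracked here

  // Proposed guard: do not start replicating our own leader lease to followers
  // until the old leader's lease has expired or been revoked with its consent.
  bool CanReplicateOwnLease(int64_t now) const {
    return old_leader_ht_lease.expiration == 0 ||
           old_leader_ht_lease.revoked ||
           old_leader_ht_lease.expiration <= now;
  }
};

int main() {
  NewLeaderState ts2;
  ts2.old_leader_ht_lease = {/*expiration=*/1000, /*revoked=*/false};  // TS-1's lease (T1)

  // At now=800 (< T1), TS-2 keeps withholding its own lease, so followers such
  // as TS-3 keep tracking T1 instead of overwriting it with T2.
  std::cout << (ts2.CanReplicateOwnLease(/*now=*/800)
                    ? "replicate own lease"
                    : "hold off: old leader lease still active")
            << "\n";
  return 0;
}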

cc @spolitov @rthallamko3

rthallamko3 assigned rthallamko3 and unassigned Huqicheng on Dec 11, 2024