UniPS: createS3Lock meets nullptr when FAP #9394

Closed
CalvinNeo opened this issue Sep 2, 2024 · 2 comments · Fixed by #9655
Labels
affects-8.5 This bug affects the 8.5.x(LTS) versions. component/storage impact/panic severity/major type/bug The issue is confirmed as a bug.

Comments

@CalvinNeo
Member

CalvinNeo commented Sep 2, 2024

Bug Report

Please answer these questions before submitting your issue. Thanks!

This panic is the direct cause of the TiFlash restart. After TiFlash restarted, everything was OK, and the error did not occur again.


[2024/08/28 07:53:21.636 +00:00] [ERROR] [BaseDaemon.cpp:416] ["Address not mapped to object."] [source=BaseDaemon] [thread_id=2007]
[2024/08/28 07:53:21.636 +00:00] [ERROR] [BaseDaemon.cpp:399] ["Address: NULL pointer."] [source=BaseDaemon] [thread_id=2007]
[2024/08/28 07:53:21.636 +00:00] [ERROR] [BaseDaemon.cpp:371] ["(from thread 1953) Received signal Segmentation fault(11)."] [source=BaseDaemon] [thread_id=2007]


[2024/08/28 07:53:21.636 +00:00] [ERROR] [BaseDaemon.cpp:563] ["
  0xaaaad36f7a2c\tfaultSignalHandler(int, siginfo_t*, void*) [tiflash+122124844]
                \tlibs/libdaemon/src/BaseDaemon.cpp:214
  0xffff9594e850\t<unknown symbol> [linux-vdso.so.1+2128]
  0xaaaad472ac58\tDB::PS::V3::S3LockLocalManager::createS3Lock(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, DB::S3::S3FilenameView const&, unsigned long) [tiflash+139111512]
                \tdbms/src/Storages/Page/V3/Universal/S3LockLocalManager.cpp:229
  0xaaaad472a57c\tDB::PS::V3::S3LockLocalManager::createS3LockForWriteBatch(DB::UniversalWriteBatch&) [tiflash+139109756]
                \tdbms/src/Storages/Page/V3/Universal/S3LockLocalManager.cpp:173
  0xaaaad4720150\tDB::UniversalPageStorage::write(DB::UniversalWriteBatch&&, DB::PS::V3::PageType, std::__1::shared_ptr<DB::WriteLimiter> const&) const [tiflash+139067728]
                \tdbms/src/Storages/Page/V3/Universal/UniversalPageStorage.cpp:118
  0xaaaad462dc24\tDB::PageWriter::write(DB::WriteBatchWrapper&&, std::__1::shared_ptr<DB::WriteLimiter>) const [tiflash+138075172]
                \tdbms/src/Storages/Page/PageStorage.cpp:780
  0xaaaad3495588\tDB::DM::WriteBatches::writeLogAndData() [tiflash+119625096]
                \tdbms/src/Storages/DeltaMerge/WriteBatchesImpl.h:146
  0xaaaad3573f3c\tDB::DM::StableValueSpace::createFromCheckpoint(std::__1::shared_ptr<DB::Logger> const&, DB::DM::DMContext&, std::__1::shared_ptr<DB::UniversalPageStorage>, unsigned long, DB::DM::WriteBatches&) [tiflash+120536892]
                \tdbms/src/Storages/DeltaMerge/StableValueSpace.cpp:283
  0xaaaad34ef63c\tDB::DM::Segment::createTargetSegmentsFromCheckpoint(std::__1::shared_ptr<DB::Logger> const&, DB::DM::DMContext&, unsigned long, std::__1::vector<DB::DM::Segment::SegmentMetaInfo, std::__1::allocator<DB::DM::Segment::SegmentMetaInfo>> const&, DB::DM::RowKeyRange const&, std::__1::shared_ptr<DB::UniversalPageStorage>, DB::DM::WriteBatches&) [tiflash+119993916]
                \tdbms/src/Storages/DeltaMerge/Segment.cpp:490
  0xaaaad407d870\tDB::DM::DeltaMergeStore::buildSegmentsFromCheckpointInfo(std::__1::shared_ptr<DB::DM::DMContext> const&, DB::DM::RowKeyRange const&, std::__1::shared_ptr<DB::CheckpointInfo> const&) const [tiflash+132110448]
                \tdbms/src/Storages/DeltaMerge/DeltaMergeStore_Ingest.cpp:1140
  0xaaaad4068954\tDB::DM::DeltaMergeStore::buildSegmentsFromCheckpointInfo(DB::Context const&, DB::Settings const&, DB::DM::RowKeyRange const&, std::__1::shared_ptr<DB::CheckpointInfo> const&) [tiflash+132024660]
                \tdbms/src/Storages/DeltaMerge/DeltaMergeStore.h:401
  0xaaaad4c469ac\tDB::FastAddPeerImplWrite(DB::TMTContext&, DB::TiFlashRaftProxyHelper const*, unsigned long, unsigned long, std::__1::tuple<std::__1::shared_ptr<DB::CheckpointInfo>, std::__1::shared_ptr<DB::Region>, raft_serverpb::RaftApplyState, raft_serverpb::RegionLocalState>&&, unsigned long) [tiflash+144468396]
                \tdbms/src/Storages/KVStore/MultiRaft/Disagg/FastAddPeer.cpp:348
  0xaaaad4c47e30\tDB::FastAddPeerImpl(std::__1::shared_ptr<DB::FastAddPeerContext>, DB::TMTContext&, DB::TiFlashRaftProxyHelper const*, unsigned long, unsigned long, unsigned long) [tiflash+144473648]
                \tdbms/src/Storages/KVStore/MultiRaft/Disagg/FastAddPeer.cpp:455
  0xaaaad4c4b30c\tstd::__1::__function::__func<FastAddPeer::$_5, std::__1::allocator<FastAddPeer::$_5>, DB::FastAddPeerRes ()>::operator()() [tiflash+144487180]
                \tdbms/src/Storages/KVStore/MultiRaft/Disagg/FastAddPeer.cpp:670
  0xaaaad4c4f670\tstd::__1::packaged_task<DB::FastAddPeerRes ()>::operator()() [tiflash+144504432]
                \t/usr/lib/llvm-17/bin/../include/c++/v1/future:1891
  0xaaaad4c4f2dc\tDB::AsyncTasks<unsigned long, std::__1::function<DB::FastAddPeerRes ()>, DB::FastAddPeerRes>::addTaskWithCancel(unsigned long, std::__1::function<DB::FastAddPeerRes ()>, std::__1::function<void ()>)::'lambda'()::operator()() const [tiflash+144503516]
                \tdbms/src/Storages/KVStore/Utils/AsyncTasks.h:289
  0xaaaace47903c\tDB::ThreadPoolImpl<DB::ThreadFromGlobalPoolImpl<false>>::worker(std::__1::__list_iterator<DB::ThreadFromGlobalPoolImpl<false>, void*>) [tiflash+35622972]
                \t/usr/lib/llvm-17/bin/../include/c++/v1/__functional/function.h:517
  0xaaaace47bd94\tstd::__1::__function::__func<DB::ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<bool DB::ThreadPoolImpl<DB::ThreadFromGlobalPoolImpl<false>>::scheduleImpl<bool>(std::__1::function<void ()>, long, std::__1::optional<unsigned long>, bool)::'lambda0'()>(bool&&)::'lambda'(), std::__1::allocator<DB::ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<bool DB::ThreadPoolImpl<DB::ThreadFromGlobalPoolImpl<false>>::scheduleImpl<bool>(std::__1::function<void ()>, long, std::__1::optional<unsigned long>, bool)::'lambda0'()>(bool&&)::'lambda'()>, void ()>::operator()() [tiflash+35634580]
                \tdbms/src/Common/UniThreadPool.cpp:160
  0xaaaace477b38\tDB::ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) [tiflash+35617592]
                \t/usr/lib/llvm-17/bin/../include/c++/v1/__functional/function.h:517
  0xaaaace479f9c\tvoid* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void DB::ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, long, std::__1::optional<unsigned long>, bool)::'lambda0'()>>(void*) [tiflash+35626908]
                \tdbms/src/Common/UniThreadPool.cpp:160
  0xffff9201d5c8\tstart_thread [libc.so.6+513480]
                \t./nptl/./nptl/pthread_create.c:442"] [source=BaseDaemon] [thread_id=2007]

1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiFlash version? (Required)

based on v7.5.3 with FAP functions

@JaySon-Huang
Contributor

This only happens when FAP is enabled, which is not the default. Marking the severity as major.

ti-chi-bot bot added a commit that referenced this issue Sep 2, 2024
ref #8673, ref #9394

Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
@ti-chi-bot ti-chi-bot bot added the affects-8.5 This bug affects the 8.5.x(LTS) versions. label Nov 1, 2024
@CalvinNeo
Member Author

The cause is that UniPSService can initialize S3LockLocalManager with s3lock_client == nullptr, because s3lock_client may be initialized after UniPSService in Server.cpp.
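
To illustrate the initialization-order hazard described above, here is a minimal, self-contained sketch. It is not the TiFlash code and not necessarily how the linked fix works: `LockClient`, `LockManager`, `initClient`, and `createLock` are hypothetical stand-ins. The sketch assumes the manager can be constructed before its client exists, and shows one possible guard that reports the missing client as an error instead of dereferencing a null pointer.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for the S3 lock client; not the TiFlash interface.
struct LockClient
{
    bool sendLockRequest(const std::string & key) { return !key.empty(); }
};

// Hypothetical stand-in for a lock manager that may be created before its client.
class LockManager
{
public:
    // The client pointer starts empty because the client may not exist yet.
    LockManager() = default;

    // Called later during startup, once the client is available.
    void initClient(std::shared_ptr<LockClient> client)
    {
        assert(client != nullptr);
        std::lock_guard<std::mutex> guard(mtx);
        lock_client = std::move(client);
    }

    // An unguarded version would call lock_client->sendLockRequest(key)
    // directly and crash with SIGSEGV while the pointer is still null.
    // Here the null case becomes an exception the caller can handle or retry.
    bool createLock(const std::string & key)
    {
        std::shared_ptr<LockClient> client;
        {
            std::lock_guard<std::mutex> guard(mtx);
            client = lock_client; // take a copy under the lock
        }
        if (!client)
            throw std::runtime_error("lock client is not initialized yet");
        return client->sendLockRequest(key);
    }

private:
    std::mutex mtx;
    std::shared_ptr<LockClient> lock_client;
};
```

With this shape, a write that arrives before `initClient` has run fails with a recoverable error rather than a segmentation fault. Another option would be to reorder startup so the client always exists before the manager is constructed.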

@CalvinNeo CalvinNeo reopened this Nov 20, 2024
ti-chi-bot bot pushed a commit that referenced this issue Nov 20, 2024
close #9394

Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
ti-chi-bot bot pushed a commit that referenced this issue Nov 20, 2024
… (#9659)

close #9394

Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>

Co-authored-by: Calvin Neo <calvinneo1995@gmail.com>