Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add four letter commands to keeper #28981

Merged
merged 49 commits into from
Nov 24, 2021

Conversation

JackyWoo
Copy link
Contributor

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

We need to implement similar commands in clickhouse-keeper: https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html#sc_zkCommands

Detailed description / Documentation draft:

  1. When startup register 4 letter word command with a <code, command_executor> map. The code is name reinterpreted int32.
  2. When new connection comes, first extract the first 4 characters and check whether it is 4lw command, if yes, run the command.
  3. add some methods to KeeperStorageDispatcher to collect server info.

Because the first request of a connection is hand shake request whose first 4 characters is request length and the length is fixed 44 or 45. All 4lw code is larger than 2^24 or lower than 0, so collision never happens.

The pr is still WIP the first commit only add 'ruok' command.

The pr will close #28649 #26976.

@robot-clickhouse robot-clickhouse added doc-alert pr-feature Pull request with new product feature labels Sep 13, 2021
@alesapin alesapin self-assigned this Sep 13, 2021
@JackyWoo
Copy link
Contributor Author

@alesapin if the pr is ok, I will work on the rest part.

Copy link
Member

@alesapin alesapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM. Let's implement several more interesting commands!

src/Coordination/FourLetterCommand.h Outdated Show resolved Hide resolved
src/Coordination/FourLetterCommand.h Outdated Show resolved Hide resolved
src/Coordination/FourLetterCommand.h Outdated Show resolved Hide resolved
src/Coordination/FourLetterCommand.h Outdated Show resolved Hide resolved
@alesapin
Copy link
Member

@JackyWoo How is it going?)

@CLAassistant
Copy link

CLAassistant commented Sep 28, 2021

CLA assistant check
All committers have signed the CLA.

@JackyWoo
Copy link
Contributor Author

@alesapin Sorry for late reply. Because We are in the biggest event of the year, the work delayed. I will add more commands in two or three weaks.

@JackyWoo JackyWoo force-pushed the add_4_letter_words_commands branch from f72f77d to fe28a52 Compare September 30, 2021 06:18
@JackyWoo JackyWoo force-pushed the add_4_letter_words_commands branch from 73a1321 to a60663e Compare October 27, 2021 14:05
@JackyWoo
Copy link
Contributor Author

JackyWoo commented Oct 27, 2021

Add 4lw command: mntr, srst, conf

  1. mntr output almost like Zookeeper mntr output
  2. srst can reset mntr counter metrics
  3. conf print keeper_server configuration
  4. Merge mastr branch

Change log

  1. add some command with test
  2. add some interfaces to KeeprServer, StateMachine, KeeperDispatcher which provides some useful information
  3. add KeeperSettings to encapsulate keeper configuration

Copy link
Member

@alesapin alesapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Are you going to add more commands in this PR? Or we can merge it and continue in a new one?

src/Coordination/CoordinationSettings.cpp Outdated Show resolved Hide resolved
@JackyWoo
Copy link
Contributor Author

JackyWoo commented Oct 28, 2021

@alesapin I intend to add most Zookeeper 4lw commands and white list feature. What is your opinion?
Adn shell we merge the pr first? Because merging conflicts may be a hard work.

@alesapin
Copy link
Member

@alesapin I intend to add most Zookeeper 4lw commands and white list feature. What is your opinion? Adn shell we merge the pr first? Because merging conflicts may be a hard work.

Great idea. Let's continue in this PR, I'm not going to create huge changes in Coordination/ directory.

@JackyWoo
Copy link
Contributor Author

JackyWoo commented Oct 29, 2021

Great idea. Let's continue in this PR, I'm not going to create huge changes in Coordination/ directory.

Ok, thank you.

In mntr command calculating approximately data size will traverse the whole state machine which may block user requests. And there is another solution embeding data size logic into SnapshotableHashTable. But it require element has method such as size which will introduce limitation to SnapshotableHashTable.

So what's you opinion?

@alesapin
Copy link
Member

alesapin commented Oct 29, 2021

Great idea. Let's continue in this PR, I'm not going to create huge changes in Coordination/ directory.

Ok, thank you.

In mntr command calculating approximately data size will traverse the whole state machine which may block user requests. And there is another solution embeding data size logic into SnapshotableHashTable. But it require element has method such as size which will introduce limitation to SnapshotableHashTable.

So what's you opinion?

I think it's Ok. SnapshotableHashTable has very limited use, so it shouldn't affect anything.

@JackyWoo
Copy link
Contributor Author

Ok, I will work on it.

@JackyWoo
Copy link
Contributor Author

JackyWoo commented Nov 1, 2021

@alesapin add sizeOf util functions to SnapshotableHashTable which were chosen at compile time according to whether element has a size method.

Copy link
Contributor Author

@JackyWoo JackyWoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work, thanks.

src/Coordination/FourLetterCommand.cpp Outdated Show resolved Hide resolved
tests/queries/shell_config.sh Outdated Show resolved Hide resolved
Copy link
Contributor Author

@JackyWoo JackyWoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved

src/Coordination/Keeper4LWInfo.h Show resolved Hide resolved
src/Coordination/KeeperDispatcher.cpp Show resolved Hide resolved
tests/queries/shell_config.sh Outdated Show resolved Hide resolved
@alesapin
Copy link
Member

@Mergifyio update

@mergify
Copy link
Contributor

mergify bot commented Nov 22, 2021

update

☑️ Nothing to do

  • -closed [:pushpin: update requirement]
  • #commits-behind>0 [:pushpin: update requirement]

@alesapin
Copy link
Member

Finally fixed:

/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:49.659512 [ 26617 ] {} <Fatal> BaseDaemon: ########################################
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:49.666285 [ 26617 ] {} <Fatal> BaseDaemon: (version 21.12.1.8825, build id: 7EAA75FE01C502B751BD36D980EEF0442297DA43) (from thread 2697) (no query) Received signal Aborted (6)
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:49.670013 [ 26617 ] {} <Fatal> BaseDaemon: 
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:49.672388 [ 26617 ] {} <Fatal> BaseDaemon: Stack trace: 0x7f2f5414218b 0x7f2f54121859 0x1c06502e 0x209e3b08 0x209e3a75 0xd3aa51b 0x1cc42f59 0xd3bd04d 0xd3bcf85 0x1debd12b 0x1debd2ee 0x1ed33e2e 0x1ee27c27 0x1ee254ec 0x7f2f542f7609 0x7f2f5421e293
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:49.673487 [ 26617 ] {} <Fatal> BaseDaemon: 3. raise @ 0x4618b in /usr/lib/x86_64-linux-gnu/libc-2.31.so
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:49.683707 [ 26617 ] {} <Fatal> BaseDaemon: 4. abort @ 0x25859 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:49.701779 [ 26617 ] {} <Fatal> BaseDaemon: 5. ./obj-x86_64-linux-gnu/../base/daemon/BaseDaemon.cpp:454: terminate_handler() @ 0x1c06502e in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:49.721355 [ 26617 ] {} <Fatal> BaseDaemon: 6. ./obj-x86_64-linux-gnu/../contrib/libcxxabi/src/cxa_handlers.cpp:61: std::__terminate(void (*)()) @ 0x209e3b08 in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:49.729291 [ 26617 ] {} <Fatal> BaseDaemon: 7. ./obj-x86_64-linux-gnu/../contrib/libcxxabi/src/cxa_handlers.cpp:0: std::terminate() @ 0x209e3a75 in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.229384 [ 26617 ] {} <Fatal> BaseDaemon: 8. ? @ 0xd3aa51b in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.278621 [ 26617 ] {} <Fatal> BaseDaemon: 9. ./obj-x86_64-linux-gnu/../src/IO/WriteBufferFromPocoSocket.cpp:89: ? @ 0x1cc42f59 in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.361595 [ 26617 ] {} <Fatal> BaseDaemon: 10. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/memory:2474: std::__1::__shared_count::__release_shared() @ 0xd3bd04d in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.393948 [ 26617 ] {} <Fatal> BaseDaemon: 11. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/memory:2517: std::__1::__shared_weak_count::__release_shared() @ 0xd3bcf85 in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.492828 [ 26617 ] {} <Fatal> BaseDaemon: 12. ./obj-x86_64-linux-gnu/../src/Server/KeeperTCPHandler.cpp:632: DB::KeeperTCPHandler::~KeeperTCPHandler() @ 0x1debd12b in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.630555 [ 26617 ] {} <Fatal> BaseDaemon: 13. ./obj-x86_64-linux-gnu/../src/Server/KeeperTCPHandler.cpp:630: DB::KeeperTCPHandler::~KeeperTCPHandler() @ 0x1debd2ee in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.665356 [ 26617 ] {} <Fatal> BaseDaemon: 14.1. inlined from ./obj-x86_64-linux-gnu/../contrib/libcxx/include/atomic:1072: int std::__1::__cxx_atomic_fetch_sub<int>(std::__1::__cxx_atomic_base_impl<int>*, int, std::__1::memory_order)
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.678260 [ 26617 ] {} <Fatal> BaseDaemon: 14.2. inlined from ../contrib/libcxx/include/atomic:1725: std::__1::__atomic_base<int, true>::fetch_sub(int, std::__1::memory_order)
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.680215 [ 26617 ] {} <Fatal> BaseDaemon: 14.3. inlined from ../contrib/libcxx/include/atomic:1760: std::__1::__atomic_base<int, true>::operator--()
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.688048 [ 26617 ] {} <Fatal> BaseDaemon: 14.4. inlined from ../contrib/poco/Net/src/TCPServerDispatcher.cpp:236: Poco::Net::TCPServerDispatcher::endConnection()
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.692676 [ 26617 ] {} <Fatal> BaseDaemon: 14. ../contrib/poco/Net/src/TCPServerDispatcher.cpp:119: Poco::Net::TCPServerDispatcher::run() @ 0x1ed33e2e in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.710864 [ 26617 ] {} <Fatal> BaseDaemon: 15. ./obj-x86_64-linux-gnu/../contrib/poco/Foundation/src/ThreadPool.cpp:213: Poco::PooledThread::run() @ 0x1ee27c27 in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.734268 [ 26617 ] {} <Fatal> BaseDaemon: 16.1. inlined from ./obj-x86_64-linux-gnu/../contrib/poco/Foundation/include/Poco/SharedPtr.h:156: Poco::SharedPtr<Poco::Runnable, Poco::ReferenceCounter, Poco::ReleasePolicy<Poco::Runnable> >::assign(Poco::Runnable*)
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.739546 [ 26617 ] {} <Fatal> BaseDaemon: 16.2. inlined from ../contrib/poco/Foundation/include/Poco/SharedPtr.h:208: Poco::SharedPtr<Poco::Runnable, Poco::ReferenceCounter, Poco::ReleasePolicy<Poco::Runnable> >::operator=(Poco::Runnable*)
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.756160 [ 26617 ] {} <Fatal> BaseDaemon: 16. ../contrib/poco/Foundation/src/Thread_POSIX.cpp:360: Poco::ThreadImpl::runnableEntry(void*) @ 0x1ee254ec in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.758285 [ 26617 ] {} <Fatal> BaseDaemon: 17. start_thread @ 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.767138 [ 26617 ] {} <Fatal> BaseDaemon: 18. __clone @ 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.948280 [ 26616 ] {} <Fatal> BaseDaemon: 8. ? @ 0xd3aa51b in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:52.964452 [ 26616 ] {} <Fatal> BaseDaemon: 9. ./obj-x86_64-linux-gnu/../src/IO/WriteBufferFromPocoSocket.cpp:89: ? @ 0x1cc42f59 in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.020427 [ 26616 ] {} <Fatal> BaseDaemon: 10. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/memory:2474: std::__1::__shared_count::__release_shared() @ 0xd3bd04d in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.078542 [ 26616 ] {} <Fatal> BaseDaemon: 11. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/memory:2517: std::__1::__shared_weak_count::__release_shared() @ 0xd3bcf85 in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.106284 [ 26616 ] {} <Fatal> BaseDaemon: 12. ./obj-x86_64-linux-gnu/../src/Server/KeeperTCPHandler.cpp:632: DB::KeeperTCPHandler::~KeeperTCPHandler() @ 0x1debd12b in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.146149 [ 26616 ] {} <Fatal> BaseDaemon: 13. ./obj-x86_64-linux-gnu/../src/Server/KeeperTCPHandler.cpp:630: DB::KeeperTCPHandler::~KeeperTCPHandler() @ 0x1debd2ee in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.161391 [ 26616 ] {} <Fatal> BaseDaemon: 14.1. inlined from ./obj-x86_64-linux-gnu/../contrib/libcxx/include/atomic:1072: int std::__1::__cxx_atomic_fetch_sub<int>(std::__1::__cxx_atomic_base_impl<int>*, int, std::__1::memory_order)
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.163655 [ 26616 ] {} <Fatal> BaseDaemon: 14.2. inlined from ../contrib/libcxx/include/atomic:1725: std::__1::__atomic_base<int, true>::fetch_sub(int, std::__1::memory_order)
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.165854 [ 26616 ] {} <Fatal> BaseDaemon: 14.3. inlined from ../contrib/libcxx/include/atomic:1760: std::__1::__atomic_base<int, true>::operator--()
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.170214 [ 26616 ] {} <Fatal> BaseDaemon: 14.4. inlined from ../contrib/poco/Net/src/TCPServerDispatcher.cpp:236: Poco::Net::TCPServerDispatcher::endConnection()
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.172304 [ 26616 ] {} <Fatal> BaseDaemon: 14. ../contrib/poco/Net/src/TCPServerDispatcher.cpp:119: Poco::Net::TCPServerDispatcher::run() @ 0x1ed33e2e in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.200445 [ 26616 ] {} <Fatal> BaseDaemon: 15. ./obj-x86_64-linux-gnu/../contrib/poco/Foundation/src/ThreadPool.cpp:213: Poco::PooledThread::run() @ 0x1ee27c27 in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.217797 [ 26616 ] {} <Fatal> BaseDaemon: 16.1. inlined from ./obj-x86_64-linux-gnu/../contrib/poco/Foundation/include/Poco/SharedPtr.h:156: Poco::SharedPtr<Poco::Runnable, Poco::ReferenceCounter, Poco::ReleasePolicy<Poco::Runnable> >::assign(Poco::Runnable*)
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.218933 [ 26616 ] {} <Fatal> BaseDaemon: 16.2. inlined from ../contrib/poco/Foundation/include/Poco/SharedPtr.h:208: Poco::SharedPtr<Poco::Runnable, Poco::ReferenceCounter, Poco::ReleasePolicy<Poco::Runnable> >::operator=(Poco::Runnable*)
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.220526 [ 26616 ] {} <Fatal> BaseDaemon: 16. ../contrib/poco/Foundation/src/Thread_POSIX.cpp:360: Poco::ThreadImpl::runnableEntry(void*) @ 0x1ee254ec in /usr/bin/clickhouse
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.222241 [ 26616 ] {} <Fatal> BaseDaemon: 17. start_thread @ 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.225577 [ 26616 ] {} <Fatal> BaseDaemon: 18. __clone @ 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.272362 [ 26617 ] {} <Fatal> BaseDaemon: Calculated checksum of the binary: 56E0317DB6CEECE62B1AAA9078259F89. There is no information about the reference checksum.
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:25:53.634259 [ 26616 ] {} <Fatal> BaseDaemon: Calculated checksum of the binary: 56E0317DB6CEECE62B1AAA9078259F89. There is no information about the reference checksum.
/var/log/clickhouse-server/clickhouse-server.log.2:2021.11.22 20:26:14.005281 [ 477 ] {} <Fatal> Application: Child process was terminated by signal 6.```

@alesapin
Copy link
Member

image
Bugs in old infrastructure

@alesapin
Copy link
Member

Stateless tests flaky check (address, actions) — fail: 4, passed: 93, unknown: 3 -- too many concurrent runs, it's ok.
Stress test (memory, actions) — Killed by signal (in clickhouse-server.log) -- reported: #31689
Stress test (thread, actions) — Test script failed -- unrelated, but strange failurе.
01622_defaults_for_url_engine -- unrelated, flaky.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-feature Pull request with new product feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add 4 letters words to NuRaftKeeper
4 participants