-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DocDB] Improve Traces #10660
Closed
amitanandaiyer opened this issue
Nov 20, 2021
· 0 comments
· May be fixed by ryan-ally/yugabyte-db#200 or nyndyny/yugabyte-db#171
Closed
[DocDB] Improve Traces #10660
amitanandaiyer opened this issue
Nov 20, 2021
· 0 comments
· May be fixed by ryan-ally/yugabyte-db#200 or nyndyny/yugabyte-db#171
Labels
area/docdb
YugabyteDB core features
kind/bug
This issue is a bug
priority/medium
Medium priority issue
Comments
amitanandaiyer
added a commit
that referenced
this issue
Nov 21, 2021
…it can be used across all defined Rpcs Summary: 1) Define trace in RpcCommand so that it can be used across all defined RPCs - remove trace_ from transaction_rpc.cc because we already have it in the superclass 2) Add begin/end to demarcate the traces received from the server. 3) Plumb through the tracing for write ops. The initial traces in the handler threads were being missed earlier. 4) Adding traces to SharedLockManager. TO BE FIXED. Does not decode keys properly. 5) Add more tracing to the conflict resolution part 6) Add a tracing depth flag for controlling how many levels of children traces are to be printed. 7) Populate trace_buffer even upon write failures. Changes the write failure message to 23954 2021-11-12 22:57:42.347 UTC [12299] STATEMENT: UPDATE t SET value = 'NEW' WHERE key = 0 23955 ../../src/yb/yql/pgwrapper/pg_mini-test.cc:214: Failure 23956 Failed 23957 Bad status: Network error (yb/yql/pgwrapper/libpq_utils.cc:226): Execute of 'UPDATE t SET value = 'NEW' WHERE key = 0' failed: 7, message: ERROR: All transparent retries exhausted. Operation failed. Try again.: 84f3da10-d961-4251-97ef-96f197810acf Conflicts with higher priority transaction: 7420b446-874c-4ab c-8d77-c51b33f93008 23958 (pgsql error 40001) 23959 W1112 22:57:43.135255 11959 outbound_call.cc:380] RPC callback for RPC call yb.tserver.TabletServerService.GetTransactionStatus -> { remote: 127.0.0.5:19056 idx: 2 protocol: 0x00007f0a2ba62ea0 -> tcpc } , state=FINISHED_SUCCESS. took 0.135s Test Plan: jenkins. run a test rb -- --cxx-test pg_mini-test --gtest_filter PgMiniTest.FollowerReadsWithWrite --test_args --collect_end_to_end_traces=trust_args --ybclient_print_trace_every_n=1 --test_args --tracing_level=1 and look for the traces. Example output: I1113 06:14:35.862326 16976 async_rpc.cc:160] Write(tablet: 80956799d32543be8c74bbfd97cae8e9, num_ops: 1, num_attempts: 1, txn: ca97464e-1ee3-4dc4-a4a1-aaf746a4efd8, subtxn: [none]) took 8000us. Trace: .. 1113 06:14:35.854683 (+ 0us) async_rpc.cc:487] WriteRpc initiated .. 1113 06:14:35.854683 (+ 0us) async_rpc.cc:488] Tablet 80956799d32543be8c74bbfd97cae8e9 table yugabyte.t .. 1113 06:14:35.854683 (+ 0us) async_rpc.cc:169] SendRpc() called. .. 1113 06:14:35.854683 (+ 0us) tablet_rpc.cc:101] SelectTabletServer() .. 1113 06:14:35.854683 (+ 0us) tablet_rpc.cc:153] Selected { uuid: 53a481d52a984d1a8b7e96d249aff21f private: [host: "127.0.0.6" port: 26739] public: [host: "127.0.0.7.ip.yugabyte" port: 26739] cloud_info: placement_cloud: "cloud2" placement_region: "rack3" placement_zone: "zone" .. 1113 06:14:35.854683 (+ 0us) async_rpc.cc:547] SendRpcToTserver .. 1113 06:14:35.855683 (+ 1000us) async_rpc.cc:552] RpcDispatched Asynchronously .. 1113 06:14:35.861684 (+ 6001us) tablet_rpc.cc:308] Done(OK) .. 1113 06:14:35.861684 (+ 0us) async_rpc.cc:413] ProcessResponseFromTserver(Operation failed. Try again.: ca97464e-1ee3-4dc4-a4a1-aaf746a4efd8 Conflicts with higher priority transaction: 52ec72c4-c941-4f40-96ad-3478a3ca0528 (transaction error 3)) .. 1113 06:14:35.861684 (+ 0us) async_rpc.cc:415] Received from server: BEGIN .. 1113 06:14:35.855820 (+ 0us) inbound_call.cc:90] Created InboundCall .. 1113 06:14:35.855820 (+ 0us) service_pool.cc:180] Inserting onto call queue .. 1113 06:14:35.855820 (+ 0us) service_pool.cc:261] Handling call Write .. 1113 06:14:35.855820 (+ 0us) tablet_service.cc:1591] Start Write .. 1113 06:14:35.855820 (+ 0us) tablet.cc:2019] AcquireLocksAndPerformDocOperations .. 1113 06:14:35.855820 (+ 0us) tablet.cc:3776] Acquiring write permit .. 1113 06:14:35.856820 (+ 1000us) operation_counter.cc:187] ScopedRWOperation Tablet schema .. 1113 06:14:35.856820 (+ 0us) tablet.cc:3777] Acquiring write permit done .. 1113 06:14:35.856820 (+ 0us) operation_counter.cc:199] Reset null .. 1113 06:14:35.856820 (+ 0us) operation_counter.cc:199] Reset null .. 1113 06:14:35.856820 (+ 0us) operation_counter.cc:187] ScopedRWOperation RocksDB non-abortable read/write operations .. 1113 06:14:35.856820 (+ 0us) operation_counter.cc:199] Reset null .. 1113 06:14:35.856820 (+ 0us) shared_lock_manager.cc:255] Locking a batch of 3 keys .. 1113 06:14:35.856820 (+ 0us) shared_lock_manager.cc:295] Acquired a lock batch of 3 keys .. 1113 06:14:35.856820 (+ 0us) conflict_resolution.cc:966] ResolveTransactionConflicts .. 1113 06:14:35.857821 (+ 1001us) conflict_resolution.cc:401] FetchingTransactionStatus for 52ec72c4-c941-4f40-96ad-3478a3ca0528 .. 1113 06:14:35.857821 (+ 0us) running_transaction.cc:210] SendStatusRequest : 52ec72c4-c941-4f40-96ad-3478a3ca0528 .. 1113 06:14:35.857821 (+ 0us) conflict_resolution.cc:973] resolver->Resolve done .. 1113 06:14:35.857821 (+ 0us) operation_counter.cc:199] Reset null .. 1113 06:14:35.857821 (+ 0us) operation_counter.cc:199] Reset null .. .. Related trace: .. .. 1113 06:14:35.857821 (+ 0us) transaction_rpc.cc:104] GetTransactionStatus .. .. 1113 06:14:35.857821 (+ 0us) transaction_rpc.cc:105] tablet_id: "8d36c698a53745d99d0f4a15e6f52206" transaction_id: "R\354r\304\311AO@\226\2554x\243\312\005(" propagated_hybrid_time: 6704267574709948416 .. .. 1113 06:14:35.857821 (+ 0us) transaction_rpc.cc:56] SendRpc .. .. 1113 06:14:35.857821 (+ 0us) tablet_rpc.cc:101] SelectTabletServer() .. .. 1113 06:14:35.857821 (+ 0us) tablet_rpc.cc:153] Selected { uuid: 320973da990b4ce7ba3915b1075f0848 private: [host: "127.0.0.2" port: 22188] public: [host: "127.0.0.3.ip.yugabyte" port: 22188] cloud_info: placement_cloud: "cloud1" placement_region: "rack1" placement_zone: "zone" .. .. 1113 06:14:35.857821 (+ 0us) transaction_rpc.cc:131] InvokeAsync .. .. 1113 06:14:35.860821 (+ 3000us) transaction_rpc.cc:61] Finished .. .. 1113 06:14:35.860821 (+ 0us) tablet_rpc.cc:308] Done(OK) .. .. 1113 06:14:35.860821 (+ 0us) transaction_rpc.cc:124] InvokeCallback .. .. Related trace: .. .. 1113 06:14:35.857821 (+ 0us) outbound_call.cc:165] yb.tserver.TabletServerService.GetTransactionStatus. .. .. 1113 06:14:35.857821 (+ 0us) reactor.cc:736] Scheduled. .. .. 1113 06:14:35.857821 (+ 0us) outbound_call.cc:441] Queued. .. .. 1113 06:14:35.857821 (+ 0us) outbound_call.cc:451] Call Sent. .. .. 1113 06:14:35.860821 (+ 3000us) outbound_call.cc:392] Response received. .. .. 1113 06:14:35.860821 (+ 0us) outbound_call.cc:393] from { remote: 127.0.0.3:22188 idx: 7 protocol: 0x00007f529eaa6ea0 -> tcpc } .. .. 1113 06:14:35.860821 (+ 0us) outbound_call.cc:354] Callback called asynchronously. END. .. .. Related trace: .. .. 1113 06:14:35.854683 (+ 0us) outbound_call.cc:165] yb.tserver.TabletServerService.Write. .. .. 1113 06:14:35.855683 (+ 1000us) reactor.cc:736] Scheduled. .. .. 1113 06:14:35.855683 (+ 0us) outbound_call.cc:441] Queued. .. .. 1113 06:14:35.855683 (+ 0us) outbound_call.cc:451] Call Sent. .. .. 1113 06:14:35.861684 (+ 6001us) outbound_call.cc:392] Response received. .. .. 1113 06:14:35.861684 (+ 0us) outbound_call.cc:393] from { remote: 127.0.0.7:26739 idx: 0 protocol: 0x00007f3147488cc0 -> tcp } .. .. 1113 06:14:35.861684 (+ 0us) outbound_call.cc:354] Callback called asynchronously. Reviewers: hbhanawat, rthallam, rsami Reviewed By: rsami Subscribers: ybase, bogdan Differential Revision: https://phabricator.dev.yugabyte.com/D13938
yugabyte-ci
added
kind/bug
This issue is a bug
priority/medium
Medium priority issue
labels
Jun 8, 2022
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area/docdb
YugabyteDB core features
kind/bug
This issue is a bug
priority/medium
Medium priority issue
Jira Link: DB-734
Description
add traces to better understand transactions/conflict resolution and rpcs.
The text was updated successfully, but these errors were encountered: