Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YCQL] TTL enforcement delayed when non-default (weaker) read-consistency is used #2394

Closed
kmuthukk opened this issue Sep 22, 2019 · 5 comments
Assignees
Labels
area/docdb YugabyteDB core features kind/bug This issue is a bug

Comments

@kmuthukk
Copy link
Collaborator

kmuthukk commented Sep 22, 2019

On the slack channel, a user reported an issue with a program using Consistency(gocql.One) and TTL enforcement for a sample Golang test program with YCQL using gocql driver.

With defaults (i.e. strong read consistency), Consistency(gocql.Quorum), the issue doesn't arise.

Even with the weaker gocql.One the issue seems to be transient. When the tablet/shard gets additional writes (to even unrelated keys) after an item has expired, then the value is correctly deemed as expired.

So in practice impact of issue is minimal, and the values will get purged in due course. Nevertheless, we'll look into what's causing the delay in enforcement for the read consistency level One.

Sample test case:

https://gist.github.com/kmuthukk/f182837117f7af553a0d500c54e89ee3

To run:

5:12 $ go run <filename>
Created keyspace ybdemo
Created table ybdemo.employee
Inserted data: INSERT INTO ybdemo.employee(id, name, age, language)  VALUES (1, 'John+1', 1, 'Go')
Inserted data: INSERT INTO ybdemo.employee(id, name, age, language)  VALUES (2, 'John+2', 2, 'Go')
Inserted data: INSERT INTO ybdemo.employee(id, name, age, language)  VALUES (3, 'John+3', 3, 'Go')
Inserted data: INSERT INTO ybdemo.employee(id, name, age, language)  VALUES (4, 'John+4', 4, 'Go')
Inserted data: INSERT INTO ybdemo.employee(id, name, age, language)  VALUES (5, 'John+5', 5, 'Go')
Sleeping 6 seconds
Query for id=1 returned: Row[John+1, 1, Go]   ---> NO ROW SHOULD BE RETURNED HERE.

Note:

  1. Problem does not happen if:
iter := session.Query(`SELECT name, age, language FROM ybdemo.employee WHERE id = 1`)
     .Consistency(gocql.One).Iter()

is changed to use default consistency (quorum)

iter := session.Query(`SELECT name, age, language FROM ybdemo.employee WHERE id = 1`)
     .Iter()

or

iter := session.Query(`SELECT name, age, language FROM ybdemo.employee WHERE id = 1`)
     .Consistency(gocql.Quorum).Iter()
  1. If after the expiry, an additional row is inserted to the same tablet (as will normally be the case in an active cluster), then too, the enforcement is correctly done.
@kmuthukk kmuthukk added kind/bug This issue is a bug area/docdb YugabyteDB core features labels Sep 22, 2019
@OlegLoginov
Copy link
Contributor

Root cause: in case of Consistency Level == 'One' (== YB_CONSISTENT_PREFIX) any tablet replica can be used to get READ (SELECT) request result. So, any FOLLOWER (non-leader replica) can be used to process the read request.

When the read request is processed by Leader (in case of Consistency Level == QUORUM == YB_STRONG) the read time is correct. But if the request is processed by a Follower - the read time is invalid.
This happens due to invalid values returned by yb::tablet::MvccManager::SafeTimeForFollower method.

In case of 9 sec pause between write and read I see in the logs the following time values.

I0926 15:49:52.594996 112869376 tablet.cc:1993] HT SafeTime = { physical: 15695021 92 594 987 }
I0926 15:49:52.595088 112869376 tablet.cc:1994] HT SafeTimeForFollower = { physical: 15695021 83 590 826 }

So MvccManager::SafeTimeForFollower provides invalid (9 sec in past) read time, while the MvccManager::SafeTime returns correct value.

@OlegLoginov
Copy link
Contributor

Stack trace:

yb::tablet::MvccManager::SafeTimeForFollower(yb::HybridTime, std::__1::chrono::time_point<yb::CoarseMonoClock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > >) const mvcc.cc:272
yb::tablet::Tablet::DoGetSafeTime(yb::StronglyTypedBool<yb::tablet::RequireLease_Tag>, yb::HybridTime, std::__1::chrono::time_point<yb::CoarseMonoClock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > >) const tablet.cc:1987
yb::tserver::ReadContext::PickReadTime(yb::server::Clock*) tablet_service.cc:1010
yb::tserver::TabletServiceImpl::Read(yb::tserver::ReadRequestPB const*, yb::tserver::ReadResponsePB*, yb::rpc::RpcContext) tablet_service.cc:1199
yb::tserver::TabletServerServiceIf::Handle(std::__1::shared_ptr<yb::rpc::InboundCall>) tserver_service.service.cc:154
yb::rpc::ServicePoolImpl::Handle(std::__1::shared_ptr<yb::rpc::InboundCall>) service_pool.cc:246
yb::rpc::ServicePool::Handle(std::__1::shared_ptr<yb::rpc::InboundCall>) service_pool.cc:460
yb::rpc::Messenger::Handle(std::__1::shared_ptr<yb::rpc::InboundCall>) messenger.cc:519
yb::rpc::Proxy::DoAsyncRequest(yb::rpc::RemoteMethod const*, google::protobuf::Message const&, google::protobuf::Message*, yb::rpc::RpcController*, std::__1::function<void ()>, bool) proxy.cc:184
yb::rpc::Proxy::AsyncRequest(yb::rpc::RemoteMethod const*, google::protobuf::Message const&, google::protobuf::Message*, yb::rpc::RpcController*, std::__1::function<void ()>) proxy.cc:118
yb::tserver::TabletServerServiceProxy::ReadAsync(yb::tserver::ReadRequestPB const&, yb::tserver::ReadResponsePB*, yb::rpc::RpcController*, std::__1::function<void ()>) tserver_service.proxy.cc:45
yb::client::internal::ReadRpc::CallRemoteMethod() async_rpc.cc:661
yb::client::internal::AsyncRpc::SendRpcToTserver() async_rpc.cc:257
yb::client::internal::AsyncRpcBase<yb::tserver::ReadRequestPB, yb::tserver::ReadResponsePB>::SendRpcToTserver() async_rpc.cc:332
yb::client::internal::TabletInvoker::Execute(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool) tablet_rpc.cc:205
yb::client::internal::AsyncRpc::SendRpc() async_rpc.cc:147
yb::client::internal::Batcher::FlushBuffersIfReady() batcher.cc:564
yb::client::internal::Batcher::FlushAsync(boost::function<void (yb::Status const&)>) batcher.cc:275
yb::client::YBSession::FlushAsync(boost::function<void (yb::Status const&)>) session.cc:131
yb::ql::Executor::FlushAsync() executor.cc:1451
yb::ql::Executor::ExecuteAsync(yb::ql::ParseTree const&, yb::ql::StatementParameters const&, yb::Callback<void (yb::Status const&, std::__1::shared_ptr<yb::ql::ExecutedResult> const&)>) executor.cc:92
yb::ql::QLProcessor::ExecuteAsync(yb::ql::ParseTree const&, yb::ql::StatementParameters const&, yb::Callback<void (yb::Status const&, std::__1::shared_ptr<yb::ql::ExecutedResult> const&)>) ql_processor.cc:354
yb::ql::Statement::ExecuteAsync(yb::ql::QLProcessor*, yb::ql::StatementParameters const&, yb::Callback<void (yb::Status const&, std::__1::shared_ptr<yb::ql::ExecutedResult> const&)>) const statement.cc:90
yb::cqlserver::CQLProcessor::ProcessRequest(yb::cqlserver::ExecuteRequest const&) cql_processor.cc:306
yb::cqlserver::CQLProcessor::ProcessCall(std::__1::shared_ptr<yb::rpc::InboundCall>) cql_processor.cc:174
yb::cqlserver::CQLServiceImpl::Handle(std::__1::shared_ptr<yb::rpc::InboundCall>) cql_service.cc:142
yb::rpc::ServicePoolImpl::Handle(std::__1::shared_ptr<yb::rpc::InboundCall>) service_pool.cc:246
yb::rpc::InboundCall::InboundCallTask::Run() inbound_call.cc:207
yb::rpc::(anonymous namespace)::Worker::Execute() thread_pool.cc:99
yb::Thread::SuperviseThread(void*) thread.cc:690

@OlegLoginov
Copy link
Contributor

OlegLoginov commented Sep 26, 2019

In logs:

I0925 20:54:34.856376 103337984 mvcc.cc:272] T edb324fcb0b7473388dd25f2a3696381 P c2dbcd03e21c48118c14d85feb2c5871: SafeTimeForFollower(<min>), result = { safe_time: { physical: 1569434065810305 } source: kPropagated }

I0925 20:54:34.856406 103337984 tablet_service.cc:1271] Read time: { read: { physical: 1569434065810305 } local_limit: { physical: 1569434065810305 } global_limit: { physical: 1569434065810305 } in_txn_limit: <invalid> serial_no: 0 }, safe: { physical: 1569434065810305 }

@OlegLoginov
Copy link
Contributor

Implementation:

HybridTime MvccManager::SafeTimeForFollower(
    HybridTime min_allowed, CoarseTimePoint deadline) const {
  std::unique_lock<std::mutex> lock(mutex_);
  SafeTimeWithSource result;
  auto predicate = [this, &result, min_allowed] {
    // last_replicated_ is updated earlier than propagated_safe_time_, so because of concurrency it
    // could be greater than propagated_safe_time_.
    if (propagated_safe_time_ > last_replicated_) {
      result.safe_time = propagated_safe_time_;               <<<<<<<<<<<<<<<<<<<<<<<<<<<
      result.source = SafeTimeSource::kPropagated;
    } else {
      result.safe_time = last_replicated_;
      result.source = SafeTimeSource::kLastReplicated;
    }
    return result.safe_time >= min_allowed;
  };
  if (deadline == CoarseTimePoint::max()) {
    cond_.wait(lock, predicate);
  } else if (!cond_.wait_until(lock, deadline, predicate)) {
    return HybridTime::kInvalid;
  }
  VLOG_WITH_PREFIX(1) << "SafeTimeForFollower(" << min_allowed
                      << "), result = " << result.ToString();

Use propagated_safe_time_ which is a moment in past (not a current moment of time).

@mbautin
Copy link
Collaborator

mbautin commented Sep 26, 2019

@OlegLoginov: propagated_safe_time_ is supposed to get updated with every Raft heartbeat message (UpdateConsensus) received from the leader, even if there are no writes. Is it not happening in this case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/docdb YugabyteDB core features kind/bug This issue is a bug
Projects
None yet
Development

No branches or pull requests

3 participants