Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YCQL] TS crash if a built-in function argument is a collection #1944

Closed
OlegLoginov opened this issue Jul 31, 2019 · 8 comments
Closed

[YCQL] TS crash if a built-in function argument is a collection #1944

OlegLoginov opened this issue Jul 31, 2019 · 8 comments
Assignees
Labels
kind/bug This issue is a bug

Comments

@OlegLoginov
Copy link
Contributor

OlegLoginov commented Jul 31, 2019

Scenario:

cqlsh:a> create table ts (h int primary key, t text);
cqlsh:a> insert into ts (h, t) values (1, cast([1,2] as text));
NoHostAvailable: 

Error in the log: F.... eval_const.cc:200] Illegal datatype conversion: varint to unknown
Code:

CHECKED_STATUS Executor::PTExprToPB(const PTConstVarInt *const_pt, QLValuePB *const_pb,
                                    bool negate) {
  switch (const_pt->expected_internal_type()) {    // <<<<<< == VALUE_NOT_SET

    default:
      LOG(FATAL) << Substitute("Illegal datatype conversion: $0 to $1", "varint",
                               QLValue::ToCQLString(const_pt->expected_internal_type()));
  }

Reason: PTConstVarInt::expected_internal_type_ == {yb::InternalType} VALUE_NOT_SET.

@OlegLoginov
Copy link
Contributor Author

Call stack:

__pthread_kill 0x00007fff7798c23e
pthread_kill 0x00007fff77a42c1c
abort 0x00007fff778f51c9
yb::(anonymous namespace)::DumpStackTraceAndExit() logging.cc:179
google::LogMessage::SendToLog() 0x000000010e0077b2
google::LogMessage::Flush() 0x000000010e007eb5
google::LogMessageFatal::~LogMessageFatal() 0x000000010e00c9bf
google::LogMessageFatal::~LogMessageFatal() 0x000000010e008ea9
yb::ql::Executor::PTExprToPB(yb::ql::PTExprConst<(yb::QLValuePB::ValueCase)16, (yb::DataType)12, std::__1::shared_ptr<std::__1::basic_string<char, std::__1::char_traits<char>, yb::internal::ArenaAllocatorBase<char, yb::internal::ArenaTraits> > >, yb::ql::PTLiteralString> const*, yb::QLValuePB*, bool) eval_const.cc:200
yb::ql::Executor::PTConstToPB(std::__1::shared_ptr<yb::ql::PTExpr> const&, yb::QLValuePB*, bool) eval_const.cc:59
yb::ql::Executor::PTExprToPB(yb::ql::PTCollectionExpr const*, yb::QLValuePB*) eval_const.cc:435
yb::ql::Executor::PTConstToPB(std::__1::shared_ptr<yb::ql::PTExpr> const&, yb::QLValuePB*, bool) eval_const.cc:92
yb::ql::Executor::PTExprToPB(std::__1::shared_ptr<yb::ql::PTExpr> const&, yb::QLExpressionPB*) eval_expr.cc:39
yb::ql::Executor::TSCallToPB(yb::ql::PTBcall const*, yb::QLExpressionPB*) eval_bcall.cc:104
yb::ql::Executor::PTExprToPB(yb::ql::PTBcall const*, yb::QLExpressionPB*) eval_bcall.cc:29
yb::ql::Executor::PTExprToPB(std::__1::shared_ptr<yb::ql::PTExpr> const&, yb::QLExpressionPB*) eval_expr.cc:61
yb::ql::Executor::ColumnArgsToPB(yb::ql::PTDmlStmt const*, yb::QLWriteRequestPB*) eval_col.cc:54
yb::ql::Executor::ExecPTNode(yb::ql::PTInsertStmt const*, yb::ql::TnodeContext*) executor.cc:1026
yb::ql::Executor::ExecTreeNode(yb::ql::TreeNode const*) executor.cc:247
yb::ql::Executor::Execute(yb::ql::ParseTree const&, yb::ql::StatementParameters const&) executor.cc:177
yb::ql::Executor::ExecuteAsync(yb::ql::ParseTree const&, yb::ql::StatementParameters const&, yb::Callback<void (yb::Status const&, std::__1::shared_ptr<yb::ql::ExecutedResult> const&)>) executor.cc:91
yb::ql::QLProcessor::ExecuteAsync(yb::ql::ParseTree const&, yb::ql::StatementParameters const&, yb::Callback<void (yb::Status const&, std::__1::shared_ptr<yb::ql::ExecutedResult> const&)>) ql_processor.cc:354
yb::ql::QLProcessor::RunAsync(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, yb::ql::StatementParameters const&, yb::Callback<void (yb::Status const&, std::__1::shared_ptr<yb::ql::ExecutedResult> const&)>, bool) ql_processor.cc:372
yb::cqlserver::CQLProcessor::ProcessRequest(yb::cqlserver::QueryRequest const&) cql_processor.cc:312
yb::cqlserver::CQLProcessor::ProcessRequest(yb::cqlserver::CQLRequest const&) cql_processor.cc:220
yb::cqlserver::CQLProcessor::ProcessCall(std::__1::shared_ptr<yb::rpc::InboundCall>) cql_processor.cc:174
yb::cqlserver::CQLServiceImpl::Handle(std::__1::shared_ptr<yb::rpc::InboundCall>) cql_service.cc:142
yb::rpc::ServicePoolImpl::Handle(std::__1::shared_ptr<yb::rpc::InboundCall>) service_pool.cc:243
yb::rpc::InboundCall::InboundCallTask::Run() inbound_call.cc:207
yb::rpc::(anonymous namespace)::Worker::Execute() thread_pool.cc:100
yb::Thread::SuperviseThread(void*) thread.cc:690
_pthread_body 0x00007fff77a40305
_pthread_start 0x00007fff77a4326f
thread_start 0x00007fff77a3f415

Local vars:

Signal = SIGABRT (signal SIGABRT)
this = {yb::ql::Executor * | 0x1139d06b8} 0x00000001139d06b8
const_pt = {const yb::ql::PTConstVarInt * | 0x111bba7e8} 0x0000000111bba7e8
 yb::ql::PTExpr0<yb::QLValuePB::kVarintValue, yb::VARINT, yb::ql::PTExpr> = {yb::ql::PTExpr0<yb::QLValuePB::kVarintValue, yb::VARINT, yb::ql::PTExpr>} 
  yb::ql::PTExpr = {yb::ql::PTExpr} 
   yb::ql::TreeNode = {yb::ql::TreeNode} 
   op_ = {yb::ql::ExprOperator} kConst
   ql_op_ = {yb::QLOperator} QL_OP_NOOP
   internal_type_ = {yb::InternalType} kVarintValue
   ql_type_ = {SharedPtr} std::__1::shared_ptr<yb::QLType>::element_type @ 0x0000000113f87198 strong=5 weak=1
   expected_internal_type_ = {yb::InternalType} VALUE_NOT_SET      <<<<<<<<<<<<<<
 yb::ql::PTLiteralString = {yb::ql::PTLiteralString} 
const_pb = {yb::QLValuePB * | 0x1141a1050} 0x00000001141a1050
negate = {bool} 

@OlegLoginov OlegLoginov self-assigned this Jul 31, 2019
@OlegLoginov OlegLoginov added the kind/bug This issue is a bug label Jul 31, 2019
@OlegLoginov
Copy link
Contributor Author

OlegLoginov commented Aug 12, 2019

The failed statement:
> insert into ts (h, t) values (1, cast([1,2] as text));

In the statement analysis we have the following part of parser tree:
[1,2] ~ PTCollection -> PTConstVarInt, PTConstVarInt

To handle some built-in functions the function arguments must be saved into PB (to be sent to the server & processed on the DocDB level).
And the saving into PB failed, because the PTConstVarInt was not analyzed and the internal type was not set.

CHECKED_STATUS Executor::PTExprToPB(const PTConstVarInt *const_pt, QLValuePB *const_pb,
                                    bool negate) {
  switch (const_pt->expected_internal_type()) {    // <<<<<< == VALUE_NOT_SET

    default:
      LOG(FATAL) << Substitute("Illegal datatype conversion: $0 to $1", "varint",
                               QLValue::ToCQLString(const_pt->expected_internal_type()));
  }

To fix it all elements in Collections must be analyzed to setup the expected type.
The code:

CHECKED_STATUS PTCollectionExpr::Analyze(SemContext *sem_context) {
  RETURN_NOT_OK(CheckOperator(sem_context));
  const shared_ptr<QLType>& expected_type = sem_context->expr_expected_ql_type();

  // If no expected type is given, use type inferred during parsing
  if (expected_type->main() == DataType::UNKNOWN_DATA) {
+    // Call Analyze() for child PT nodes to init expected type.
+    SemState sem_state(sem_context, QLType::Create(UNKNOWN_DATA), InternalType::VALUE_NOT_SET);
+    for (auto& key : keys_) {
+      RETURN_NOT_OK(key->Analyze(sem_context));
+    }
+    for (auto& value : values_) {
+      RETURN_NOT_OK(value->Analyze(sem_context));
+    }
+    sem_state.ResetContextState();
+
    return CheckExpectedTypeCompatibility(sem_context);
  }

@OlegLoginov
Copy link
Contributor Author

OlegLoginov commented Aug 12, 2019

But there is a tricky case.. See the example:

> create table tu (h frozen<udt1>, r int, v int, PRIMARY KEY((h), r));
> insert into tu (h, r, v) values ({a:1}, 2, 3);
> select * from tu where token(h) = token({a:1});

Here we have built-in function: token()
And the argument {a:1} ~ PTCollection (map) -> PTRef, PTConstVarInt
So, the parser expects that 'a' is a column name. But in fact sometimes it is not a name (in the example 'a' is a UDT field in fact).
With the fix suggested above (to call Analyze() for all elements inside PTCollection) the statement failed:
... token({a:1});
with error: 'a' - Undefined Column. Column doesn't exist.
From grammar parser point of view the 'a' is a_expr = c_expr = columnref -> PTRef

To prevent the statement
> insert into tu (h, r, v) values ({a:1}, 2, 3);
failure the idea - to exclude the class PTRef from the Analyze() calls:

  if (expected_type->main() == DataType::UNKNOWN_DATA) {
     // Call Analyze() for child PT nodes to init expected type.
     SemState sem_state(sem_context, QLType::Create(UNKNOWN_DATA), InternalType::VALUE_NOT_SET);
     for (auto& key : keys_) {
+      if (key->opcode() != TreeNodeOpcode::kPTRef) {
         RETURN_NOT_OK(key->Analyze(sem_context));
+      }
     }
     for (auto& value : values_) {
+      if (value->opcode() != TreeNodeOpcode::kPTRef) {
         RETURN_NOT_OK(value->Analyze(sem_context));
+      }
     }
     sem_state.ResetContextState();

And to raise a error if the unresolved PTRef is tried to be save into PB.

Possible alternative ways I see:

  • to change PTRef::Analyze() (which now failed if cannot find a column with the provided name)
  • to change the parser and to replace PTRef by another node (extended PTName? or a new node class)
  • review & change the grammar for a separate name

Let's discuss the solution.. CC: @frozenspider @nocaway

@frozenspider
Copy link
Contributor

I think that PTRef::Analyze() should be able to properly handle the case when it referenced UDT field, not table column. This is by far more logical than doing if (key->opcode() != TreeNodeOpcode::kPTRef) special case exception in Analyze of higher-level expression.

Also, I'm not sure that token({a:1}) is a valid syntax at all, since we don't (or not always, for other functions?) have UDT schema for reference

@OlegLoginov
Copy link
Contributor Author

Also, I'm not sure that token({a:1}) is a valid syntax at all, since we don't (or not always, for other functions?) have UDT schema for reference

That's valid. The case works in YB & Cassandra:

 create type udt1 (a int);
 create table tu (h frozen<udt1>, r int, v int, PRIMARY KEY((h), r));
 insert into tu (h, r, v) values ({a:1}, 2, 3);
 select * from tu where token(h) = token({a:1});

@OlegLoginov
Copy link
Contributor Author

[Neil]: I have similar thoughts with Alex for this bug. It's a bug in "cql_cast" error check and not a bug in argument datatype.
<<< From Alex: Also, I'm not sure that token({a:1}) is a valid syntax at all, since we don't (or not always, for other functions?) have UDT schema for reference >>>

  • As you can see in the "bfql" code, "cql_cast" only supports primitive type. However, its signature was using ANYTYPE (For example: "INT16, {ANYTYPE, INT16}"). And that is a bug.
  • Using ANYTYPE is ok, but we should error check in "pt_bcall.cc" by adding the following check "if (*name_ == "cql_cast")".
CHECKED_STATUS PTBcall::Analyze(SemContext *sem_context) {
  RETURN_NOT_OK(CheckOperator(sem_context));
  .......
  if (!bfdecl->is_server_operator()) {
    // Use the builtin-function opcode since this is a regular builtin call.
    bfopcode_ = static_cast<int32_t>(bfopcode);

    if (*name_ == "cql_cast" || *name_ == <Any other functions that have this same bug>) {
      // Argument must be of primitive type for these operators.
      for (const auto &expr : exprs) {
        if (expr->ql_type()->IsParametric()) {
          return sem_context->Error(expr, "Input argument must be of primitive type",
                                    ErrorCode::INVALID_ARGUMENTS);
        }
      }
    }

  } else {
     ...........................
  }

@OlegLoginov
Copy link
Contributor Author

[Oleg]: If I understand correctly you want to undo changes in PBCollection::Anaylize() and just prevent the parametric args processing in PBCall
[Neil]: yes.

[Oleg]: Do we need such syntax in such case? ToJson( [1,2] )
[Neil]: I think you don't have to (Customers didn't asked for this?). You can raise the same errors as for cqlcast().

OlegLoginov added a commit that referenced this issue Aug 20, 2019
…lection.

Summary:
Fixed PTBCall::Analyze() - to prevent usage of parametric types with failed BCalls.
Plus minor clean-up in macros.

Test Plan:
ybd --cxx-test ql-insert-table-test
ybd --cxx-test ql-query-test
ybd --cxx-test ql_ql-select-expr-test

Reviewers: mihnea, dmitry, neil

Reviewed By: neil

Subscribers: alex, eng

Differential Revision: https://phabricator.dev.yugabyte.com/D6986
@OlegLoginov
Copy link
Contributor Author

Fixed. (See commit above.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug This issue is a bug
Projects
None yet
Development

No branches or pull requests

2 participants