This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

[NSE-429] TPC-DS Q14a/b get slowed down within setting spark.oap.sql.columnar.sortmergejoin.lazyread=true #432

Merged (15 commits) on Aug 4, 2021

Conversation

zhztheplayer
Collaborator

No description provided.

@github-actions

#429

@zhztheplayer zhztheplayer marked this pull request as ready for review August 4, 2021 07:26
@zhztheplayer zhztheplayer changed the title [NSE-429] WIP: TPC-DS Q14a/b get slowed down within setting spark.oap.sql.columnar.sortmergejoin.lazyread=true [NSE-429] TPC-DS Q14a/b get slowed down within setting spark.oap.sql.columnar.sortmergejoin.lazyread=true Aug 4, 2021
@zhztheplayer
Collaborator Author

Ready to merge once the CI checks pass.

@@ -317,19 +288,23 @@ class TypedLazyLoadRelationColumn<DataType, enable_if_string_like<DataType>>
return arrow::Status::OK();
Collaborator

The patch below should optimize the null check:

diff --git a/native-sql-engine/cpp/src/codegen/common/relation_column.h b/native-sql-engine/cpp/src/codegen/common/relation_column.h
index 33bf151d..66e78601 100644
--- a/native-sql-engine/cpp/src/codegen/common/relation_column.h
+++ b/native-sql-engine/cpp/src/codegen/common/relation_column.h
@@ -216,6 +216,8 @@ class TypedLazyLoadRelationColumn<DataType, enable_if_number_or_decimal<DataType
     in_ = in;
     field_id_ = field_id;
+    AdvanceTo(0);
+    has_null_ = TypedRelationColumn<DataType>::HasNull();
     return arrow::Status::OK();
   };

   void AdvanceTo(int array_id) override {
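
The idea behind the suggested patch appears to be computing the "has any null" flag once at initialization, so that per-row null checks can skip the validity lookup when the column contains no nulls. Below is a minimal, self-contained sketch of that pattern. It does not use Arrow or the project's real classes: `CachedNullColumn`, `Init`, and `IsNull` are hypothetical names chosen for illustration, and a `std::vector<bool>` stands in for the validity bitmap.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a relation column; NOT the project's actual class.
class CachedNullColumn {
 public:
  // validity[i] == false means row i is null (assumed convention for this sketch).
  void Init(std::vector<bool> validity) {
    validity_ = std::move(validity);
    // Scan once up front and cache whether any null exists at all.
    has_null_ = false;
    for (bool valid : validity_) {
      if (!valid) {
        has_null_ = true;
        break;
      }
    }
  }

  bool HasNull() const { return has_null_; }

  bool IsNull(std::size_t row) const {
    assert(row < validity_.size());
    // Fast path: when the cached flag is false, the validity lookup is skipped.
    return has_null_ && !validity_[row];
  }

 private:
  std::vector<bool> validity_;
  bool has_null_ = false;
};
```

In a join loop that calls `IsNull` for every probed row, the cached flag turns the common no-null case into a single boolean test. Whether this matches the exact behavior of `TypedRelationColumn<DataType>::HasNull()` is an assumption.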

@zhouyuan zhouyuan merged commit f9308b8 into oap-project:master Aug 4, 2021
zhouyuan added a commit that referenced this pull request Aug 5, 2021
* Minor: Remove debug code for PR#387 (#431)

* [NSE-436] Fix for Arrow Data Source test suite (#437)

Closes #436

* [NSE-254]Solve the redundant arrow library issue (#440)

* [NSE-254]Issue0410 jar size (#441)

* [NSE-254]Solve the redundant arrow library issue

* Remove mvn jar plugin 3.2.0

* fix packaging (#442)

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

Co-authored-by: Yuan Zhou <yuan.zhou@intel.com>

* [NSE-207] Fix aggregate and refresh UT test script (#426)

* fix expr id difference in Partial Aggregate

* add fallback to window

* refresh ut

* use stol instead of stoi

* enable ut full test

* [NSE-429] TPC-DS Q14a/b get slowed down within setting spark.oap.sql.columnar.sortmergejoin.lazyread=true (#432)

* wip

* debug commit

* wip

* discard proxy pattern

* fix

* fix

* fix

* fix unexpected O(n^2) time

* fix1

* fix2

* Add a bunch of caches; Adjust prefetch batch length to 16

* format

* fix

* disable spark.oap.sql.columnar.sortmergejoin.lazyread by default

* fix

Co-authored-by: Hongze Zhang <hongze.zhang@intel.com>
Co-authored-by: Wei-Ting Chen <weiting.chen@intel.com>
Co-authored-by: Rui Mo <rui.mo@intel.com>