-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support hll_raw_agg in Aggregate Function #832
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some doc/help for this function?
be/src/exprs/aggregate_functions.cpp
Outdated
DCHECK(!src.is_null); | ||
DCHECK_EQ(src.len, HLL_SETS_BYTES_NUM); | ||
|
||
StringVal result(ctx, src.len+1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StringVal result(ctx, src.len+1); | |
StringVal result(ctx, src.len + 1); |
FunctionContext* ctx, | ||
StringVal* dst) { | ||
hll_union_agg_init(ctx, dst); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All calculations can be based on full type, so dst will be initted to a hll full type .
261759f
to
0ac20a6
Compare
be/src/exprs/aggregate_functions.cpp
Outdated
@@ -25,6 +23,7 @@ | |||
#include "runtime/string_value.h" | |||
#include "runtime/datetime_value.h" | |||
#include "exprs/anyval_util.h" | |||
#include "exprs/aggregate_functions.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd better not move #include "exprs/aggregate_functions.h"
, we put this first to make sure that all need includes is written in header files
select HLL_CARDINALITY(uv_set) from test_uv; | ||
select dt, HLL_CARDINALITY(uv_set) from test_uv; | ||
select dt, HLL_CARDINALITY(uv) from (select dt, HLL_RAW_AGG(set1) as uv from test group by dt) tmp; | ||
select dt, HLL_UNION_AGG(set1) as uv from test group by dt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that these three queries don't lead to same result.
1 would return many rows for one dt
value
2 and 3 queries would group by dt
column and return one row for each value
So, you should split them into different examples
be/src/udf/udf.h
Outdated
return (doris::HllDataType)(ptr[0]); | ||
} | ||
|
||
uint8_t &getHllData(int idx) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't return a reference value, you can add a function named setHllData
be/src/exprs/aggregate_functions.h
Outdated
return hll_algorithm(dst.ptr, dst.len); | ||
} | ||
static int64_t hll_algorithm(const HllVal &dst) { | ||
return hll_algorithm(dst.ptr+1, dst.len-1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return hll_algorithm(dst.ptr+1, dst.len-1); | |
return hll_algorithm(dst.ptr + 1, dst.len - 1); |
0ac20a6
to
6276230
Compare
be/src/udf/udf.h
Outdated
@@ -762,6 +766,45 @@ struct LargeIntVal : public AnyVal { | |||
} | |||
}; | |||
|
|||
struct HllVal : public StringVal { | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think HllFullVal is a better name, because your HllVal only represent the Full type.
be/src/exprs/aggregate_functions.cpp
Outdated
} | ||
|
||
void AggregateFunctions::hll_union_parse_and_cal(HllSetResolver& resolver, StringVal* dst) { | ||
void AggregateFunctions::hll_union_parse_and_cal(HllSetResolver& resolver, HllVal* dst) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hll_union_parse_and_cal can be a member of HllVal, and it support a function that merge other type or src StringVal, so we will get a better encapsulation for hll.
be/src/exprs/aggregate_functions.cpp
Outdated
|
||
for (int i = 0; i < src.len; ++i) { | ||
dst->ptr[i] = std::max(dst->ptr[i], src.ptr[i]); | ||
dst->setHllData(i, std::max(dst->getHllData(i), src.getHllData(i))); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HLLVal can support a member of merge other HllVal.
be/src/udf/udf.h
Outdated
DCHECK(!is_null); | ||
|
||
return (doris::HllDataType)(ptr[0]); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return doris::HllDataType::HLL_DATA_FULL.
hll_raw_agg Function aggregates the HLL type value, and return the HLL type value
6276230
to
f62f7c1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LTMG
…he#832) Also, change some config variable names of file cache.
Support hll_raw_agg in Aggregate Function
hll_raw_agg Function aggregates the HLL type value, and return the HLL type value