v1.2.609
What's Changed
Exciting New Features ✨
- feat: support recursive cte used in multiple places by @xudong963 in #15835
- feat(query): support distinct from for hash join by @Dousir9 in #15838
- feat: Implement Default for CatalogInfo by @Xuanwo in #15844
- feat(query): enable udf in udf call by @sundy-li in #15839
- feat(query): support row fetch for merge into by @Dousir9 in #15859
- feat: support filling uuid when loading data and uuid is missing/empty in some rows. by @youngsofun in #15871
- feat: stream consuming for `copy into LOCATION from` stmt by @dantengsky in #15882
- feat: publish databend-common-ast to crates.io by @andylokandy in #15905
- feat: Persist Meta-Service State Machine on Disk by @drmingdrmer in #15772
- feat: failsafe tool `fuse_amend` by @dantengsky in #15929
- feat: refactor create share endpoint ddl by @lichuang in #15937
- feat: add built-in udfs in config by @BohuTANG in #15938
- feat(planner): support pruning columns for merge into when insert only by @Dousir9 in #15948
- feat: clickhouse handler no longer support insert with format. by @youngsofun in #15952
- feat(query): Array lambda function support outer scope columns by @b41sh in #15957
- feat: support real-time retrieval of profiles from admin API (part 1) by @dqhl76 in #15958
- feat: refactor create database from share ddl by @lichuang in #15950
- feat: pretty print backtrace by @andylokandy in #15913
- feat: allow tuple field name to contain '_'. by @youngsofun in #15965
- feat(query): alter user support modify current password by @TCeason in #15962
- feat: update sort algorithm using loser tree for multi sort merge by @forsaken628 in #15869
- feat: orc support missing fields. by @youngsofun in #15970
- feat: orc add option missing_field_as. by @youngsofun in #15974
- feat(executor): support error profiling for cluster mode by @zhang2014 in #15969
- feat(query): implement HAVERSINE by @kkk25641463 in #15971
- feat(query): implement st_length/st_distance by @kkk25641463 in #15982
- feat: config option for syncing disk cache data by @dantengsky in #15984
- feat: add tenant&queryid to udf client by @BohuTANG in #15987
- feat: Reducing column.clone overhead in transform aggregate by introducing InputColumns by @forsaken628 in #15991
- feat: support real-time retrieval of profiles from admin API (part 2) by @dqhl76 in #15975
- feat: parquet add option `missing_field_as`. by @youngsofun in #15993
- feat: http handler limit body size of each response to 10MB. by @youngsofun in #15960
- feat: add NameResolutionSuggest to enhancement table name case sensitive error by @forsaken628 in #15889
- feat: refactor share spec location and format by @lichuang in #15989
- feat(query): add setting sort_spilling_batch_bytes by @sundy-li in #16019
- feat: change nan/inf to NaN/Infinity in http handler results. by @youngsofun in #16017
- feat: pretty print backtrace on panic by @andylokandy in #16024
- feat(ast): add TokenKind::{Get, Put} by @andylokandy in #16020
- feat(query): add new setting enable_dst_hour_fix by @TCeason in #16022
- feat: Change 'inf' to 'Infinity' when unloading CSV/TSV by @youngsofun in #16028
- feat(query): ST_CONTAINS by @kkk25641463 in #15994
- refactor: Refactor aggregate hashtable, replace &[Column] with InputColumns by @forsaken628 in #16038
- feat(query): ST_SETSRID/ST_NPOINTS by @kkk25641463 in #16035
- feat: new table function `set_cache_capacity` by @dantengsky in #16016
- feat: support must change password option for create user by @b41sh in #16031
- feat: snapshots generated in multi txn has same prev_snapshot_id by @SkyFan2002 in #16044
- feat(query): add option [ IGNORE | RESPECT NULLS ] functionality for window rank function(first_value, last_value, nth_value) by @TCeason in #15919
- feat: add function name to udf client header by @everpcpc in #16053
- feat: back compatibility of old share db by @lichuang in #16056
- feat: log file with local time by @everpcpc in #16064
- feat: add share integration test by @lichuang in #16051
- feat(query): impl LocalShuffle by @Freejww in #16055
- feat(query): add setting enable_strict_datetime_parser by @TCeason in #16067
- feat: improve selectivity accuracy by @xudong963 in #16069
- feat: optimize loser tree, disable peek top2 by @forsaken628 in #15979
- feat(storage): add table function clustering_statistics by @zhyass in #16081
- feat: limit auto cast when loading from column store. by @youngsofun in #16082
- feat(query): PartitionsShuffleKind::ConsistentHash support by @forsaken628 in #16094
- feat(http): expose queries queue length to /v1/status by @flaneur2020 in #16112
- feat(storage): refactor recluster by @zhyass in #16070
- feat(query): support json null for nullable response by @everpcpc in #16120
- feat(query): hash partitioning in window by @Freejww in #16090
- feat: generate accurate histogram by @xudong963 in #16018
- feat(query): add sql variable support by @sundy-li in #16134
- feat(query): support parse create dictionary stmt by @Winnie-Hong0927 in #16137
- feat(storage): refactor compact by @zhyass in #16119
- feat: add histogram info to fuse_statistic table by @xudong963 in #16141
- feat(query): add user function admin api by @zhang2014 in #16146
- feat: add `is_attach` column to `system.tables` and disable index creation for attached tables by @dantengsky in #16166
- feat(planner): unify execution of DML statements (MERGE, UPDATE, DELETE) by @Dousir9 in #16060
- feat(query): Support `json_array_agg` function by @b41sh in #16169
- feat: add share catalog by @lichuang in #16172
- feat: explain analyze output partial infos to debug cardinality estimator by @xudong963 in #16185
- feat: transfer leader command for meta-service by @drmingdrmer in #16198
- feat: meta-service transfer_leader: add response and default value by @drmingdrmer in #16201
- feat: allow casting int to uint for string functions by @andylokandy in #16210
- feat(query): Support materialized view match algorithm by @b41sh in #16023
- feat: grant view to share by @lichuang in #16186
- feat(query): ST_TRANSFORM by @kkk25641463 in #15992
- feat: introduce new setting `enable_last_snapshot_location_hint` by @dantengsky in #16226
Thoughtful Bug Fix 🔧
- fix(query): list user stage not check privilege by @TCeason in #15800
- fix: fix sequence used in function calls by @lichuang in #15814
- fix: multi-tbl insert with lateral flatten panics by @dantengsky in #15822
- fix: uuid() as default value result in copy error. by @youngsofun in #15837
- fix: function uuid() should be non_deterministic. by @youngsofun in #15840
- fix: stream illegal after source rename by @zhyass in #15843
- fix: parse comment in raw value by @andylokandy in #15834
- fix(query): to_timestamp should always return err if parse err by @TCeason in #15850
- fix(query): decimal div overflow by @sundy-li in #15856
- fix(query): fix sum rewriter by @sundy-li in #15870
- fix: merge into unresolved conflict by @zhyass in #15884
- fix(query): window function out of bounds access panic by @shamb0 in #15860
- fix: stream error after alter table by @zhyass in #15907
- fix(query): fix expr stack overflow [part 1] by @zhang2014 in #15790
- fix(jwks): refresh the jwks store on key id not found by @flaneur2020 in #15921
- fix(query): fix cluster mode stack overflow by @zhang2014 in #15927
- fix: remove VisitorWithParent, add more delete/update by subquery test by @lichuang in #15526
- fix: non-recursive cte reference itself by @xudong963 in #15941
- fix(query): fix stackoverflow for ci release mode by @zhang2014 in #15944
- fix(query): window panic when max_block_size is small by @TCeason in #15949
- fix: when upgrade snapshot v002 to v003, put kvs after expire index by @drmingdrmer in #15947
- fix(meta): fix reserved number err by @TCeason in #15955
- fix(storage): stream show columns maybe error by @zhyass in #15964
- fix: update orc-rust to fix error when read columns of array. by @youngsofun in #15968
- fix(query): fix tuple inner fields name with quotes by @b41sh in #15973
- fix(query): udf function support lambda arguments by @b41sh in #15981
- fix: wrong arrow schema when fuse engine read parquet by @SkyFan2002 in #15997
- fix(query): fix drop table column with quotes by @b41sh in #16006
- fix(query): fix incorrect fast_memcmp function by @sundy-li in #16008
- fix(query): coalesce continue loop when arg is null by @TCeason in #16002
- fix(query): unify udf allow list validator by @sundy-li in #16012
- fix: arrow schema in parquet reader by @SkyFan2002 in #16004
- fix(query): fix create table as query with null type by @b41sh in #16041
- fix(query): fix ambiguous time by @TCeason in #16046
- fix: output format missing '-' for '-Infinity'. by @youngsofun in #16052
- fix(query): if role not store ownership skip revoke ownership from role by @TCeason in #16047
- fix(query): pad information_schema.views.view_definition by @TCeason in #16065
- fix: bitmap with empty buffer stands for empty. by @youngsofun in #16066
- fix(query): if timestamp is some, no need to init parsed by @TCeason in #16078
- fix(query): fix read decimal by @sundy-li in #16080
- fix(query): fix session manager dead lock if call instance status by @zhang2014 in #16088
- fix: fix native decompress binary offsets out of bounds by @b41sh in #16085
- fix(query): fix incorrect derive_stats for window and limit plan by @sundy-li in #16087
- fix: wrong prev snapshot id by @SkyFan2002 in #16092
- fix: error message in data. by @youngsofun in #16089
- fix(storage): recluster maybe endless loop by @zhyass in #16096
- fix(query): fix subquery with function failed by @b41sh in #16103
- fix(query): cast string to decimal respect numeric_cast_option setting by @sundy-li in #16101
- fix: tenant tables with stream by @zhyass in #16115
- fix(query): fix stackoverflow in collect_statistics by @zhang2014 in #16123
- fix(query): revert window hash partition by @Freejww in #16129
- fix(query): fix window hash partition optimization by @Freejww in #16136
- fix: deadlock with export() and build_snapshot() by @drmingdrmer in #16138
- fix: vacuum dropped table in parallel by @dantengsky in #16139
- fix: drop attached table that can no longer reach table data by @dantengsky in #16125
- fix(query): revert find_eq_and_or_filter by @TCeason in #16148
- fix(query): fix add column default value is indeterministic expression by @b41sh in #16153
- fix(query): fix in operator convert to subquery cast failed by @b41sh in #16159
- fix: explain analyze lack of statistics by @dqhl76 in #16162
- fix: meta-service: data inconsistency risk by @drmingdrmer in #16175
- fix(query): reduce get table meta call by @TCeason in #16170
- fix: properly ignore column statistic from meta v2 if string contains non-utf8 by @andylokandy in #16180
- fix(query): support optimize or filter in system.tables by @TCeason in #16165
- fix(query): fix wasm udf runtime create with code by @b41sh in #16191
- fix: skip column statistics entirely if deserialization failed by @andylokandy in #16192
- fix(query): to_year/monday... should respect dst by @TCeason in #16195
- fix(query): rounding decimal multiply results by @sundy-li in #16196
- fix: backward compatibility of table statistics by @dantengsky in #16200
- fix(query): remove deserialize_cluster_stats by @sundy-li in #16204
- fix(query): fix index out of bounds with constant expr by @zhang2014 in #16206
- fix(query): cannot use fully qualified names with views by @TCeason in #16223
- fix(query): fix unexpected panic message by @zhang2014 in #16221
- fix: block_entry.memory_size by @forsaken628 in #16230
Code Refactor 🎉
- refactor: move databend-meta binaries to separate dir by @drmingdrmer in #15803
- refactor: adaptive entries count for `AppendEntriesRequest` by @drmingdrmer in #15805
- refactor: add meta-meta compat test by @drmingdrmer in #15823
- refactor: Migrate catalog info into table info by @Xuanwo in #15857
- refactor(query): refactor merge into pipeline by @Freejww in #15891
- refactor(query): remove async for query binder by @zhang2014 in #15900
- refactor: CatalogInfo has been stored in TableInfo by @Xuanwo in #15902
- refactor: Make CatalogCreator Accept Arc by @Xuanwo in #15906
- refactor: simplify adding transform to pipeline by @dantengsky in #15873
- refactor: improve modulo predicate selectivity by @xudong963 in #15917
- refactor: optimize gen_columns_statistics() for scalar. by @youngsofun in #15909
- refactor: optimize building bloom index for scalar. by @youngsofun in #15910
- refactor: attach table read only by default by @dantengsky in #15922
- refactor(cluster): refactor flight actions and add flight secret by @zhang2014 in #15930
- refactor: speed up ColumnBuilder::repeat(Scalar::Null). by @youngsofun in #15939
- refactor: Bump arrow to 52 by @Xuanwo in #15943
- refactor: fuse statistics table function by @xudong963 in #15954
- refactor: optimize OrcChunkReader. by @youngsofun in #15967
- refactor: Use iceberg-rust to replace icelake by @Xuanwo in #15951
- refactor: unify transaction related code by @SkyFan2002 in #15966
- refactor: rename metrics & tweak system.caches table by @dantengsky in #15996
- refactor: upgrade QuotaMgr to using protobuf by @forsaken628 in #15858
- refactor(query): use shuffle on distributed merge into by @Freejww in #15946
- refactor: add history record for queries_profiling table by @dqhl76 in #16014
- refactor: optimize take_by_slices_limit_from_blocks by @forsaken628 in #15978
- refactor(query): add crash handler for databend-query by @zhang2014 in #16054
- refactor: Simplify HashMethod.build_keys_state function signature by replacing &[(Column, DataType)] with InputColumns by @forsaken628 in #16050
- refactor: meta-service wait at most 1 second to shutdown by @drmingdrmer in #16104
- refactor(query): Optimize aggregate function arg_min_max by @forsaken628 in #16109
- refactor(query): If table_id is specified, use it directly by @TCeason in #16098
- refactor: upgrade to Openraft-0.10.0-alpha.2 by @drmingdrmer in #16091
- refactor: Bump OpenDAL to 0.48.0 by @Xuanwo in #16147
- refactor: tweak `table_statistics` of trait Table by @dantengsky in #16152
- refactor: add retry logic for flight service by @dqhl76 in #16097
- refactor: refactor ShareMeta struct by @lichuang in #16100
- refactor(query): Optimize udf js runtime to avoid lock blocking by @b41sh in #16174
- refactor(query): if projection not query stat/owner will not get table_stats/ownership when access system.tables by @TCeason in #16183
- refactor: improve network error handling. by @drmingdrmer in #16194
- refactor(query): remove useless meter by @zhang2014 in #16202
- refactor(query): remove useless hash builder by @zhang2014 in #16203
- chore: remove unused minmax by @zhyass in #16197
- refactor: display unix timestamp for human by @drmingdrmer in #16212
- refactor: support unset session|global by @TCeason in #16214
- refactor: Use native-tls as default by @Xuanwo in #16199
- refactor: fuse table funcs by @dantengsky in #16149
- refactor(query): use bigint to handle the fallback of decimal op overflow by @sundy-li in #16215
- refactor: CrudMgr::update should return only when the value is changed by @drmingdrmer in #16225
Build/Testing/CI Infra Changes 🔌
- ci: code reuse and get rid of streaming load. by @youngsofun in #15811
- ci: test load parquet unloaded by databend. by @youngsofun in #15813
- ci: longer timeout for test_stateless_standalone. by @youngsofun in #15936
- ci: Build runner for nightly-2024-07-02 by @Xuanwo in #16033
Others 📒
- chore: recluster final ignore error by @zhyass in #15815
- chore: reduce recluster depth threshold by @zhyass in #15819
- chore(query): remove redundant code in merge into by @Freejww in #15812
- chore: normalize compat test script names by @drmingdrmer in #15817
- chore: fix scripts in meta-meta compatibility test by @drmingdrmer in #15826
- chore(ci): add reporter for meta chaos test by @everpcpc in #15829
- chore: bump arrow-udf by @sundy-li in #15836
- chore(query): remove addRowNumber if source is physical table by @Freejww in #15828
- chore(query): show view just display view name by @TCeason in #15841
- chore: Fix build of arrow-udf by @Xuanwo in #15846
- chore: change meta chaos io delay param by @lichuang in #15852
- chore(ci): fix duplicate failure artifact name by @everpcpc in #15854
- chore: disable recursive cte test for mysql by @xudong963 in #15853
- chore(query): add retry logs by @sundy-li in #15862
- chore: refactor name and match expression by @lichuang in #15863
- chore: update logcall by @andylokandy in #15831
- chore: install gdb for query service image by @everpcpc in #15875
- chore(query): string_to_date/ts should not ignore error by @TCeason in #15878
- chore: fix recursive cte hang by @xudong963 in #15883
- chore(executor): avoid lock poisoning for pipeline executor by @zhang2014 in #15887
- chore: add log to show if enable distributed optimization by @xudong963 in #15890
- chore: print the select plan of insert query by @xudong963 in #15892
- chore(query): improve error msg by @sundy-li in #15874
- chore: replace patched codespan-reporting with rspack-codespan-reporting by @andylokandy in #15894
- chore(query): udf allow list only judge host match by @sundy-li in #15896
- chore: change meta chaos io delay param by @lichuang in #15897
- chore: Bump minitrace for new opentelemetry support by @Xuanwo in #15700
- chore(ci): switch runner to aws by @everpcpc in #15903
- chore: improve spiller buffer by @xudong963 in #15904
- chore: copy into default value using `RemoteExpr` by @b41sh in #15893
- chore: update domain for datasets & benchmark by @everpcpc in #15920
- chore: remove open sharding binary and test by @lichuang in #15923
- chore: add test case of issue 15791 by @lichuang in #15931
- chore(storage): limit compact num when recluster by @zhyass in #15926
- chore: fix param typo in function process_left_or_full_join_null_block by @lichuang in #15933
- chore(planner): refactor join equi conditions by @Dousir9 in #15924
- chore(planner): refine join_order_changed check in merge into by @Dousir9 in #15934
- chore: remove streaming_load api. by @youngsofun in #15935
- chore(query): readd some tests by @sundy-li in #15940
- chore: add issue case for pr15941 by @xudong963 in #15945
- chore(query): enable none lazy pruner in lazy read by @sundy-li in #15942
- chore(docs): update domain for databend.rs by @everpcpc in #15959
- chore: Bump hive_metastore to 0.1.0 by @Xuanwo in #15976
- chore: move display_ident() to ast by @andylokandy in #15980
- chore(query): show user functions should display built-in udfs by @TCeason in #15990
- chore(query): fix cluster ci failure if set license by @zhang2014 in #15999
- chore(ci): add license for cluster test by @everpcpc in #15995
- chore: update nom-rule by @andylokandy in #16009
- chore: orc and parquet use option missing_field_as. by @youngsofun in #16007
- chore: add repo link in databend-common-ast lib by @lewiszlw in #16010
- chore(query): disallow agg/window/udfcall in insert expr by @sundy-li in #16015
- chore: remove unused ShareEndpointManager and ShareTableConfig by @lichuang in #16013
- chore: add `errors` column and more tests for `system.queries_profiling` table by @dqhl76 in #16029
- chore(query): add full expr in error message by @sundy-li in #16032
- chore: temporarily disable a flaky test by @dqhl76 in #16042
- chore: add privilege checking for `fuse_amend` by @dantengsky in #16045
- chore: add `node` column to table `system.malloc_stats_totals` by @dantengsky in #16043
- chore(query): enable distributed merge into by default by @Dousir9 in #16059
- chore(ci): setup license with mask by @everpcpc in #16061
- chore(query): information_schema.tables support display view engine by @TCeason in #16058
- chore: fix share test integration missing query binary by @lichuang in #16071
- chore: Bump jsonb version 0.4.1 by @b41sh in #16073
- chore(query): better error msg in check_function by @sundy-li in #16075
- chore(query): fix merge into virtual computed field by @Dousir9 in #16086
- chore: update rust toolchain to nightly-2024-07-02 by @andylokandy in #16026
- chore(code): update the version of object_store in Cargo.lock by @Dousir9 in #16110
- chore(query): disable inner columns in prewhere by @Dousir9 in #16108
- chore(query): add sigaltstack for signal handler by @zhang2014 in #16122
- chore(query): limit frames size for capture backtrace by @zhang2014 in #16118
- chore(query): remove useless box processor impl by @zhang2014 in #16128
- chore: resolve lints by @andylokandy in #16127
- chore: fake time test by @dantengsky in #16130
- chore: resolve lints (part 2) by @andylokandy in #16133
- chore(http): exposes max_running_query_executed_time in /v1/status by @flaneur2020 in #16131
- chore(query): optimize tables query speed by @TCeason in #16144
- chore(ci): add buildkitd config & fix typos by @everpcpc in #16151
- chore: reset table_lock_expire_secs default value by @zhyass in #16154
- chore(query): move udf admin api to management mode by @zhang2014 in #16155
- chore: improve parser by @andylokandy in #16156
- chore: reduce the sleep range during lock holder by @zhyass in #16158
- chore(ci): split internal test sql by @TCeason in #16167
- chore(query): revert 16097 by @zhang2014 in #16171
- chore(ci): reset table lock expire in sqllogic by @zhyass in #16160
- chore(query): skip empty block for udf script by @zhang2014 in #16182
- chore(ci): disable musl build by @everpcpc in #16205
- chore: refine TimedFutre wrapper by @drmingdrmer in #16211
- chore(query): change stream table is_local to false by @Dousir9 in #16208
- chore(ci): try fix main ci err by @TCeason in #16224
- chore: refine meta-service by @drmingdrmer in #16229
- chore(planner): fix distributed merge into by @Dousir9 in #16228
New Contributors
Full Changelog: v1.2.530...v1.2.609