v0.8.0-nightly

Pre-release
@Xuanwo Xuanwo released this 18 Aug 10:06
· 15408 commits to main since this release
965f01a

Databend 0.8.0 is out! 🚀 🚀 🚀
Thank you to everyone for the work over the past 5 months!

Significant improvements

New Planner: JOIN! JOIN! JOIN!

To better support complex SQL queries and improve the user experience, Databend v0.8 introduces a new Planner framework.

Databend has added JOIN and proper subquery support, driven by New Planner.

select vip_info.Client_ID, vip_info.Region
    from vip_info
    right join purchase_records
    on vip_info.Client_ID = purchase_records.Client_ID;

New Parser: The Best Parser!

While refactoring the Planner, the Databend community implemented a new nom-based Parser that balances development efficiency with user experience.

The new Parser makes it easy for developers to design, develop, and test complex SQL syntax in an intuitive way:

COPY
    ~ INTO ~ #copy_unit
    ~ FROM ~ #copy_unit
    ~ ( FILES ~ "=" ~ "(" ~ #comma_separated_list0(literal_string) ~ ")")?
    ~ ( PATTERN ~ "=" ~ #literal_string)?
    ~ ( FILE_FORMAT ~ "=" ~ #options)?
    ~ ( VALIDATION_MODE ~ "=" ~ #literal_string)?
    ~ ( SIZE_LIMIT ~ "=" ~ #literal_u64)?

It also gives the user specific and precise information about where an error occurred:

MySQL [(none)]> select number from numbers(10) as t inner join numbers(30) as t1 using(number);
ERROR 1105 (HY000): Code: 1065, displayText = error:
  --> SQL:1:8
  |
1 | select number from numbers(10) as t inner join numbers(30) as t1 using(number)
  |        ^^^^^^ column reference is ambiguous

No more guessing what is wrong with your SQL.
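
As a rough illustration of how a source span can be turned into a message like the one above, here is a minimal Python sketch. It is not Databend's implementation (Databend is written in Rust); the function name `render_span_error` and its offset-based signature are assumptions made for this example only.

```python
def render_span_error(sql: str, start: int, end: int, message: str) -> str:
    """Render `sql` with carets under the character span sql[start:end].

    Hypothetical helper for illustration; `start` and `end` are 0-based
    character offsets into `sql`, with `end` exclusive.
    """
    # Locate the line that contains the start of the span.
    line_no = sql.count("\n", 0, start) + 1
    line_start = sql.rfind("\n", 0, start) + 1
    line_end = sql.find("\n", start)
    if line_end == -1:
        line_end = len(sql)
    line = sql[line_start:line_end]

    col = start - line_start + 1                 # 1-based column
    width = max(1, min(end, line_end) - start)   # clamp carets to this line
    gutter = " " * len(str(line_no))             # aligns with the line number

    return "\n".join([
        f"{gutter} --> SQL:{line_no}:{col}",
        f"{gutter} |",
        f"{line_no} | {line}",
        f"{gutter} | {' ' * (col - 1)}{'^' * width} {message}",
    ])


query = ("select number from numbers(10) as t "
         "inner join numbers(30) as t1 using(number)")
first = query.index("number")  # offset of the ambiguous column reference
print(render_span_error(query, first, first + len("number"),
                        "column reference is ambiguous"))
```

The key design point is that the parser keeps the character offsets of every token, so a later stage (here, name resolution) can point back at the exact text that caused the problem.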

Visit The New Databend SQL Planner for more information.

New Features

In addition to the newly designed Planner, the Databend community has implemented a number of new features.

COPY Enhancement

COPY capabilities have been greatly enhanced, and Databend can now:

  • Copy data from any supported storage service (even https!)
    COPY 
        INTO ontime200 
        FROM 'https://repo.databend.rs/dataset/stateful/ontime_2006_[200-300].csv' 
        FILE_FORMAT = (TYPE = 'CSV')
  • Copy compressed files
    COPY 
        INTO ontime200 
        FROM 's3://bucket/dataset/stateful/ontime.csv.gz' 
        FILE_FORMAT = (TYPE = 'CSV' COMPRESSION=AUTO)
  • UNLOAD data to any supported storage service
    COPY 
        INTO 'azblob://bucket/'  
        FROM ontime200
        FILE_FORMAT = (TYPE = 'PARQUET')

Hive Support

Databend v0.8 introduces Multi Catalog and implements Hive Metastore support on top of it!

Databend can now connect directly to Hive and read data from HDFS.

select * from hive.default.customer_p2 order by c_nation;

Time Travel

Some time ago, the Databend community shared the design of the underlying FUSE Engine in From Git to Fuse Engine. One of its most important features is support for time travel, which allows us to query a table as it was at any point in time.

Starting from v0.8, this feature is officially available, and we can now:

  • Query a table as of a specified time
    -- Travel to the time when the last row was inserted
    select * from demo at (TIMESTAMP => '2022-06-22 08:58:54.509008'::TIMESTAMP); 
    +----------+
    | c        |
    +----------+
    | batch1.1 |
    | batch1.2 |
    | batch2.1 |
    +----------+
  • Recover tables that were dropped by mistake
    DROP TABLE test;
    
    SELECT * FROM test;
    ERROR 1105 (HY000): Code: 1025, displayText = Unknown table 'test'.
    
    -- un-drop table
    UNDROP TABLE test;
    
    -- check
    SELECT * FROM test;
    +------+------+
    | a    | b    |
    +------+------+
    |    1 | a    |
    +------+------+

Keep your business data safe!
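
The idea behind querying a table "at" a timestamp can be sketched with a toy model: every commit produces an immutable snapshot, and a time-travel query reads the latest snapshot committed at or before the requested time. The Python below is only an illustration under that assumption. `VersionedTable`, `commit`, and `at` are hypothetical names, and the code does not reflect how the FUSE Engine actually stores snapshots.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class VersionedTable:
    """Toy append-only table that keeps one immutable snapshot per commit."""
    timestamps: list = field(default_factory=list)  # commit times, ascending
    snapshots: list = field(default_factory=list)   # rows visible after each commit

    def commit(self, ts: float, new_rows: list) -> None:
        """Append rows and record the resulting table state at time `ts`."""
        if self.timestamps and ts <= self.timestamps[-1]:
            raise ValueError("commit timestamps must be strictly increasing")
        previous = self.snapshots[-1] if self.snapshots else ()
        self.timestamps.append(ts)
        self.snapshots.append(tuple(previous) + tuple(new_rows))

    def at(self, ts: float) -> tuple:
        """Return the rows visible at time `ts`."""
        # Index of the first commit strictly after `ts`; the one before it
        # is the latest snapshot committed at or before `ts`.
        idx = bisect_right(self.timestamps, ts)
        if idx == 0:
            raise LookupError("no snapshot exists at or before the requested time")
        return self.snapshots[idx - 1]


demo = VersionedTable()
demo.commit(1.0, ["batch1.1", "batch1.2"])
demo.commit(2.0, ["batch2.1"])
demo.commit(3.0, ["batch3.1"])

# "Travel" to a moment after the second commit but before the third.
print(demo.at(2.5))
```

Because old snapshots are never modified, reading the past is just a lookup; the cost is that old data has to be retained until it falls outside the retention window.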

CTE Support

A CTE (Common Table Expression) is a frequently used feature in OLAP workloads. It defines a temporary result set that is valid only for the duration of a single statement, which allows query fragments to be reused, improves readability, and makes complex queries easier to write.

Databend v0.8 re-implements CTEs on top of the new Planner, and users can now happily use WITH to declare them.

WITH customers_in_quebec 
     AS (SELECT customername, 
                city 
         FROM   customers 
         WHERE  province = 'Québec') 
SELECT customername 
FROM   customers_in_quebec
WHERE  city = 'Montréal' 
ORDER  BY customername; 

In addition to the features mentioned above, Databend v0.8 also supports UDFs, adds DELETE statements, and further enhances support for semi-structured data types, not to mention the numerous SQL statement improvements and new methods added. Thanks to all the contributors to the Databend community; without you, none of the new features mentioned here would have been possible!

Quality Enhancement

Feature implementation is just the first part of product delivery. In Databend v0.8, the community introduced the concept of engineering quality, which evaluates the quality of Databend development in three dimensions: users, contributors, and community.

Reassuring users

So that users can use Databend with confidence, the community has added many tests over the last three months: fetching stateless test sets from YDB and others, adding stateful tests for ontime, hits, and other datasets, bringing SQL Logic Test online to cover all interfaces, and enabling SQL Fuzz testing to cover boundary cases.

Furthermore, the community has launched Databend Perf, which continuously tests Databend's performance in production environments to catch unexpected performance regressions early.

Make contributors comfortable

Databend is a large Rust project that has been criticized by the community for its build time.
To address this and make contributors feel comfortable, the community brought online a highly configurable, specially tuned Self-hosted Runner to perform integration tests for PRs, and enabled several services and tools such as Mergify, mold, and dev-tools to optimize the CI process.
We also initiated a new plan to restructure the Databend project, splitting the original huge query crate into multiple sub-crates so that, as far as possible, changing one line of code no longer means waiting five minutes for a check to finish.

Keeping the community happy

Databend is a contributor to and participant in the open source community. During the development of v0.8, the Databend community established the principle of Upstream First: actively following and adopting the latest upstream releases, giving feedback on known bugs, contributing patches, and opening Tracking issues of upstream first violation to keep up with the latest developments.
The Databend community is actively exploring integration with other open source projects and has already implemented integration and support for third-party drivers such as Vector, sqlalchemy, clickhouse-driver, etc.


Next Steps

Databend v0.8 is a solid foundation release with a new Planner that makes it easier to implement features and make optimizations. In version 0.9, we expect improvements in the following areas.

  • Query Result Cache
  • JSON Optimization
  • Table Share
  • Processor Profiling
  • Resource Quota
  • Data Caching

Please check the Release proposal: Nightly v0.9 for the latest news~


Detailed Changes

What's Changed

Exciting New Features ✨

  • feat(data type): Support Semi-structured array, object data type by @b41sh in #4571
  • Basic clickhouse REST handler by @youngsofun in #4613
  • ISSUE-4558: Add check_json function by @kevinw66 in #4606
  • feat(processor): support pushing executor by @zhang2014 in #4625
  • feature: support REGEXP_INSTR function by @nange in #4629
  • feat(processor): support complete executor by @zhang2014 in #4639
  • feature: support logical view by @Veeupup in #4628
  • Feature: add state machine range and subscribe API by @lichuang in #4608
  • Feature: support information_schema database by @Veeupup in #4672
  • Feature: add range-map module by @lichuang in #4669
  • Support show grants for role by @junnplus in #4700
  • support show full tables by @TCeason in #4702
  • feat(io): refactor type deserialization by @sundy-li in #4634
  • feat(function): Support Semi-structured function GET/GET_IGNORE_CASE/GET_PATH by @b41sh in #4684
  • support query: show table status by @TCeason in #4757
  • feature: support REGEXP_SUBSTR function by @nange in #4771
  • feat(mysql): sqlalchemy execute work by @BohuTANG in #4774
  • feat(function): Support Semi-structured access elements by @b41sh in #4780
  • feat(function): Support cast variant to other data types by @b41sh in #4787
  • query(expressions): add try_cast function by @sundy-li in #4794
  • improve: pass parameter from query to functions by @Veeupup in #4805
  • support parse select expr as alias_name by @TCeason in #4841
  • Compatible: show schemas; synonym for show databases by @wubx in #4824
  • feat: add timezone session settings by @Veeupup in #4852
  • Support Keyword DATABASE synonym SCHEMA by @TCeason in #4855
  • Feature: Watch api by @lichuang in #4779
  • feature: metasrv has to be compatible with 20220413-34e89c99e4e35632718e9227f6549b3090eb0fb9 by @drmingdrmer in #4901
  • feat: create user if not exists on JWT authenticate by @junnplus in #4924
  • Refine new planner framework by @leiysky in #4895
  • feat(parser): add select statement by @andylokandy in #4941
  • chore(mysqldump): support mysqldump dump schema by @BohuTANG in #4972
  • ISSUE-4964: export data_compressed_size to system.tables by @dantengsky in #4966
  • feature: support REGEXP_REPLACE function by @nange in #4944
  • feat(datavalues): add pop_data_value in MutableColumn and TypeDeserializer by @ygf11 in #4977
  • chore(show): add snapshot_location back to show create table by @BohuTANG in #4979
  • feat(functions): support aggregate function retention by @fkuner in #4970
  • feat(data type): implement PartialOrd for Variant data-type by @b41sh in #4959
  • feat(parser): switch to the new sql parser by @andylokandy in #4983
  • feat(planner): Add switch to enable new planner by @leiysky in #4989
  • feat(datatype): datatype timestamp with precision by @Veeupup in #4997
  • feat(function): Support Semi-structured function json_extract_path_text by @b41sh in #4992
  • feature(types): function in support other datatypes by @fkuner in #5011
  • feat: Implement azblob support by @Xuanwo in #5025
  • feat(parser): implement insert statement by @andylokandy in #5029
  • feat(test): new sql logic test framework RFC by @ZhiHanZ in #5048
  • feat: memory profiling by @dantengsky in #5050
  • Feature: transaction api by @lichuang in #5030
  • feat(planner): Support select operator in new planner framework by @leiysky in #5059
  • use jwtk for es512 by @junnplus in #5062
  • Feature: add leave node API by @lichuang in #5069
  • feat(planner): implement aggregate operator in new planner framework by @xudong963 in #5027
  • feat(data type): variant add alias json, object add alias map by @b41sh in #5099
  • feat(parser): support more statements by @andylokandy in #5089
  • Introduce a helper ExpressionEvaluator to simplify expression evaluation by @leiysky in #5108
  • feat(planner): support more aggregate syntax by @xudong963 in #5115
  • feat(planner): Support TableFunction in new planner by @leiysky in #5135
  • feat: Add scalar function humanize by @cadl in #5073
  • feat(planner): Refine Scalar with enum_dispatch and support more scalar expressions by @leiysky in #5162
  • feat(fuse): add system$fuse_segment function by @BohuTANG in #5172
  • Feature: impl upsert_table_option with kv-txn by @lichuang in #5183
  • Feature: impl get_table_by_id with kv-txn by @lichuang in #5185
  • Add cluster key statistics in block meta by @zhyass in #5194
  • feat(planner): support having and scalar expression in group by for new planner by @xudong963 in #5200
  • feat(group_by): support two-level hashmap by @fkuner in #5075
  • feat(query): support timezone by @Veeupup in #4878
  • add access check for management mode by @junnplus in #5211
  • feat(planner):integrate the stateless test for the new planner's aggregation by @xudong963 in #5204
  • feat: Introduce opendal 0.6 and enable retry support by @Xuanwo in #5216
  • feat(planner): Implement hash inner join by @leiysky in #5175
  • feat(data type): ArrayType support inner dataType by @b41sh in #5049
  • feat: Add HDFS support by @Xuanwo in #5245
  • feat(planner): select without from by @Veeupup in #5256
  • feat(planner): support order by in new planner by @xudong963 in #5253
  • feat(format): support parquet input format by @zhang2014 in #5271
  • Feature: multi-catalog by @dantengsky in #4947
  • feature(planner): Support subqueries in new planner by @leiysky in #5283
  • feat(planner): display error with source span by @andylokandy in #5290
  • Feature: add user common types to pb impl by @lichuang in #5289
  • feat(function): support length function for Array & Array by @fkuner in #5274
  • Feature: user api pb convert impl by @lichuang in #5296
  • feat(function): Support generic Array access elements by index by @b41sh in #5244
  • impl alter database rename by @TCeason in #5286
  • store endpoints to metasrv and use balance endpoints grpc connection channel by @ariesdevil in #4987
  • feat(parser): add span for expression by @andylokandy in #5309
  • feat(planner): support limit for new planner by @fkuner in #5301
  • Feature: add metrics in metasrv by @lichuang in #5208
  • RFC: Config Backward Compatibility by @Xuanwo in #5324
  • feat(planner): support map access expression by @andylokandy in #5358
  • feature(planner): Support some scalar expressions in new planner by @leiysky in #5362
  • feature(planner): Support context function in new planner by @leiysky in #5369
  • feat(planner): support subquery table reference type for new planner by @xudong963 in #5279
  • feat(function): Support connection_id function by @b41sh in #5381
  • Feature: add more metrics in metasrv by @lichuang in #5376
  • feat(function): change retention return type from Variant to Array by @fkuner in #5302
  • feat(metasrv): add metrics to http service by @RinChanNOWWW in #5389
  • Feat(httphandler): result download by @youngsofun in #5395
  • support tenant quota by @junnplus in #5406
  • Feature: Metasrv metrics by @lichuang in #5420
  • feat(planner): support using and natural for join by @xudong963 in #5423
  • feat(function): support date_add for new parser by @fkuner in #5419
  • Feature: Support DISTINCT in new planner by @ygf11 in #5410
  • feature(format): refactor output format by @sundy-li in #5422
  • feature(planner): Enhance GROUP BY semantic check by @leiysky in #5431
  • MySQL Handler Kill Query by @TCeason in #5448
  • feat(function): support object_keys function by @fkuner in #5461
  • feat(function): Support compare variant with other data types by @b41sh in #5463
  • feat(fuse): add system$clustering_information function by @zhyass in #5426
  • chore: Towards the next nightly by @Xuanwo in #5478
  • feat(table statistics): add statistics to TableMeta by @dantengsky in #5476
  • feature(planner): Translate subquery into apply operator by @leiysky in #5510
  • feature(planner): Common tree structure formatter for plan display by @leiysky in #5512
  • feat(function): Support variant max/min functions by @b41sh in #5525
  • feat: snapshot timestamp & navigation by @dantengsky in #5535
  • Feature: Support TRIM function in new planner by @ygf11 in #5541
  • Feature: add metasrv time travel functions by @lichuang in #5468
  • feat(function): Support variant as function by @b41sh in #5442
  • feat(planner):support array literal in new planner by @xudong963 in #5551
  • Feature: add metasrv time travel functions, add more unit test and re… by @lichuang in #5566
  • feat(httphandler): support download with formats. by @youngsofun in #5568
  • Feat : undrop table & show history by @dantengsky in #5562
  • add func user() by @TCeason in #5584
  • Feature: get_{db|table}_history ignore the data out of retention date by @lichuang in #5597
  • feature(executor): Support correlated subquery by @leiysky in #5593
  • feature(query): copy into stage support by @sundy-li in #5579
  • add system.stages table and show stages by @junnplus in #5581
  • feat(planner): support explain for new planner by @xudong963 in #5587
  • feat: data time travel "select at" by @dantengsky in #5617
  • feat(planner): support position function by @xudong963 in #5618
  • feature(planner): Support tuple in new planner by @leiysky in #5640
  • feat(parser): implment string unescape by @andylokandy in #5638
  • feature(planner): Support read from view in new planner by @leiysky in #5652
  • fix(parser): allow mysql-style hex number and single-item array by @andylokandy in #5654
  • [metasrv] feature: exchange protocol version with client by @drmingdrmer in #5645
  • add call function system$search_tables by @junnplus in #5663
  • feat: Add decompress support for COPY INTO and streaming loading by @Xuanwo in #5655
  • feat(query): Add support for compression auto and raw_deflate by @Xuanwo in #5669
  • feat(query): add call stats functions by @everpcpc in #5646
  • feat(query): Fail fast if the underlying storage is not available by @Xuanwo in #5671
  • feat(function): Support variant order by by @b41sh in #5668
  • feature(query): support FixedKey u128, u256, u512 in group query by @sundy-li in #5678
  • feat(function): Support variant group by by @b41sh in #5694
  • feature(executor): Introduce ScalarEvaluator to evaluate Scalar by @leiysky in #5689
  • Feature: add mock module, add test of out retention time data by @lichuang in #5707
  • feat(planner): support cross join by @xudong963 in #5715
  • feat(meta): record count of tables for a tenant in KV space by @RinChanNOWWW in #5708
  • feature(planner): Introduce InterpreterFactoryV2 for new planner by @leiysky in #5729
  • Add cluster name to metasrv and meta-client by @devillove084 in #5740
  • add gc out of drop retention time data schema API and unit tests by @lichuang in #5746
  • feat(meta): Output [binary] in debug message instead by @Xuanwo in #5752
  • Remove the init_cluster parameter in open_create_boot by @devillove084 in #5754
  • feature(query): support errorcode hint in new planner by @sundy-li in #5756
  • feat: undrop database by @LiuYuHui in #5770
  • feature(planner): Support CREATE TABLE statement in new planner by @leiysky in #5771
  • [meta] feature: protobuf message has to persist MIN_COMPATIBLE_VER in it by @drmingdrmer in #5785
  • fix(query): Don't load credential while reading stage by @Xuanwo in #5783
  • feat(query): add function NULLIF by @RinChanNOWWW in #5772
  • feat(query): alter table cluster key by @zhyass in #5718
  • feat(planner): support udf by @xudong963 in #5751
  • feature(planner): Support system SHOW statements by @leiysky in #5800
  • Remove subscribe_metrics todos. by @devillove084 in #5801
  • feature(planner): Migrate CREATE USER statement to new planner by @TCeason in #5802
  • feat(query): support async insert mode to improve throughput by @fkuner in #5567
  • move Clickhouse HTTP handler to its own port. by @youngsofun in #5797
  • feat(planner): support create view in new planner by @b41sh in #5816
  • feature(planner): Migrate DROP USER statement to new planner by @TCeason in #5813
  • feature(planner): Migrate ALTER USER statement to new planner by @TCeason in #5823
  • feat(planner): Support create database in new planner by @ygf11 in #5804
  • feat(stage): remove directory recursive in drop internal stage by @fkuner in #5809
  • Feature: add metrics about metasrv network and docs by @lichuang in #5842
  • feat(query): support drop cluster key by @zhyass in #5835
  • support function timezone() by @TCeason in #5840
  • feat(planner): support drop database in new planner by @ygf11 in #5846
  • Add DDL STAGE for new planner framework by @sundy-li in #5821
  • feat: Window func by @doki23 in #5401
  • Feature: add benchmark scripts of metasrv by @lichuang in #5865
  • fix(planner): support semi and anti join in new planner by @xudong963 in #5869
  • feat(stage): remove stage files by @Kikkon in #5788
  • federated query:show collation, show charset, timediff func by @TCeason in #5868
  • feat(planner): support rename database in new planner by @ygf11 in #5887
  • feature(optimizer): Refine optimizer framework by @leiysky in #5877
  • feat(query): support parse exponential notation data values by @b41sh in #5902
  • feat(planner): support alter view in new planner by @b41sh in #5862
  • clickhouse http handler support TsvWithNamesAndTypes. by @youngsofun in #5898
  • feat(query): new table engine: RANDOM. by @RinChanNOWWW in #5896
  • feature(optimizer): Support predicate push down through join by @leiysky in #5914
  • feat(query): move global configs to settings by @fkuner in #5850
  • feat(planner): support drop view in new planner by @b41sh in #5920
  • feature(optimizer): Support constant folding by @leiysky in #5924
  • feat(planner): support set operators in parser and planner by @xudong963 in #5833
  • feat: clickhouse http api support compress by @youngsofun in #5934
  • Feature: add DeleteByPrefix API in txn::op by @lichuang in #5936
  • feat(data type): support struct data type by @b41sh in #5940
  • feat(columns): add some cols for system.columns by @TCeason in #5946
  • query(storage/s3): Add enable virtual host style support by @Xuanwo in #5976
  • add tenant header for http handler by @junnplus in #5985
  • Feature: add meta grpc client network metrics by @lichuang in #5978
  • feature(planner): support table statements in new planner by @andylokandy in #5907
  • feature(planner): support left outer join and right outer join by @xudong963 in #5972
  • feature(planner): support show statements in new planner by @andylokandy in #6013
  • handler(clickhouse): ck http handler supports settings by @fkuner in #5945
  • Migrate show users/roles statement to new planner by @junnplus in #6016
  • feat: transient fuse table by @dantengsky in #5968
  • feat(planner): support ORDER BY column ordinal by @xudong963 in #6028
  • feat: export JAVA_HOME in dev_setup.sh by @dantengsky in #6029
  • support ifnull feature by @yuuch in #5921
  • feat(ci): export JAVA_HOME/lib/server as LD_LIB_PATH by @dantengsky in #6034
  • feat(script): generate tpch data set by @xudong963 in #6024
  • feat: improve clickhouse output format by @youngsofun in #6027
  • ISSUE-5829: Support field comment. by @RinChanNOWWW in #5952
  • feature(parser): stop parsing at insert statement by @andylokandy in #6048
  • feat: support date_sub by @PsiACE in #6050
  • feature(query): add multi_if function by @sundy-li in #6039
  • feature(planner): Support query log for new planner by @leiysky in #6053
  • feature(functions): add function to_nullable and assume_not_null by @sundy-li in #6055
  • feature(optimizer): Decorrelate EXISTS subquery by @leiysky in #6051
  • feat(planner) migrate grant to planner v2 by @TCeason in #6049
  • feat(query): Configurable repr of float denormals in output formats. by @youngsofun in #6065
  • Feature: add meta service network metrics and http health checker by @lichuang in #6071
  • feature(optimizer): Support push down filter through cross apply by @leiysky in #6079
  • feat(planner): migrate revoke to planner v2 by @TCeason in #6066
  • ISSUE-4471 integrate cluster select query with new processor by @zhang2014 in #4544
  • Clickhouse http handler support set database. by @youngsofun in #6097
  • refactor(query/planner): Migrate COPY to new planner by @Xuanwo in #6074
  • feat(query): support SELECT ... FROM ... { AT TIMESTAMP } on planner_v2 by @cadl in #6056
  • Add funciton coalesce by @TianLangStudio in #5922
  • feat(planner): migrate insert statement to new planner by @fkuner in #5897
  • feat(planner): support non-equi conditions in hash join by @xudong963 in #6145
  • feat(query): support exists statement. by @youngsofun in #6166
  • [metasrv] feature: leave a cluster with databend-meta --leave.. by @drmingdrmer in #6181
  • [improving] store grpc addr to node info and auto refresh backends addrs for grpc client by @ariesdevil in #5495
  • feature(planner): Introduce serializable physical plan by @leiysky in #6191
  • feature(planner): Decorrelate EXISTS subquery with non-equi condition by @leiysky in #6232
  • order by sub stmt support db.table.col by @TCeason in #6196
  • feat: integration with sentry by @PsiACE in #6226
  • Feat: statement delete from... by @dantengsky in #5691
  • feat(planner): migrate CreateUDF to planner v2 by @TennyZhuang in #5905
  • feat(optimizer): rewrite predicate and accelerate tpch19 by @xudong963 in #6301
  • Feature: add import init cluster support by @lichuang in #6280
  • feat: add call procedure for sync stage by @junnplus in #6344
  • Migrate call statement to new planner by @junnplus in #6361
  • show query support format by @TCeason in #6366
  • feat: Implement "IS [NOT] DISTINCT FROM" operator" by @JialuGong in #6170
  • feat(ast): add span info for TableReference by @PragmaTwice in #6370
  • feat(storage): Improve optimize table compact by @zhyass in #6373
  • feat: show settings support like by @TCeason in #6394
  • feat: Add xz compression support by @Xuanwo in #6421
  • feat(query): support all JsonEachRowOutputFormat variants. by @youngsofun in #6434
  • feat(planner): Support qualified column name with database specified by @leiysky in #6444
  • feat: introduce system.tables_with_history by @dantengsky in #6435
  • feat(planner): support mark join, (not)in/any subquery, make tpch16 and tpch18 happy by @xudong963 in #6412
  • feat(parser): support any, all and some subquery by @xudong963 in #6438
  • feat(query): Support geo_to_h3 function by @ariesdevil in #6389
  • feat(rfc): Add Presign statement by @Xuanwo in #6503
  • feat(meta): refactor watch key range from [start,end] to [start,end) by @lichuang in #6499
  • feat(parser): Add Presign Statement by @Xuanwo in #6513
  • feat: Implement presign support by @Xuanwo in #6529
  • feat(query): migrate window function to new pipeline by @sundy-li in #6500
  • feat(function): add date_trunc function by @ariesdevil in #6540
  • feat(query): /v1/download support limit number of rows. by @youngsofun in #6546
  • feat(planner): support ALL and SOME subquery, mark join with non-equi condition, and make tpch q20 happy by @xudong963 in #6534
  • feat(format): add format diagnostic by @fkuner in #6530
  • feat(parser): support order by nulls first in parser by @GrapeBaBa in #6544
  • feat(query): format date/datetime/enum in query log by @everpcpc in #6575
  • feat: Allow COPY FROM/INTO different storage services by @Xuanwo in #6573
  • feat(setting): support global setting by @fkuner in #6579
  • feat(expr): add new crate common-expression by @andylokandy in #6576
  • feat(query): pretty format for explain by @jiaoew1991 in #6585
  • feat(expr): implement pretty print for Chunk by @andylokandy in #6597
  • feat: add {db,table}_id map to {(tenant,db_name), (db_id, table_name)} in metasrv. by @lichuang in #6607
  • feat: Allow create stage for different services by @Xuanwo in #6602
  • feat: add share metasrv ShareApi(create_share,drop_share) by @lichuang in #6582
  • feat(query): support insert zero date and zero datetime by @b41sh in #6592
  • feat: Allow COPY and CREATE STAGE from public read buckets without credentials by @Xuanwo in #6623
  • feat(hive): support read boolean, float, double, date, array columns by @sandflee in #6629
  • feat(query): add StageFileFormatType::Tsv. by @youngsofun in #6651
  • feat(expr): Implement domain calculation by @andylokandy in #6649
  • feat(planner): support create table as select in planner v2 by @yuuch in #6618
  • feat(expr): implement error report by @andylokandy in #6661
  • feat(expr): allow function to return runtime error by @andylokandy in #6662
  • feat: Implement new commands for databend-meta. by @RinChanNOWWW in #6559
  • feat: add share metasrv ShareApi {add|remove}_share_account by @lichuang in #6656

Thoughtful Bug Fix 🔧

  • docs: datetime functions docs by @Veeupup in #4611
  • bump opensrv for fix issue-4429 by @junnplus in #4626
  • ISSUE-4262: prohibits using reserved table option in create table statement. by @dantengsky in #4632
  • fix(datetime): fix wrong datetime64 comparision and cast by @sundy-li in #4656
  • [Cloud] keep environment variable immutable during reload by @ZhiHanZ in #4655
  • deps: Bump to OpenDAL v0.4 by @Xuanwo in #4678
  • Update function_doc_asset.rs by @sundy-li in #4694
  • query(values): support take for null column && update streaming load docs by @sundy-li in #4701
  • fix(mysql): bump mysql srv by @zhang2014 in #4735
  • fix(query): Create on existing dir returns unexpected error by @Xuanwo in #4747
  • fix(function): Const(Nullable(Object)) column downcast to Nullable failed by @b41sh in #4733
  • Fix schema display on http api by @flaneur2020 in #4751
  • ISSUE-4668: Enable Lz4Raw & rm parquet_format_async_temp by @dantengsky in #4726
  • fix(query): Fix support for endpoint without scheme by @Xuanwo in #4767
  • fix(function): try_cast from varaint return Null instead of Err by @b41sh in #4793
  • refactor(credits): try persist credits at build time by @PsiACE in #4791
  • fix: select * shouldn't return results by @xudong963 in #4796
  • query(plan): add default expression validations by @sundy-li in #4806
  • fix(mysql_handler): salt characters should be ascii. by @youngsofun in #4810
  • add env helper for http handler host by @junnplus in #4811
  • fix v1/config not updated after reload config by @junnplus in #4820
  • query(fuse): limit push down respect orders by @sundy-li in #4818
  • fix create database db.t not fail by @TCeason in #4833
  • [MetaSrv]rename table should keep table_id nochange by @ariesdevil in #4838
  • ISSUE-4847: Replace FactoryCreator with FactoryCreatorWithTypes for functions by @zhyass in #4688
  • feat(query): support group by datetimes & dates by @sundy-li in #4846
  • Select expr support use backquoted by @TCeason in #4857
  • Fix show grants from inherited role by @junnplus in #4873
  • fix(executor): fix dead loop for tasks by @zhang2014 in #4845
  • fix(function): validate function args before get type by @b41sh in #4888
  • fix(function): display correct field name for map-access and cast by @b41sh in #4812
  • ISSUE-4860: fix empty query by @cadl in #4894
  • fixes(group): fix group by with negative value by @zhang2014 in #4902
  • fixes(limit): fixes limit and offset with one block by @zhang2014 in #4907
  • feat(cast): fix the behavior of null to boolean by @sundy-li in #4911
  • feat(tests): add native mysql client uexpect by @sundy-li in #4956
  • [metactl] fix: by default do not export from a running meta node. by @drmingdrmer in #4991
  • fix: Poetry in tests not updated by @Xuanwo in #5024
  • fix(use database): report error when try to use an empty database. by @chowc in #4939
  • fix(parser): update golden test file by @andylokandy in #5037
  • bug-fix: make describe view table work by @Veeupup in #5045
  • fix(parser): allow to omit semicolon by @andylokandy in #5058
  • chore(query): manually drop the aggregate states to avoid memory leak by @sundy-li in #5056
  • hotfix(build): Revert "feat: add "instal_pgk thrift" to dev_setup.sh" by @dantengsky in #5085
  • fix doc parser by @jiahui-97 in #5088
  • fix(functions): manually drop state in function eval_aggr function by @sundy-li in #5080
  • bug: fix interval_function flaky test by @Veeupup in #5094
  • fix(functions): use drop guard to ensure the states dropped by @sundy-li in #5097
  • fix: clickhouse worker hang when interpreter fail to execute by @chowc in #5091
  • chore(planner): fix duplicate column name by @sundy-li in #5112
  • chore(base): disable backtrace by default by @sundy-li in #5127
  • fix: fix trim by @jiahui-97 in #5136
  • fix(planner): make aggregator work and add simple stateless tests by @xudong963 in #5165
  • fix: Handle exception display bug by @Chasen-Zhang in #5218
  • bugfix(planner): Fix wrong result of hash join when join keys have different types by @leiysky in #5222
  • fix(parser): show alternative tokens even if the branch is optional by @andylokandy in #5230
  • fix(ast): improve helper message for error by @andylokandy in #5239
  • fixes(format): fix string type csv truncate failure by @zhang2014 in #5243
  • bugfix(pipeline): Fix state machine of hash join by @leiysky in #5242
  • feat(format): add scan progress values by @zhang2014 in #5262
  • fix(planner): fix some cases in aggregator plan by @xudong963 in #5307
  • fix(storage/azblob): Azblob API uri not constructed correctly by @Xuanwo in #5316
  • Temporarily delete todo codes to fix panic bug by @ariesdevil in #5321
  • bugfix(executor): Fix wrong result of memory table engine by @leiysky in #5364
  • bugfix(parser): t.a should be a column ref by @andylokandy in #5370
  • fixes(format): fixes incorrect rows size for csv stream load by @zhang2014 in #5383
  • [meta] fix: query enables embedded meta only when meta.address is configured to be empty. Do not check endpoint. Some users do not have endpoints in their old config by @drmingdrmer in #5388
  • fixes(processor): fix server hang when processor panic by @zhang2014 in #5394
  • fix(datatypes): fix datetime from negative micro timestamp bug by @sundy-li in #5396
  • fixes(handler): fix clickhouse handler dead loop when error by @zhang2014 in #5412
  • fix(parser): wrong error code. by @youngsofun in #5414
  • fixes(insert): fix drop dispatcher when commit insert query by @zhang2014 in #5424
  • fix(data type): update arrow2 to fix array nullable write by @b41sh in #5429
  • fix(functions): make aggregate function sum/avg/min/max support null … by @sundy-li in #5436
  • fix(query): Fix test_query_log when RUST_BACKTRACE is not set by @Xuanwo in #5440
  • fix(function): fix retention aggregation coredump bug by @fkuner in #5450
  • fix(httphandler): req should return as soon as results is exhausted. by @youngsofun in #5462
  • fixes(processor): fix server hang when parallel execute query by @zhang2014 in #5482
  • fix(array): fix incorrect column meta of Array by @sundy-li in #5507
  • fix(scripts): deploy minio in k8s failed by @hantmac in #5526
  • fix(planner): Fix wrong result of aggregate in subquery by @leiysky in #5538
  • bugfix(executor): Fix incorrect context usage by @leiysky in #5539
  • fix(metasrv): Fix env config not loaded correctly by @Xuanwo in #5552
  • fix(function): fix object_keys array type by @b41sh in #5532
  • fix(scripts):deploy meta-service failed because of no pv definition by @hantmac in #5557
  • fix: sql logic test by @ZeaLoVe in #5578
  • fix: copy bug by @Chasen-Zhang in #5585
  • feat(query): Retry while meeting error during load_credential by @Xuanwo in #5590
  • moving role_cache_manager to session manager by @junnplus in #5605
  • fix(query): cluster keys take effect in cluster mode by @zhyass in #5608
  • fix(query): Deny the root login from other hosts by @chowc in #5588
  • fix: twitter card by @Chasen-Zhang in #5672
  • fix artifacts permissions by @junnplus in #5675
  • fix action missing shell by @junnplus in #5684
  • fix(user): fix 'create user if not exists' fail when user exists by @chowc in #5682
  • fix(query): fix AtString parser by @sundy-li in #5695
  • fix: Fix logic test by @ZeaLoVe in #5700
  • bugfix(planner): Fix LIMIT with offset by @leiysky in #5705
  • fix(parser): support cross join by @andylokandy in #5730
  • fix: ensure builtin roles on create user by jwt by @flaneur2020 in #5741
  • docs: Fix front matter for type_conversion.md by @Xuanwo in #5755
  • fix(query): Fix compressed buf not consumed correctly by @Xuanwo in #5727
  • fixes(executor): support abort for pipeline executor stream by @zhang2014 in #5803
  • fix(query): Fix read array(string) from fuse engine by @b41sh in #5792
  • hotfix: lz4raw compression of zero len buffer by @dantengsky in #5806
  • fix: remove incompatible modification of chunk compression by @dantengsky in #5817
  • fix: add test result file of stateless test drop all by @dantengsky in #5818
  • fixes(processor): fix server hang when sync work panic by @zhang2014 in #5814
  • feature(query): fix read quoted string by @sundy-li in #5870
  • fix(parser): don't consume extra tokens for expr by @andylokandy in #5890
  • fix(planner): correctly handle catalog in statements by @andylokandy in #5909
  • fix(insert): server panic when exceeds max active sessions by @fkuner in #5928
  • bugfix(planner): Fix correlated subquery with joins by @leiysky in #5947
  • fix status cause mysql client hang by @TCeason in #5961
  • bugfix: ProcessorExecutorStream lost data by @youngsofun in #5983
  • fix(function): fix incorrect return datatype of function if by @sundy-li in #5980
  • fix: Check all proto files if changed by @Xuanwo in #5998
  • Fix: native build with sse42 capability by @dantengsky in #6001
  • fix(csv): fix de_csv with escaped quoted by @sundy-li in #6008
  • fix: add_profile should return 0 if no items found in profile by @dantengsky in #6032
  • fix(planner): consider NULL for binary op in type checker by @xudong963 in #6043
  • bugfix(optimizer): Fix error of EXISTS subquery by @leiysky in #6073
  • fix(groupby): support group by constant string by @sundy-li in #6095
  • fix(parser): remove a wrong cut parser in subquery by @andylokandy in #6111
  • fix: Transient object storage IO operation fault handling #6045 by @dantengsky in #6085
  • revert meta /v1/health check by @lichuang in #6116
  • chore(sql): fix short sql by @sundy-li in #6135
  • fix(docker): fix certificates dependencies by @wfxr in #6136
  • fix(query): fix query_log incorrect event_time by @b41sh in #6142
  • fix(mysql_federated) mysql connector failed with Unknown variable: SQ… by @sandflee in #6149
  • fix(subquery): the order of children in hash join is reversed by @xudong963 in #6178
  • fixes(cluster): fix node id truncation when cluster id is escaped by @zhang2014 in #6193
  • test(query): add async insert test by @fkuner in #5964
  • query(functions): fix aggregate count incorrect state place by @sundy-li in #6218
  • [metasrv] fix: add step instruction to tests/metactl/test-metactl; by @drmingdrmer in #6227
  • bugfix(planner): Fix grouping check by @leiysky in #6219
  • Bump opendal to v0.9.1 by @Xuanwo in #6249
  • [metasrv] Fix flaky test #6242 by @ariesdevil in #6248
  • Fix output of to_datetime() by @youngsofun in #6252
  • [fix] update makefile's run options by @ClSlaid in #6287
  • Improve(deserializer): fix nullable csv deser by @sundy-li in #6300
  • fix: deletion of null values by @dantengsky in #6277
  • [session]: fix MySQL connection close_wait or fin_wait_2 by @ClSlaid in #6341
  • fix(planner): limit returns error result by @xudong963 in #6358
  • query(fix): fix hashmap memory leak by @sundy-li in #6354
  • fix(query): Add NestedCheckpointReader for input format parser by @youngsofun in #6385
  • fix(metasrv): openraft: wrong range when searching for membership entries by @drmingdrmer in #6408
  • fix(processor): show correctly progress in cluster mode by @zhang2014 in #6253
  • fix(query): escape record_delimiter when displaying by @Veeupup in #6417
  • fix(query): fix array inner type with null by @b41sh in #6407
  • fix(query): fix tsv deserialization by @sundy-li in #6453
  • fix(planner): Fix case of table alias by @leiysky in #6466
  • fix(build-tool): add c module include path for musl build-tool image by @everpcpc in #6472
  • fix(build-tool): add symlink for musl-gcc by @everpcpc in #6494
  • fix(jaeger): split package to send big package to jaeger by @sandflee in #6497
  • fix(cluster): fix cannot destroy thread in cluster mode by @zhang2014 in #6436
  • fix(processor): fix lost event in resize processor by @zhang2014 in #6501
  • fix: remove keyword table from func is_reserved_ident by @TCeason in #6512
  • fix: fix txn watch kv changed bug by @lichuang in #6516
  • fix(query): fix database and user related functions in planner v2 by @Defined2014 in #6473
  • fix(query): fix date/timestamp deserializing error by @b41sh in #6515
  • fix(query): fix input format CSV by @youngsofun in #6524
  • fix(processor): fix thread unsafe when processor schedule by @zhang2014 in #6533
  • fix: show query with limit fails when planner v2 is enabled by @TCeason in #6381
  • fix: add watch txn unit test by @lichuang in #6526
  • fix: asc -> nulls first by @gaoxinge in #6545
  • fix: serde compatibility of null_count by @dantengsky in #6558
  • fix(query): fix load json value by csv format by @b41sh in #6548
  • fix(processor): call on finished if has error before execute by @zhang2014 in #6563
  • fix(query): fix values parser in skip_to_next_row by @sundy-li in #6565
  • fix(cluster): use max threads as exchange source pipe size by @zhang2014 in #6571
  • fix: big query hang with clickhouse by @TCeason in #6583
  • fix(query): catchup planner update in http handler. by @youngsofun in #6572
  • fix(logictest): reset clickhouse sqlalchemy engine for each testfile. by @youngsofun in #6593
  • fix: remove unsupported IntervalType in DateTrunc by @ariesdevil in #6611
  • fix: range delete panic and incorrect statistics (of in_memory_size) by @dantengsky in #6609
  • fix(metactl): extend sleep time before exporting data from a running databend-meta by @drmingdrmer in #6621
  • fix(join): disable null values in join by @sundy-li in #6616
  • fix: delete unused DatabaseInfo and TableInfo pb format, fix rename_table bug by @lichuang in #6617
  • fix: Copy should be able to run under new planner by @Xuanwo in #6624
  • fix(planner): InSubquery returns error result by @xudong963 in #6641
  • fix(query): fix variant map access filter by @b41sh in #6645

Code Refactor 🎉

  • refactor: split formats by @PsiACE in #6443
  • refactor: split common-meta-store by @PsiACE in #6456
  • refactor(metasrv): adjust some info level logging to debug or warn by @drmingdrmer in #6460
  • refactor: split hashtable by @PsiACE in #6467
  • refactor: migrate to common-base http shutdown by @PsiACE in #6469
  • refactor: intro common-http to reduce duplicate code by @PsiACE in #6484
  • refactor: try abandon internal parquet2 patches by @dantengsky in #6067
  • refactor: split common-users by @PsiACE in #6561
  • refactor(interpreter): refactor interpreter factory for reuse interpreters code by @zhang2014 in #6566
  • refactor: replace infallible by @PsiACE in #6568
  • refactor(query): improve performances for group by queries by @sundy-li in #6551
  • refactor(processor): remove old processor useless code by @zhang2014 in #6584
  • refactor(query): Stage Copy use internal InputFormat. by @youngsofun in #6638

Build/Testing/CI Infra Changes 🔌

  • chore: separate build/dev docker tools by @everpcpc in #4604
  • chore(ci): developing ci with docker build tool by @everpcpc in #4592
  • chore(ci): link with mold in build-tool by @everpcpc in #4681
  • add sys view information_schema.schemata by @TCeason in #4714
  • fix(ci): add jemalloc page size option for macos cross build by @everpcpc in #4721
  • ci: Fix mergify status check by @Xuanwo in #4748
  • ci(mergify): Remove not needed check for build by @Xuanwo in #4752
  • chore(ci): try keep git history by @PsiACE in #4770
  • Remove stateless test data dir by @GrapeBaBa in #4870
  • chore(ci): add clickhouse_driver to dev_setup by @TCeason in #4887
  • compatibility mysql insert and select by @TCeason in #4883
  • chore(stateless): add 11_data_type stateless category by @BohuTANG in #4892
  • build(dev_setup): add "install_pgk thrift" to dev_setup.sh by @dantengsky in #5081
  • feat(build): add thrift to dev setup by @dantengsky in #5110
  • feat(scripts/setup): Install jdk by @Xuanwo in #5255
  • Revert "chore(query): introduce meta Runtime" by @sundy-li in #5300
  • ci: MacOS's stateless standalone has been disabled by @Xuanwo in #5404
  • feat(test): Sql logic test framework improve by @ZeaLoVe in #5416
  • chore: Move all databend generated folders into .databend by @Xuanwo in #5446
  • chore: allow additional headers in sql logic test http handler by @ZhiHanZ in #5457
  • chore: add feature flagging for logic test by @ZhiHanZ in #5460
  • chore: sqllogic test ci by @ZeaLoVe in #5464
  • deps(tests): Fix toml is missing by @Xuanwo in #5470
  • ci: Add issues labeled A-storage into storage project by @Xuanwo in #5486
  • ci: Fix crowdin not configured correctly by @Xuanwo in #5546
  • fix flaky test by @TCeason in #5544
  • Support plan v2 for user() function and add stateless test by @TCeason in #5594
  • test(meta): check db's seq after create/drop/rename table by @RinChanNOWWW in #5637
  • [metasrv] ci: test-compat downloads config file for old query by @drmingdrmer in #5736
  • add test for abort pipeline executor by @TCeason in #5807
  • feat(ci): enable logic test by @ZeaLoVe in #5836
  • ci: Move build test to self-hosted runners by @Xuanwo in #5867
  • ci: Make macOS checks optional by @Xuanwo in #5884
  • ci: Don't need to wait for macOS for publishing image by @Xuanwo in #5906
  • chore(ci): user friendly release by @PsiACE in #5932
  • test(stateless): add 02_0057_function_nullif result by @PsiACE in #5958
  • test: use fuse engine instead of memory engine in test by @b41sh in #5530
  • chore(release): add simple scripts and more info by @PsiACE in #6044
  • fix flaky test by @TCeason in #6159
  • ci(mergify): Enable speculative_checks and batch_size by @Xuanwo in #6201
  • ci: Only run developing workflow on pull request by @Xuanwo in #6203
  • ci(mergify): Push PR into queue when it passed all checks by @Xuanwo in #6207
  • chore(test): delete test generated tmp file by @fkuner in #6199
  • feat(test): add tpch stateless-test by @xudong963 in #6225
  • ci(mergify): Use mergify to check PR description by @Xuanwo in #6270
  • ci(mergify): Make yamllint happy by @Xuanwo in #6274
  • ci(mergify): Only embark PR while it's ready for review by @Xuanwo in #6288
  • ci(mergify): Fix build_musl not used correctly by @Xuanwo in #6294
  • [meta/schemaapi] test: move test cases into an entry function to simplify testing for every implementation by @drmingdrmer in #6326
  • ci: logic test with clickhouse handler by @ZeaLoVe in #6329
  • ci: Bump rust toolchain of build tools by @Xuanwo in #6376
  • Revert "ci: Bump rust toolchain of build tools" by @Xuanwo in #6377
  • ci: Bump rust toolchain for build-tools by @Xuanwo in #6378
  • ci: Bump rust to 1.64.0-nightly by @Xuanwo in #6375
  • ci: Enable semantic PRs by @Xuanwo in #6386
  • ci: Fix typo in mergify by @Xuanwo in #6387
  • ci: Fix databend release auto tag by @Xuanwo in #6391
  • ci: Fix generated tag not pushed by @Xuanwo in #6392
  • ci: Fix databend release not generated correctly by @Xuanwo in #6398
  • ci: Rename databend workflow files to avoid confusing GitHub by @Xuanwo in #6405
  • ci: Add post checks for CLA and Semantic PR by @Xuanwo in #6419
  • ci: Don't start discussion while releasing by @Xuanwo in #6426
  • ci: Merge test tool release into databend release workflow by @Xuanwo in #6428
  • ci: Fix typo in test-tools release by @Xuanwo in #6429
  • ci: Enable cargo sparse-registry by @Xuanwo in #6437
  • ci: Disable merge queue to reduce CI usage by @Xuanwo in #6523
  • ci: Cleanup Makefile and setup scripts by @Xuanwo in #6550
  • ci: add timeout to dev linux jobs by @everpcpc in #6647

Documentation 📔

New Contributors

Full Changelog: v0.7.0-nightly...v0.8.0-nightly