Releases: lakesoul-io/LakeSoul
v2.6.2
Full Changelog: v2.6.1...v2.6.2
v2.6.1
Full Changelog: v2.6.0...v2.6.1
v2.6.0
What's Changed
- [Rust] Apply clippy and fix typos by @mag1c1an1 in #404
- [Docs] Add Spark Getting Started Guide by @Ceng23333 in #403
- [Docs] Add Flink Getting Started Guide by @moresun in #405
- [Docs] Modify Getting Started Env Guide by @F-PHantam in #406
- [Docs] Update docs format by @xuchen-plus in #408
- [Docs] Fix Docs Page Show Errors and Update LakeSoul Version by @F-PHantam in #409
- [Docs]Fine check usage cases of spark guide by @Ceng23333 in #411
- [Website] fix website zh-Hans Homepage docs link by @mag1c1an1 in #413
- [Docs] add pyspark in spark-guide by @moresun in #414
- [Spark/Rust/Test] Fix MergeOperatorSuite && Disable 3 cases by @Ceng23333 in #417
- [Spark] Implement columnar write for compaction by @xuchen-plus in #415
- [Spark] Add debug print for compaction tests by @xuchen-plus in #418
- [Docs] Update docs to 2.5.1 by @xuchen-plus in #419
- [Spark/Rust] Fix Unicode column name at native io by @Ceng23333 in #420
- Support sqlserver CDC by @ChenYunHey in #421
- [Docs] Fix python doc typos by @Ceng23333 in #425
- [Spark/Rust] Support filter on nesting column name by @Ceng23333 in #422
- [Docs] Add docs and recent blogs by @xuchen-plus in #423
- [Flink/Rust] Adjust rolling file logic to reduce memory usage during write by @xuchen-plus in #426
- [Rust] Enable metadata max retries by @Ceng23333 in #431
- [Flink] Fix CDC entry db name by @ChenYunHey in #430
- [Rust] Keep only push down rules in datafusion by @xuchen-plus in #432
- [Rust] Datafusion Catalog Support by @mag1c1an1 in #429
- [Flink] Fix non-primary key table's sink parallelism by @ChenYunHey in #433
- [Python] Fix python host build by @xuchen-plus in #434
- [Spark] Add sleep 1s for compaction tests by @xuchen-plus in #435
- [Docs] Add deployment docs by @xuchen-plus in #437
- [Rust] fix catalog unittest by @mag1c1an1 in #438
- [Flink]Fix create table options by @ChenYunHey in #436
- [Rust] fix panic by @mag1c1an1 in #440
- [Rust/CI] Add consistency-ci by @Ceng23333 in #441
- [Spark] Compaction bugfix by @Ceng23333 in #442
- [Rust]Fix Consistency CI by @Ceng23333 in #443
- [Flink] Shade guava for flink package by @xuchen-plus in #444
- Bump org.postgresql:postgresql from 42.5.1 to 42.5.5 in /lakesoul-common by @dependabot in #445
- [Project] Bump version by @xuchen-plus in #446
- [Spark] Fix spark rbac test by @xuchen-plus in #447
- [Rust]DataFusion connector supports partition column by @Ceng23333 in #449
- [Rust] add create split logic in rust by @mag1c1an1 in #448
- [Rust/BugFix]fix escape path error by @Ceng23333 in #450
- [Rust] (metadata) move metadataclient to rawclient by @mag1c1an1 in #451
- [NativeIO] Shade packages into lakesoul-io-java by @moresun in #453
- Bump mio from 0.8.10 to 0.8.11 in /rust by @dependabot in #456
- [Project] Add shaded jar for common and io by @xuchen-plus in #455
- [Project] Refine shade packages by @xuchen-plus in #457
- [Flink] Fix LakeSoul table export with timestamp local timezone type by @ChenYunHey in #427
- [Flink]fix readPartitionInfo on UpdateCommit by @Ceng23333 in #458
- [Project] Adjust pom and version by @xuchen-plus in #462
- [Flink]support Delete statement on partition column by @Ceng23333 in #459
- [Project] Fix pom flatten issue by @xuchen-plus in #467
- [Flink] Support MongoDB CDC Import/Export by @ChenYunHey in #460
- [Rust] add substrait for flink and be compatible for other engines by @mag1c1an1 in #454
- [Rust/Flink]Flink repartition pushdown by @Ceng23333 in #463
- [Flink]update global committer for bounded case by @Ceng23333 in #469
- [Flink]support flink watermark and computed column by @Ceng23333 in #472
- [Flink] Fix time type for flink. fix hdfs dir permission in cdc sync by @xuchen-plus in #477
- [Flink] Verify primary keys and partition keys during create table by @xuchen-plus in #475
- [Flink]fix flink update statement for non pk table by @Ceng23333 in #478
- [Flink] Add dependencies to shaded jar for flink by @xuchen-plus in #479
- Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /website by @dependabot in #482
- Bump whoami from 1.4.1 to 1.5.0 in /rust by @dependabot in #480
- Bump rustls from 0.21.10 to 0.21.12 in /rust by @dependabot in #481
- Bump tar from 6.2.0 to 6.2.1 in /website by @dependabot in #483
- Bump express from 4.18.2 to 4.19.2 in /website by @dependabot in #484
- Bump follow-redirects from 1.15.5 to 1.15.6 in /website by @dependabot in #485
- Bump h2 from 0.3.22 to 0.3.26 in /rust by @dependabot in #486
- [Flink] Throw exception when create table dir failed by @xuchen-plus in #487
- [Flink]Support dynamic partition pushdown for streaming source by @Ceng23333 in #489
- [NativeIO]support chrono partition column by @Ceng23333 in #490
- [Flink]Add Flink DataStream Sink for Arrow RecordBatch by @Ceng23333 in #491
- [Flink] Fix sql submit main entry by @xuchen-plus in #494
- [Flink] Fix flink package by @xuchen-plus in #495
- [Flink] add arrow datastream source by @Ceng23333 in #496
- [Spark] Fix compaction for cdc table by @xuchen-plus in #498
- Bump braces from 3.0.2 to 3.0.3 in /website by @dependabot in #497
- [Fix]fix LakeSoulArrowSource serialization by @Ceng23333 in #499
- [Native] Use mimalloc in native libs by @xuchen-plus in #500
- Revert "[Native] Use mimalloc in native libs" by @xuchen-plus in #501
- [Flink] Fix flink select only partition column by @xuchen-plus in #502
- [Python/Native] Support Python read pk table by @xuchen-plus in #503
- [Flink] Fix hdfs ns dir permission by @xuchen-plus in #504
- [Fix/Spark] Fix spark type compatibility with flink's time and timestamp types by @Ceng23333 in #505
- [Native] Update hdfs-sys to use 3.3 libhdfs version by @xuchen-plus in #506
- [NativeIO]Doris filter support by @Ceng23333 in #507
- [Docs] Update version in docs by @xuchen-plus in #508
New Contributors
- @mag1c1an1 made their first contribution in #404
Full Changelog: v2.5.4...v2.6.0
v2.5.4
- Fix class shading in lakesoul common
v2.5.3
- Add shaded packages for release
- Fix compaction possibly writing to an incorrect partition
v2.5.1
- Fix Flink sink parallelism for non-primary key table;
- Fix native io filter for non-ascii names and nested columns;
- Optimize compaction performance.
v2.5.0 & Python 1.0.0b1
LakeSoul 2.5.0 Release Note
What's New
- Python Reader supports PyTorch, PyArrow, Pandas, Ray, and distributed execution;
- Support Spark Gluten Vectorized Engine;
- Spark SQL supports Compaction, Rollback and other Call Procedures;
- Flink CDC’s entire database synchronization supports MySQL, PostgreSQL, PolarDB, and Oracle;
- Support streaming and batch export to MySQL, PostgreSQL, PolarDB, and Apache Doris;
- Optimized NativeIO performance.
What's Changed
- [Spark]rename MetaVersion at lakesoul-spark as SparkMetaVersion by @Ceng23333 in #353
- [Metadata]Replace table_info.table_schema with arrow kind schema (Backward Compatibility) by @Ceng23333 in #354
- [Python][Dataset] Add Ray reading support by @codingfun2022 in #355
- [Spark] Optimize incremental read and fix column disorder caused by the compact operation by @F-PHantam in #352
- [Rust] Create Rust CI by @Ceng23333 in #356
- [Rust][Metadata]Create Rust MetadataClient & add CI test cases by @Ceng23333 in #357
- [Rust][NativeIO]Use stable rustc for lakesoul-io feature default by @Ceng23333 in #358
- [Python][Rust][Metadata] Update python metadata interface && Full arrow types test by @Ceng23333 in #359
- [Spark] Spark Sql Support 'drop partition' Operation by @F-PHantam in #360
- [Python]python deserialized schema from java by @Ceng23333 in #361
- [Python] Fix wheel building; update version to 1.0.0b1 by @codingfun2022 in #362
- [Rust][Metadata]Asynchronized rust metadata method by @Ceng23333 in #365
- Add some rust test cases by @zhaishuangszszs in #364
- [Datafusion]Implement LakeSoul Catalog by @Ceng23333 in #366
- [Rust] add upsert test cases by @zhaishuangszszs in #367
- [Flink] update fury version to 0.4 by @xuchen-plus in #368
- refine upsert test by @zhaishuangszszs in #369
- [Spark] support call sql syntax by @moresun in #370
- [Rust]DataFusion version upgraded to 33.0.0 by @Ceng23333 in #372
- [Spark] Support Gluten Vectorized Engine by @xuchen-plus in #374
- [Flink] Support oracle cdc source by @ChenYunHey in #375
- [NativeIO] Use rust block api in file read by @xuchen-plus in #377
- [Flink] Add export to external dbs for LakeSoul's tables by @ChenYunHey in #376
- [Rust] Add LakeSoulHashTable Sink for DataFusion by @Ceng23333 in #382
- [NativeIO] Enable parquet rowgroup prefetch. Support s3 host style access by @xuchen-plus in #384
- [Rust]fix hash value to spark_murmur3 by @Ceng23333 in #385
- [BugFix] Fix failure when creating a table with a nullable hash column by @Ceng23333 in #387
- [Flink] Add Jdbc cdc sources and sinks by @ChenYunHey in #381
- [Python] fix python meta config parse logic by @xuchen-plus in #388
- [Project/Doc] Bump version to 2.5.0 and update docs by @xuchen-plus in #389
- Bump postcss from 8.4.23 to 8.4.33 in /website by @dependabot in #396
- Bump @babel/traverse from 7.21.5 to 7.23.7 in /website by @dependabot in #393
- Bump follow-redirects from 1.15.2 to 1.15.4 in /website by @dependabot in #399
- Bump org.apache.avro:avro from 1.11.0 to 1.11.3 in /lakesoul-spark by @dependabot in #394
- Bump com.google.guava:guava from 30.1.1-jre to 32.0.0-jre in /lakesoul-presto by @dependabot in #395
- [Rust] Update arrow rs dependencies by @xuchen-plus in #400
New Contributors
- @zhaishuangszszs made their first contribution in #364
Full Changelog: v2.4.1...v2.5.0
Release v2.4.1
What's Changed
- [Flink] Flink can configure global warehouse dir by @F-PHantam in #342
- [NativeIO] Implement DataFusion TableProvider by @Ceng23333 in #341
- [Spark]Spark parquet filter pushdown exactly by @Ceng23333 in #343
- [Spark]Spark parquet filter pushdown evaluation + bugfix by @Ceng23333 in #344
- [Meta] fix meta field compatibility in partition info table by @xuchen-plus in #345
- [Common] Cleanup redundant DataOperation by @Ceng23333 in #346
- [Docs] add kyuubi with lakesoul setup doc. by @Asakiny in #348
- [Native-Metadata] Adaptive jnr buffer size by @Ceng23333 in #347
- [NativeIO][Bug] LakeSoulParquetProvider projection bugfix by @Ceng23333 in #349
- [NativeIO] Enable parquet prefetch & use stable sort by @xuchen-plus in #350
Full Changelog: v2.4.0...v2.4.1
LakeSoul Release v2.4.0 and Python 1.0 Beta
What's New In This Release
- RBAC (role-based access control) support for all query engines and all language APIs. doc
- Automatic cleanup of old compaction data and partition-level TTL. doc
- Upgrade Flink to 1.17 and support row-level update/delete in batch SQL.
- Improve whole-database Flink CDC sync throughput by 80%: #307
- Presto reader. doc
- Native Python reader with PyTorch and HuggingFace integration. doc
What's Changed
- [NativeIO] Upgrade datafusion to 27 by @xuchen-plus in #282
- [Flink] implement filter pushdown and fix partition pushdown in flink by @xuchen-plus in #287
- Upgrade Flink to 1.17 by @xuchen-plus in #288
- [Python][NativeIO] Add C interface definition by @xuchen-plus in #291
- [NativeIO] update arrow version by @xuchen-plus in #290
- Add Built-in RBAC support by @clouddea in #292
- fix apache license by @clouddea in #293
- [Native-Metadata] Rust implementation of DAO layer by @Ceng23333 in #294
- [Flink] fix jackson-core package in flink by @xuchen-plus in #297
- [Docs] update docs by @xuchen-plus in #298
- [Flink] upgrade flink cdc connector to 2.4 by @xuchen-plus in #303
- clean old compaction data and redundant data by @ChenYunHey in #304
- [Python][Native-Metadata] Python interface of lakesoul metadata by @Ceng23333 in #305
- [Python] C callback with data by @xuchen-plus in #306
- [Python][Dataset] PyArrow and PyTorch dataset api for LakeSoul by @codingfun2022 in #308
- [Flink] rollback flink cdc to 2.3.0 and supplement tables check in benchmark by @F-PHantam in #309
- [Flink] Optimize CDC sink serde with Fury by @xuchen-plus in #307
- [NativeIO] add hdfs feature in lakesoul-io-c by @xuchen-plus in #311
- [Python] exclude partition column at get_arrow_schema_by_table_name by @Ceng23333 in #312
- [Native-Metadata] Retry when native metadata client fail by @Ceng23333 in #313
- [Flink] cdc supplement data delay check mechanism and fix logicallyDropColumn bug by @F-PHantam in #315
- Presto Connector Support by @clouddea in #314
- add scala in common to address build in idea intellij by @xuchen-plus in #316
- [Flink] Ignore exception when hadoop env missing by @xuchen-plus in #317
- [NativeIO] Merge native modules by @Ceng23333 in #318
- bump version to 2.4.0 by @xuchen-plus in #319
- [RBAC] Set hdfs dir owner by @xuchen-plus in #321
- [BugFix]support query metadata with null string by @Ceng23333 in #324
- [Spark] list namespace should return empty array by @xuchen-plus in #323
- [Python][Dataset] Update Python dataset api for LakeSoul by @codingfun2022 in #325
- [Python] Examples using Python API for AI model training by @Ceng23333 in #327
- update docs and readme for release 2.4 by @xuchen-plus in #328
- [Docs] Usage on auto table clean by @ChenYunHey in #326
- [Docs] Add presto connector deployment docs by @xuchen-plus in #329
- [Docs] Add docs for Python and PyTorch by @Ceng23333 in #330
- [Docs] add workspace and rbac docs by @xuchen-plus in #331
- [Bug] turn off native meta query and temporarily disable io prefetch by @F-PHantam in #333
- [Bug]filter should not pushdown before merge on read by @Ceng23333 in #310
- Support view, batch update, and batch delete in Flink by @moresun in #332
- [Docs] Refine flink sql and python docs by @xuchen-plus in #337
Full Changelog: https://github.com/lakesoul-io/LakeSoul/commits/v2.4.0
LakeSoul Release v2.3.1
- Fix jackson-core packaging for Flink package
- Fix commons-lang class missing
- Fix snapshot rollback/cleanup with local timezone