Release Notes

0.9.2

545 Support Python 3.12
548 Support single dict[str,Any] as transformer input and output

0.9.1

543 Support type hinting with standard collections
544 Fix Spark connect import issue on worker side

0.9.0

482 Move Fugue SQL dependencies into extra [sql] and functions to become soft dependencies
504 Create Fugue pytest fixtures and plugins
541 Change table temp view names to uppercase
540 Fix Ray 2.10+ compatibility issues
539 Fix compatibility issues with Dask 2024.4+
534 Remove ibis version cap
505 Deprecate as_ibis in FugueWorkflow
387 Improve test coverage on 3.10, add tests for 3.11
269 Spark and Dask Take 1 row without sorting optimization

0.8.7

488 Migrate from fs to fsspec
521 Add as_dicts to Fugue API
516 Use _collect_as_arrow for Spark `as_arrow``
520 Add Python 3.10 to Windows Tests
506 Adopt pandas ExtensionDType
504 Create Fugue pytest fixtures
503 Deprecate python 3.7 support
501 Simplify zip/comap, remove join from the implementation
500 Implement all partitioning strategies for Dask
495 Resolve segfault on Duckdb 0.8.1
494 Remove the version cap of Dask

0.8.6

497 Make LocalExecutionEngine respect partition numbers
493 Spark Pandas UDF partitioning improvement
492 Made AnyDataFrame recognized by Creator, Processor and Ouputter
490 Fixed pa.Table as transformer output bug
489 Added version cap to Ibis
485 Made Fugue compatible with Ray 2.5.0
486 Added py.typed to Fugue

0.8.5

481 Moved Fugue SQL dependencies into functions as soft dependencies
478 Removed cloudpickle from the hard dependency of Spark backend
477 Removed tests folder from Fugue package
476 Fix compatibility issues for Pandas 2+ and Spark < 3.4

0.8.4

471 Fix compatibility issues for duckdb 0.8.0+
466 Fix Ray 2.4.0 compatibility issue
464 Support for spark/databricks connect
459 DEPRECATION: Avro support
455 Make Fugue pandas 2 compatible

0.8.3

449 Add coarse partitioning concept
452 Add as_fugue_engine_df

0.8.2

430 Support Polars DataFrames
434 Make Transformations data format aware
408 Remove SQLite support
444 Clean up FunctionWrapper

0.8.1

423 Add seaborn as a domain level extension for visualization
422 Add pandas_df.plot as the first namespace extension
421 Add the namespace concept to Fugue extensions
420 Add is_distributed to engines
419 Log transpiled SQL query upon error

0.8.0

384 Expanding Fugue API
410 Unify Fugue SQL dialect (syntax only)
409 Support arbitrary column names in Fugue
404 Ray/Dask engines guess optimal default partitions
403 Deprecate register_raw_df_type
392 Aggregations on Spark dataframes fail intermittently
398 Rework API Docs and Favicon
393 ExecutionEngine as_context
385 Remove DataFrame metadata
381 Change SparkExecutionEngine to use pandas udf by default
380 Refactor ExecutionEngine (Separate out MapEngine)
378 Refactor DataFrame show
377 Create bag
372 Infer execution engine from input
340 Migrate to plugin mode
369 Remove execution from FugueWorkflow context manager, remove engine from FugueWorkflow
373 Fixed Spark engine rename slowness when there are a lot of columns

0.7.3

362 Remove Python 3.6 Support
363 Create IbisDataFrame and IbisExecutionEngine
364 Enable Map type support
365 Support column names starting with numbers
361 Better error message for cross join

0.7.2

348 Make create data error more informative
349 Ray integration, phase 1: transformation and IO

0.7.1

345: Enabled file as input/output for transform and out_transform

0.7.0

326: Added tests for Python 3.6 - 3.10 for Linux and 3.7 - 3.9 for Windows. Updated devenv and CICD to Python 3.8.
321: Moved out Fugue SQL to https://github.com/fugue-project/fugue-sql-antlr, removed version cap of antlr4-python3-runtime
323: Removed version cap of DuckDB
334: Replaced RLock with SerializableRLock
337: Fixed index warning in fugue_dask
339: Migrated execution engine parsing to triad conditional_dispatcher
341: Added Dask Client to DaskExecutionEngine, and fixed bugs of Dask and Duckdb

0.6.6

Create a hybrid engine of DuckDB and Dask
Save Spark-like partitioned parquet files for all engines
Enable DaskExecutionEngine to transform dataframes with nested columns
A smarter way to determine default npartitions in Dask
Support even partitioning on Dask
Add handling of nested ArrayType on Spark
Change to plugin approach to avoid explicit import
Fixed Click version issue
Added version caps for antlr4-python3-runtime and duckdb as they both released new versions with breaking changes.

0.6.5

Make Fugue exceptions short and useful
Ibis integration (experimental)
Get rid of simple assignment (not used at all)
Improve DuckDB engine to use a real DuckDB ExecutionEngine
YIELD LOCAL DATAFRAME

0.6.4

Add an option to transform to turn off native dataframe output
Add callback parameter to transform and out_transform
Support DuckDB
Create fsql_ignore_case for convenience, make this an option in notebook setup
Make Fugue SQL error more informative about case issue
Enable pandas default SQL engine (QPD) to take lower case SQL

0.6.3

Change pickle to cloudpickle for Flask RPC Server
Add license to package

0.6.1

Parsed arbitrary object into execution engine
Made Fugue SQL accept +, ~, - in schema expression
Fixed transform bug for Fugue DataFrames
Fixed a very rare bug of annotation parsing

0.6.0

Added Select, Aggregate, Filter, Assign interfaces
Made compatible with Windows OS, added github actions to test on windows
Register built-in extensions
Accept platform dependent annotations for dataframes and execution engines
Let SparkExecutionEngine accept empty pandas dataframes
Move to codecov
Let Fugue SQL take input dataframes with name such as a.b

0.5.6

Dask repartitioning improvement
Separate Dask IO to use its own APIs
Improved Dask print function by adding back head
Made assert_or_throw lazy
Improved notebook setup handling for jupyter lab

0.5.5

HOTFIX avro support

0.5.4

Added built in avro support
Fixed dask print bug

0.5.3

Fixed multi take issue for dask
Fixed pandas, dask print slow

0.5.2

Added Codacy and Slack channel badges, fixed pylint
Created transform and out_transform functions
Added partition syntax sugar
Fixed FugueSQL CONNECT bug

0.5.1

Fugueless 1 2 3 4 5
Notebook experience and extension 1 2
NativeExecutionEngine: switched to use QPD for SQL
Spark pandas udf: migrate to applyInPandas and mapInPandas
SparkExecutionEngine take bug
Fugue SQL: PRINT ROWS n -> PRINT n ROWS|ROW
Refactor yield
Fixed Jinja templating issue
Change _parse_presort_exp from a private function to public
Failure to delete execution temp directory is annoying was changed to info

0.5.0

Limit and Limit by Partition
README code is working now
Limit was renamed to take and added to SQL interface
RPC for Callbacks to collect information from workers in real time
Changes in handling input dataframe determinism. This fixes a bug related to thread locks with Spark DataFrames because of a deepcopy.

0.4.9

sample function
Make csv infer schema consistent cross engine
Make loading file more consistent cross engine

0.4.8

Support **kwargs in interfaceless extensions, see this
Support Iterable[pd.DataFrame] as output type, see this
Alter column types
RENAME in Fugue SQL
CONNECT different SQL service in Fugue SQL
Fixed Spark EVEN REPARTITION issue

0.4.7

Add hook to print/show, see this.

0.4.6

Fixed import issue with OutputTransformer
Added fillna as a built-in transform, including SQL implementation

0.4.5

Extension validation interface and interfaceless syntax
Passing dataframes cross workflow (yield)
OUT TRANSFORM to transform and finish a branch of execution
Fixed a PandasDataFrame datetime issue that only happened in transformer interface approach

0.4.3

Unified checkpoints and persist
Drop columns and na implementations in both programming and sql interfaces
Presort takes array as input
Fixed jinja template rendering issue
Fixed path format detection bug

0.4.2

Require pandas 1.0 because of parquet schema
Improved Fugue SQL extension parsing logic
Doc for contributors to setup their environment

0.4.1

Added set operations to programming interface: union, subtract, intersect
Added distinct to programming interface
Ensured partitioning follows SQL convention: groups with null keys are NOT removed
Switched join, union, subtract, intersect, distinct to QPD implementations, so they follow SQL convention
Set operations in Fugue SQL can directly operate on Fugue statemens (e.g. TRANSFORM USING t1 UNION TRANSFORM USING t2)
Fixed bugs
Added onboarding document for contributors

<=0.4.0

Main features of Fugue core and Fugue SQL
Support backends: Pandas, Spark and Dask

Files

RELEASE.md

Latest commit

History

RELEASE.md

File metadata and controls

Release Notes

0.9.2

0.9.1

0.9.0

0.8.7

0.8.6

0.8.5

0.8.4

0.8.3

0.8.2

0.8.1

0.8.0

0.7.3

0.7.2

0.7.1

0.7.0

0.6.6

0.6.5

0.6.4

0.6.3

0.6.1

0.6.0

0.5.6

0.5.5

0.5.4

0.5.3

0.5.2

0.5.1

0.5.0

0.4.9

0.4.8

0.4.7

0.4.6

0.4.5

0.4.3

0.4.2

0.4.1

<=0.4.0