Releases · dimajix/flowman

09 Sep 07:46

kupferk

0.27.0

db59d45

Flowman 0.27.0

The highlights of this release are

New 'jdbcCommand' target for executing arbitrary SQL statements for JDBC sinks
Support direct SQL statements in JDBC relations for creating tables
Upgrade Delta Lake to 2.0.0/2.1.0 (for Spark 3.2 and 3.3 respectively)
Better error messages
Bug fixes and other improvements

In detail, this release contains the following changes:

github-232: [BUG] Column descriptions should be propagated in UNIONs
github-233: [BUG] Missing Hadoop dependencies for S3, Delta, etc
github-235: Implement new REST hook with fine control
github-229: A build target should not fail if Impala "COMPUTE STATS" fails
github-236: 'copy' target should not apply output schema
github-237: jdbcQuery relation should use fields "sql" and "file" instead of "query"
github-239: Allow optional SQL statement for creating jdbcTable
github-238: Implement new 'jdbcCommand' target
github-240: [BUG] Data quality checks in documentation should not fail on NULL values
github-241: Throw an error on duplicate entity definitions
github-220: Upgrade Delta-Lake to 2.0 / 2.1
github-242: Switch to Spark 3.3 as default
github-243: Use alternative Spark MS SQL Connector for Spark 3.3
github-244: Generate project HTML documentation with optional external CSS file

03 Aug 07:17

kupferk

0.26.1

8d5e019

0.26.1 (minor release with reduced prebuilt dists)

This is a minor release addressing some specific issues:

Detailed Changes

github-226: Upgrade to Spark 3.2.2
github-227: [BUG] Flowman should not fail with field names containing "-", "/" etc
github-228: Padding and truncation of CHAR(n)/VARCHAR(n) should be configurable

Since this is only a minor release with a limited impact, please find more prebuilt variants in the 0.26.0 release.

27 Jul 18:52

kupferk

0.26.0

fb34202

0.26.0

Version 0.26.0 of Flowman is another high quality release with a strong focus on improving the work with JDBC targets like Postgres, MariaDB, MS SQL Server, Oracle and more. Also new is the support for Spark 3.3, although it is still not as battle-proven as Spark 3.2. Moreover, many smaller bugs have been fixed and improvements made.

Detailed Changes

github-202: Add support for Spark 3.3
github-203: [BUG] Resource dependencies for Hive should be case-insensitive
github-204: [BUG] Detect indirect dependencies in a chain of Hive views
github-207: [BUG] Build should not directly fail if inferring dirty status fails
github-209: [BUG] HiveViews should not trigger cascaded refresh during CREATE phase even when nothing is changed
github-211: Implement new hiveQuery relation
github-210: [BUG] HiveTables should be migrated if partition columns change
github-208: Implement JDBC hook for database based semaphores
github-212: [BUG] Hive views should not be migrated in RELAXED mode if only comments have changed
github-214: Update ImpalaJDBC driver to 2.6.26.1031
github-144: Support changing primary key for JDBC relations
github-216: [BUG] Floats should be represented as FLOAT and not REAL in MySQL/MariaDB
github-217: Support collations for creating/migrating JDBC tables
github-218: [BUG] Postgres dialect should be used for Postgres JDBC URLs
github-219: [BUG] SchemaMapping should retain incoming comments
github-215: Support COLUMN STORE INDEX for MS SQL Server
github-182: Support column descriptions in JDBC relations (SQL Server / Azure SQL)
github-224: Support column descriptions for MariaDB / MySQL databases
github-223: Support column descriptions for Postgres database
github-205: Initial support for Oracle DB via JDBC
github-225: [BUG] Staging schema should not have comments

Breaking changes

We take backward compatibility very seriously. But sometimes a breaking change is needed to clean up code and to
enable new features. This release contains some breaking changes, which are annoying but simple to fix.
In order to respect null as a keyword in YAML with special semantics, some entities needed to be renamed, as
described in the following table:

category	old kind	new kind
mapping	null	empty
relation	null	empty
target	null	empty
store	null	none
history	null	none
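
The rename table above can be expressed as a small lookup, which may help when scripting a migration of existing projects. The following is only a minimal sketch: the names `RENAMES_0_26` and `migrate_kind` are hypothetical helpers and not part of Flowman, and the sketch assumes the project files have already been parsed so that each entity's `kind` value is available. Since YAML parsers resolve an unquoted `null` to a null value (`None` in Python) while a quoted `"null"` stays a string, both cases are handled.

```python
# Rename table copied from the 0.26.0 breaking changes above.
# Keys are entity categories, values are the new kind replacing "null".
RENAMES_0_26 = {
    "mapping": "empty",
    "relation": "empty",
    "target": "empty",
    "store": "none",
    "history": "none",
}


def migrate_kind(category, kind):
    """Return the kind to use from 0.26.0 on (hypothetical helper).

    `kind` may be None (unquoted YAML null) or the string "null"
    (quoted in YAML). All other kinds are returned unchanged.
    """
    if kind is None or kind == "null":
        return RENAMES_0_26.get(category, kind)
    return kind


print(migrate_kind("mapping", None))          # empty
print(migrate_kind("store", "null"))          # none
print(migrate_kind("relation", "hiveTable"))  # hiveTable
```

Categories that are not listed in the table are deliberately left untouched, so the helper never changes a kind that the release notes do not mention.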

15 Jun 16:23

kupferk

0.25.1

6a38e94

0.25.1 (Source only release)

This is a minor bugfix release.

Detailed Changes

github-195: [BUG] Metric "target_records" is not reset correctly after an execution phase is finished
github-197: [BUG] Impala REFRESH METADATA should not fail when dropping views

31 May 14:22

kupferk

0.25.0

bf84866

0.25.0

github-184: Only read in *.yml / *.yaml files in module loader
github-183: Support storing SQL in external file in hiveView
github-185: Missing _SUCCESS file when writing to dynamic partitions
github-186: Support output mode OVERWRITE_DYNAMIC for Delta relation
github-149: Support creating views in JDBC with new jdbcView relation
github-190: Replace logo in documentation
github-188: Log detailed timing information when writing to JDBC relation
github-191: Add user provided description to quality checks
github-192: Provide example queries for JDBC metric sink

29 Apr 14:05

kupferk

0.24.1

ff137b7

0.24.1

github-175: '--jobs' parameter starts way too many parallel jobs
github-176: start-/end-date in report should not be the same
github-177: Implement generic SQL schema check
github-179: Update DeltaLake dependency to 1.2.1

05 Apr 15:48

kupferk

0.24.0

a8f4a76

0.24.0

github-168: Support optional filters in data quality checks
github-169: Support sub-queries in filter conditions
github-171: Parallelize loading of project files
github-172: Update CDP7 profile to the latest patch level
github-153: Use non-privileged user in Docker image
github-174: Provide application for generating YAML schema

Breaking changes

We take backward compatibility very seriously. But sometimes a breaking change is needed to clean up code and to
enable new features. This release contains some breaking changes, which are annoying but simple to fix.
In order to avoid YAML schema inconsistencies, some entities needed to be renamed, as described in the following
table:

category	old kind	new kind
mapping	const	values
mapping	empty	null
mapping	read	relation
mapping	readRelation	relation
mapping	readStream	stream
relation	const	values
relation	empty	null
relation	jdbc	jdbcTable, jdbcQuery
relation	table	hiveTable
relation	view	hiveView
schema	embedded	inline
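
Most rows of the table above map one old kind to one new kind, but the `jdbc` relation splits into `jdbcTable` and `jdbcQuery`, so that rename needs a manual decision. A minimal sketch of a report that lists affected entities together with their candidate kinds could look as follows. The names `RENAMES_0_24`, `new_kinds` and `find_renames` are hypothetical and not part of Flowman, and the input is assumed to be a list of `(category, name, kind)` tuples collected from the project files by some other means.

```python
# Rename table copied from the 0.24.0 breaking changes above.
# Each (category, old kind) maps to a list of candidate new kinds.
RENAMES_0_24 = {
    ("mapping", "const"): ["values"],
    ("mapping", "empty"): ["null"],
    ("mapping", "read"): ["relation"],
    ("mapping", "readRelation"): ["relation"],
    ("mapping", "readStream"): ["stream"],
    ("relation", "const"): ["values"],
    ("relation", "empty"): ["null"],
    ("relation", "jdbc"): ["jdbcTable", "jdbcQuery"],  # manual choice needed
    ("relation", "table"): ["hiveTable"],
    ("relation", "view"): ["hiveView"],
    ("schema", "embedded"): ["inline"],
}


def new_kinds(category, old_kind):
    """Return the candidate new kinds, or [old_kind] if nothing changed."""
    return RENAMES_0_24.get((category, old_kind), [old_kind])


def find_renames(entities):
    """List (name, old kind, candidates) for every entity that must change.

    `entities` is an iterable of (category, name, kind) tuples.
    """
    report = []
    for category, name, kind in entities:
        candidates = new_kinds(category, kind)
        if candidates != [kind]:
            report.append((name, kind, candidates))
    return report
```

An entry with more than one candidate signals that the script cannot pick the new kind on its own. Note also that the kind `null` introduced here for mappings and relations is renamed again in 0.26.0, so a project skipping several versions should apply the tables in release order.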

29 Mar 04:52

kupferk

0.23.1

aeaf752

0.23.1

github-154: Fix failing migration when PK requires change due to data type
github-156: Recreate indexes when data type of column changes
github-155: Project level configs are used outside job
github-157: Fix UPSERT operations for SQL Server
github-158: Improve non-nullability of primary key column
github-160: Use sensible defaults for default documenter
github-161: Improve schema caching during execution
github-162: ExpressionColumnCheck does not work when results contain NULL values
github-163: Implement new column length quality check

18 Mar 17:12

kupferk

0.23.0

55199f2

0.23.0

The main feature of this version is a significant improvement of the new documentation system, which now also includes column level lineage. The automatically generated documentation is a valuable artifact for both developers and business experts to improve the understanding of the data models and transformations. Flowman projects can also specify quality checks (like NOT NULL condition, foreign key relationships or arbitrary SQL expressions), which are not only included in the documentation but also executed on the real data.

Moreover, support for SQL databases has been improved again with the introduction of temporary staging tables to perform updates within a transactional commit.

Detailed Changes

github-148: Support staging table for all JDBC relations
github-120: Use staging tables for UPSERT and MERGE operations in JDBC relations
github-147: Add support for PostgreSQL
github-151: Implement column level lineage in documentation
github-121: Correctly apply documentation, before/after and other common attributes to templates
github-152: Implement new 'cast' mapping

01 Mar 15:01

kupferk

0.22.0

80a9ec4

0.22.0

Add new sqlserver relation
Implement new documentation subsystem
Change default build to Spark 3.2.1 and Hadoop 3.3.1
Add new drop target for removing tables
Speed up project loading by reusing Jackson mapper
Implement new jdbc metric sink
Implement schema cache in Executor to speed up documentation and similar tasks
Add new config variables flowman.execution.mapping.schemaCache and flowman.execution.relation.schemaCache
Add new config variable flowman.default.target.verifyPolicy to ignore empty tables during VERIFY phase
Implement initial support for indexes in JDBC relations
