This connector writes data to Iceberg tables using the V2 specification. To optimize write performance, delete events are recorded in delete files, which avoids costly data file rewrites. This approach significantly improves write performance, but it can degrade read performance, especially in upsert mode. In append mode, this trade-off does not apply.
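As a rough illustration of switching between the two modes, the fragment below assumes the mode is controlled by a property named debezium.sink.iceberg.upsert. That name is not stated in this section, so treat it as an assumption and check it against the connector's configuration reference.

```properties
# application.properties
# Assumed property name (not confirmed by this section):
# true  = upsert mode (deletes go to delete files, reads may be slower)
# false = append mode (the read-performance trade-off does not apply)
debezium.sink.iceberg.upsert=false
```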
Full schema evolution, such as converting between incompatible data types, is not currently supported. Schema expansion, which covers adding new fields and widening existing field data types, is supported. To enable it, set the debezium.sink.iceberg.allow-field-addition configuration property to true.
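A minimal sketch of the setting described above, assuming the connector is configured through application.properties as elsewhere in this document:

```properties
# application.properties
# Allow the sink to add new fields and expand existing field data types
debezium.sink.iceberg.allow-field-addition=true
```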
By default, the Debezium connector replicates every table in the database, which can put unnecessary load on the pipeline. To avoid replicating tables you don't need, set the debezium.source.table.include.list property to the exact tables to replicate. This streamlines your data pipeline and reduces overhead. For more details on this configuration, refer to the Debezium server source documentation.
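For example, restricting replication to two tables might look like the fragment below. The table names inventory.customers and inventory.orders are hypothetical, and the exact identifier format (for instance schema.table versus database.table) depends on the source connector in use.

```properties
# application.properties
# Comma-separated list of the tables to replicate.
# "inventory.customers" and "inventory.orders" are hypothetical names.
debezium.source.table.include.list=inventory.customers,inventory.orders
```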
You can set up AWS credentials in any of the following ways:
- Option 1: set debezium.sink.iceberg.fs.s3a.access.key and debezium.sink.iceberg.fs.s3a.secret.key in application.properties.
- Option 2: inject the credentials through the environment variables AWS_ACCESS_KEY and AWS_SECRET_ACCESS_KEY.
- Option 3: set up a proper HADOOP_HOME environment, then add the s3a configuration to core-site.xml. More information can be found here.
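As a sketch of Option 1, the two properties can be set as shown below. The values are placeholders, not real credentials, and should be replaced with your own.

```properties
# application.properties
# Placeholder values; substitute your own AWS credentials
debezium.sink.iceberg.fs.s3a.access.key=<your-access-key>
debezium.sink.iceberg.fs.s3a.secret.key=<your-secret-key>
```

Keeping secrets in a properties file means they can end up in version control, so Option 2 or Option 3 may be preferable for shared repositories.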