
# Conduit's connector for S3 and Apache Iceberg

The S3-Apache Iceberg connector is a Conduit destination connector that makes it possible to write data to S3 in the Apache Iceberg format.

## Prerequisites

- JDK 17
- A Unix-like OS (currently, only Unix-like OSes are supported)

## How to build

First, the connector needs to be built, which can be done with `scripts/dist.sh`. A new directory, `dist`, will be created. The contents of `dist` need to be copied into Conduit's connectors directory (which, by default, is `connectors`). Read more about installing Conduit connectors here.
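
For example, assuming Conduit runs from `/path/to/conduit` and uses the default `connectors` directory (both paths are illustrative), the build and install steps could look like this:

```sh
# Build the connector; the build output is placed in ./dist
scripts/dist.sh

# Copy the build output into Conduit's connectors directory
# (replace /path/to/conduit with the directory Conduit actually runs from)
mkdir -p /path/to/conduit/connectors
cp -r dist/* /path/to/conduit/connectors/
```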

## Destination

This destination connector pushes data from upstream resources to an Apache Iceberg table in S3 via Conduit.

For inserts, a record is inserted into the table, which appends an Iceberg-formatted file to S3. For deletes, the `Record.Key` field of the OpenCDC record is used as the condition for deleting the record. Finally, for updates, the record is first deleted according to its `Record.Key` value, and the updated version is then inserted.

## Configuration

| Field (JSON name) | Description | Required | Default value | Example |
|---|---|---|---|---|
| `catalog.name` | Catalog name | true | - | `"exampleCatalog"` |
| `namespace` | Namespace | true | - | `"exampleNamespace"` |
| `table.name` | Table name | true | - | `"exampleTable"` |
| `s3.endpoint` | S3 endpoint URL | true | - | `"https://s3.example.com"` |
| `s3.accessKeyId` | S3 access key ID | true | - | `"exampleAccessKeyId"` |
| `s3.secretAccessKey` | S3 secret access key | true | - | `"exampleSecretKey"` |
| `s3.region` | S3 region | true | - | `"us-east-1"` |
| `catalog.catalog-impl` | Catalog implementation to be used. Possible values: `org.apache.iceberg.hadoop.HadoopCatalog`, `org.apache.iceberg.jdbc.JdbcCatalog`, `org.apache.iceberg.rest.RESTCatalog` | true | - | `"org.apache.iceberg.rest.RESTCatalog"` |
| `catalog.{propertyName}` | Sets a catalog property with the name `propertyName` | true | - | `{"catalog.uri": "http://localhost:8181"}` |

## Example pipeline configuration file

```yaml
---
version: 2.0
pipelines:
  - id: postgres-to-s3-iceberg
    status: running
    description: Postgres to S3 Iceberg
    connectors:
      - id: pg-source
        type: source
        plugin: "builtin:postgres"
        name: source1
        settings:
          url: "postgresql://username:password@localhost/dbname?sslmode=disable"
          key: id
          table: "iceberg_input"
          snapshotMode: "never"
          cdcMode: "logrepl"
      - id: s3-iceberg-destination
        type: destination
        plugin: "standalone:s3-iceberg"
        name: destination1
        settings:
          namespace: "conduit"
          table.name: "test_table"
          catalog.name: "demo"
          catalog.catalog-impl: "org.apache.iceberg.rest.RESTCatalog"
          catalog.uri: "http://localhost:8181"
          s3.endpoint: "http://localhost:9000"
          s3.accessKeyId: "admin"
          s3.secretAccessKey: "password"
          s3.region: "us-east-1"
```

Check the Pipeline Configuration Files docs for more details about how to run this configuration file.
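
As a quick sketch: Conduit loads pipeline configuration files from its `pipelines` directory by default, so saving the example above there and starting Conduit should bring the pipeline up. The file name `postgres-to-s3-iceberg.yml` and the `./conduit` binary location below are assumptions for illustration:

```sh
# Save the example configuration into Conduit's default pipelines directory
# (pipeline configuration files in ./pipelines are loaded at startup)
mkdir -p pipelines
cp postgres-to-s3-iceberg.yml pipelines/

# Start Conduit from the same directory; the pipeline is loaded on startup
./conduit
```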

## Known limitations

On rare occasions, when updating a record (which consists of first deleting it and then inserting the updated version), the insert of the new record may fail. In that case an error is returned, and the record might be lost, since the delete has already been applied. Check your table's snapshots to recover the data if it is lost.
