The ClickHouse connector is one of the Conduit plugins. It provides both a Source and a Destination ClickHouse connector.
The connector uses the Golang SQL database driver for Yandex ClickHouse.
- Go 1.21
- (optional) golangci-lint 1.55.2
Run make build.
Run make test to run all unit and integration tests. To run the integration tests, set the ClickHouse database URL in the CLICKHOUSE_URL environment variable.
The ClickHouse Source Connector allows you to move data from a ClickHouse table to Conduit Destination connectors.
It supports all engines of the MergeTree and Log families.
The iterator selects existing rows from the configured table in batches, using an ORDER BY and a WHERE clause:
SELECT {{config.columns}}
FROM {{config.table}}
WHERE {{config.orderingColumn}} > {{position}}
ORDER BY {{config.orderingColumn}};
When all existing data has been read, the connector will only detect new rows.
The position contains the orderingColumn value of the last processed row. This value is used in the WHERE clause of the SELECT query.
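As a rough illustration of how the position feeds the query above, here is a minimal Go sketch. The helper name buildSelect is hypothetical and is not taken from the connector's code. Passing the position as a bound parameter, selecting all columns when none are given, and omitting the WHERE clause when there is no position yet are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// buildSelect is a hypothetical helper that renders the iterator query from
// the columns, table and ordering column. The position is returned as a bound
// argument for a "?" placeholder instead of being spliced into the SQL text.
func buildSelect(columns []string, table, orderingColumn string, position any) (string, []any) {
	// Assumption: an empty column list means "select everything".
	cols := "*"
	if len(columns) > 0 {
		cols = strings.Join(columns, ", ")
	}

	query := fmt.Sprintf("SELECT %s FROM %s", cols, table)

	var args []any
	// Assumption: a nil position means nothing has been read yet,
	// so the query starts from the first row.
	if position != nil {
		query += fmt.Sprintf(" WHERE %s > ?", orderingColumn)
		args = append(args, position)
	}

	query += fmt.Sprintf(" ORDER BY %s", orderingColumn)
	return query, args
}

func main() {
	query, args := buildSelect([]string{"id", "name"}, "users", "id", 42)
	fmt.Println(query)
	fmt.Println(args)
}
```

After each batch, the orderingColumn value of the last row read would become the next position, so a restart resumes from the row after it.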
The connector adds the clickhouse.table key to the metadata of each record. Its value is the table name from the configuration.
name | description | required | example |
---|---|---|---|
url | DSN to connect to the database. | true | http://username:password@host1:8123/database |
table | Name of the table that the connector should read. | true | table_name |
orderingColumn | Column name that the connector will use for ordering rows. The column must contain unique values and be suitable for sorting; otherwise the snapshot won't work correctly. | true | id |
snapshot | Whether to take a snapshot of the entire table before starting CDC mode. The default is true. | false | false, true |
keyColumns | Comma-separated list of column names to build the opencdc.Record.Key. See more: key handling. | false | id,name |
batchSize | Size of rows batch. Min is 1 and max is 100000. The default is 1000. | false | 100 |
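The batchSize limits from the table can be expressed as a small validation step. The following Go sketch is illustrative only: parseBatchSize and the constant names are hypothetical, and the connector may validate its configuration differently.

```go
package main

import (
	"fmt"
	"strconv"
)

// Limits taken from the configuration table; the names are hypothetical.
const (
	defaultBatchSize = 1000
	minBatchSize     = 1
	maxBatchSize     = 100000
)

// parseBatchSize returns the default when the setting is empty and rejects
// values that are not integers or fall outside the documented range.
func parseBatchSize(raw string) (int, error) {
	if raw == "" {
		return defaultBatchSize, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("batchSize must be an integer: %w", err)
	}
	if n < minBatchSize || n > maxBatchSize {
		return 0, fmt.Errorf("batchSize %d is outside [%d, %d]", n, minBatchSize, maxBatchSize)
	}
	return n, nil
}

func main() {
	for _, raw := range []string{"", "100", "0"} {
		n, err := parseBatchSize(raw)
		fmt.Printf("raw=%q size=%d err=%v\n", raw, n, err)
	}
}
```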
The keyColumns field is optional. If it is empty, the connector queries the database and uses the primary keys of the specified table. If the table has no primary keys, the connector uses the value of the orderingColumn field as the keyColumns value.
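The fallback order for key columns can be sketched in Go as follows. This is a simplified illustration, not the connector's code: resolveKeyColumns is a hypothetical name, and the primaryKeys argument stands in for the result of the database lookup, which is not shown.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveKeyColumns applies the fallback order: the configured keyColumns
// first, then the table's primary keys, then the ordering column.
// primaryKeys is assumed to have been fetched from the database beforehand.
func resolveKeyColumns(configured string, primaryKeys []string, orderingColumn string) []string {
	var cols []string
	for _, c := range strings.Split(configured, ",") {
		// Skip blanks so that "" and "id, ,name" are handled gracefully.
		if c = strings.TrimSpace(c); c != "" {
			cols = append(cols, c)
		}
	}
	if len(cols) > 0 {
		return cols
	}
	if len(primaryKeys) > 0 {
		return primaryKeys
	}
	return []string{orderingColumn}
}

func main() {
	fmt.Println(resolveKeyColumns("id, name", nil, "created_at"))    // configured list wins
	fmt.Println(resolveKeyColumns("", []string{"id"}, "created_at")) // primary keys
	fmt.Println(resolveKeyColumns("", nil, "created_at"))            // ordering column
}
```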
The ClickHouse Destination allows you to move data from any Conduit Source to a ClickHouse table. It takes an opencdc.Record and translates it into a valid SQL query. Log family engines do not support data changes, so OperationUpdate and OperationDelete operations on such tables return the following error: Table engine {{table_engine}} doesn't support mutations.
If a record contains a clickhouse.table property in its metadata, the Destination writes to that table; otherwise, it falls back to the table configured in the connector. Thus, a Destination can support multiple tables in a single connector, as long as the user has proper access to those tables.
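The table selection rule can be sketched in a few lines of Go. The function name targetTable is hypothetical, record metadata is modeled as a plain map of strings, and treating an empty metadata value the same as a missing one is an assumption of this example.

```go
package main

import "fmt"

// metadataTableKey is the metadata key named in the text above.
const metadataTableKey = "clickhouse.table"

// targetTable picks the table for a record: the metadata value if present,
// otherwise the table from the connector configuration.
// Assumption: an empty metadata value is treated as absent.
func targetTable(metadata map[string]string, configuredTable string) string {
	if table, ok := metadata[metadataTableKey]; ok && table != "" {
		return table
	}
	return configuredTable
}

func main() {
	withTable := map[string]string{metadataTableKey: "events"}
	fmt.Println(targetTable(withTable, "default_table"))
	fmt.Println(targetTable(map[string]string{}, "default_table"))
}
```

With this rule, records coming from several Source tables can be routed to matching Destination tables without extra configuration.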
name | description | required | example |
---|---|---|---|
url | DSN to connect to the database. | true | http://username:password@host1:8123/database |
table | Name of the table that the connector should write to. | true | table_name |
keyColumns | Comma-separated list of column names for key handling. | false | id,name |
sdk.rate.perSecond | Maximum times the Write function can be called per second (0 means no rate limit). | false | 200 |
sdk.rate.burst | Allow bursts of at most X writes (0 means that bursts are not allowed). | false | 10 |
If the opencdc.Record.Key is empty, it is built from the opencdc.Record.Payload data using the comma-separated keyColumns list (for update operations only).
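A minimal Go sketch of building a key from the payload might look like this. The payload is modeled as a plain map rather than the real opencdc types, keyFromPayload is a hypothetical name, and returning an error for a missing column is an assumption of this example.

```go
package main

import "fmt"

// keyFromPayload copies the keyColumns fields out of the payload to form a key.
// Assumption: a key column that is missing from the payload is an error.
func keyFromPayload(payload map[string]any, keyColumns []string) (map[string]any, error) {
	key := make(map[string]any, len(keyColumns))
	for _, col := range keyColumns {
		value, ok := payload[col]
		if !ok {
			return nil, fmt.Errorf("key column %q not found in payload", col)
		}
		key[col] = value
	}
	return key, nil
}

func main() {
	payload := map[string]any{"id": 7, "name": "alice", "age": 30}

	key, err := keyFromPayload(payload, []string{"id", "name"})
	fmt.Println(key, err)

	_, err = keyFromPayload(payload, []string{"email"})
	fmt.Println(err)
}
```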
Creating a Source or Destination connector will fail if the table does not exist or if the user does not have permission to work with the specified table.