The Vitess connector is one of Conduit plugins. It provides both, a source and a destination Vitess connector.
- Go 1.21
- MySQL versions 5.7 to 8.0 with binlog enabled
- Vitess v14.0
- Docker and docker-compose
- (optional) golangci-lint
Run make build
.
Run make test
to run all the unit and integration tests, which require Docker and Compose V2 to be installed and
running. The command will handle starting and stopping docker containers for you.
The Vitess Source Connector connects to a VTGate with the provided address
, keyspace
and tabletType
and starts
creating records for each change detected in a table.
Upon starting, the source takes a snapshot of a given table in the database, then switches into CDC mode. In CDC mode, the plugin reads events from a VStream. In order for this to work correctly, binlog must be enabled on MySQL instances you use with Vitess.
When the connector first starts, snapshot mode is enabled. The connector reads all rows of a table in batches via SELECT
with keyset pagination by orderingColumn
, limiting the rows by batchSize
and ordering them by the orderingColumn
. Once all rows in that initial snapshot are read the connector switches into
CDC mode.
This behavior is enabled by default, but can be turned off by adding "snapshot": false
to the Source configuration.
This connector implements CDC features for Vitess by connecting to a VStream that listens to changes in the configured table. Every detected change is converted into a record and returned in the call to Read
.
The connector in the CDC mode retrieves all the available shards from Vitess and tracks changes from all of them. If a reshard occurs, the connector will see the change and will listen for events from the new shards.
Warning Changes that happen during a snapshot are not captured.
name | description | required | default |
---|---|---|---|
address |
The address pointed to a VTGate instance. Format: hostname:port |
true | |
table |
The name of the table that the connector should read from. | true | |
keyspace |
The keyspace specifies a VTGate keyspace. | true | |
orderingColumn |
The name of a column that the connector will use for ordering rows. Its values must be unique and suitable for sorting, otherwise, the snapshot won't work correctly. | true | |
keyColumn |
Column name that records should use for their Key fields. |
false | primary key of a table or value of the orderingColumn |
username |
Username of a VTGate user. Required if your VTGate instance has a static authentication enabled. |
false | |
password |
Password of a VTGate user. Required if your VTGate instance has a static authentication enabled. |
false | |
tabletType |
Specifies a VTGate tablet type. | false | primary |
columns |
Comma separated list of column names that should be included in each Record's payload. If the field is not empty it must contain a value of the orderingColumn field. |
false | all columns |
batchSize |
Size of rows batch. Min is 1 and max is 100000 . |
false | 1000 |
maxRetries |
The number of reconnect retries the connector will make before giving up if a connection goes down. | false | 3 |
retryTimeout |
The time period that will be waited between retries. | false | 1 |
snapshot |
The field determines whether or not the connector will take a snapshot of the entire table before starting CDC mode. | false | true |
If the keyColumn
field is not provided, then the connector will attempt to look up the primary key column of the
table. If that can't be determined it will use a value of the orderingColumn
field.
If no column names are provided in the config, then the connector will assume that all columns in the table should be returned. If columns are changed (renamed, etc.), the connector will see those changes and use the updated columns.
Changing a table name during a connector update can cause quite unexpected results. Therefore, it's highly not recommended to do this.
The Vitess Destination takes a record.Record
and parses it into a valid SQL query. The Destination is designed to
handle different payloads and keys. Because of this, each record is individually parsed and upserted.
If a record contains a table
property in its metadata it will be inserted in that table, otherwise it will fall back
to use the table configured in the connector. Thus, a Destination can support multiple tables in a the same connector,
as long as the user has proper access to those tables.
If the target table already contains a record with the same key, the Destination will upsert with its current received values. Since the keys must be unique, this can lead to overwriting and potential data loss, so the keys must be correctly assigned from the Source.
In case if there is no key, the record will be simply appended.
name | description | required | default |
---|---|---|---|
address |
The address pointed to a VTGate instance. Format: hostname:port |
true | |
table |
The name of the table that the connector should write to. | true | |
keyColumn |
Column name used to detect if the target table already contains the record. | true | |
keyspace |
Specifies a VTGate keyspace. | true | |
username |
Username of a VTGate user. Required if your VTGate instance has a static authentication enabled. |
false | |
password |
Password of a VTGate user. Required if your VTGate instance has a static authentication enabled. |
false | |
tabletType |
Specifies a VTGate tablet type. | false | primary |
maxRetries |
The number of reconnect retries the connector will make before giving up if a connection goes down. | false | 3 |
retryTimeout |
The time period that will be waited between retries. | false | 1 |