title | summary |
---|---|
TiCDC CSV Protocol |
Learn the concept of TiCDC CSV Protocol and how to use it. |
When using a cloud storage service as the downstream sink, you can send DML events to the cloud storage service in CSV format.
The following is an example of the configuration when using the CSV protocol:
cdc cli changefeed create --server=http://127.0.0.1:8300 --changefeed-id="csv-test" --sink-uri="s3://bucket/prefix" --config changefeed.toml
The configuration in the changefeed.toml
file is as follows:
[sink]
protocol = "csv"
terminator = "\n"
[sink.csv]
delimiter = ',' # Before v7.6.0, you can only set the delimiter to a single character. Starting from v7.6.0, you can set it to 1-3 characters. For example, `$^` or `|@|`.
quote = '"'
null = '\N'
include-commit-ts = true
output-old-value = false
- In a single CSV file, the
commit-ts
of a row is equal to or smaller than that of the subsequent row. - The same transactions of a single table are stored in the same CSV file.
- Multiple tables of the same transaction can be stored in different CSV files.
For more information about the storage path structure of the data, see Storage path structure.
In the CSV file, each column is defined as follows:
- Column 1: The operation-type indicator, including
I
,U
, andD
.I
meansINSERT
,U
meansUPDATE
, andD
meansDELETE
. - Column 2: Table name.
- Column 3: Schema name.
- Column 4: The
commit-ts
of the source transaction. This column is optional. - Column 5: The
is-update
column only exists when the value ofoutput-old-value
is true, which is used to identify whether the row data change comes from the UPDATE event (the value of the column is true) or the INSERT/DELETE event (the value is false). - Column 6 to the last column: One or more columns with data changes.
Assume that table hr.employee
is defined as follows:
CREATE TABLE `employee` (
`Id` int NOT NULL,
`LastName` varchar(20) DEFAULT NULL,
`FirstName` varchar(30) DEFAULT NULL,
`HireDate` date DEFAULT NULL,
`OfficeLocation` varchar(20) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
When include-commit-ts = true
and output-old-value = false
, the DML events of this table are stored in the CSV format as follows:
"I","employee","hr",433305438660591626,101,"Smith","Bob","2014-06-04","New York"
"U","employee","hr",433305438660591627,101,"Smith","Bob","2015-10-08","Los Angeles"
"D","employee","hr",433305438660591629,101,"Smith","Bob","2017-03-13","Dallas"
"I","employee","hr",433305438660591630,102,"Alex","Alice","2017-03-14","Shanghai"
"U","employee","hr",433305438660591630,102,"Alex","Alice","2018-06-15","Beijing"
When include-commit-ts = true
and output-old-value = true
, the DML events of this table are stored in the CSV format as follows:
"I","employee","hr",433305438660591626,false,101,"Smith","Bob","2014-06-04","New York"
"D","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Shanghai"
"I","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Los Angeles"
"D","employee","hr",433305438660591629,false,101,"Smith","Bob","2017-03-13","Dallas"
"I","employee","hr",433305438660591630,false,102,"Alex","Alice","2017-03-14","Shanghai"
"D","employee","hr",433305438660591630,true,102,"Alex","Alice","2017-03-14","Beijing"
"I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"
MySQL type | CSV type | Example | Description |
---|---|---|---|
BOOLEAN /TINYINT /SMALLINT /INT /MEDIUMINT /BIGINT |
Integer | 123 |
- |
FLOAT /DOUBLE |
Float | 153.123 |
- |
NULL |
Null | \N |
- |
TIMESTAMP /DATETIME |
String | "1973-12-30 15:30:00.123456" |
Format: yyyy-MM-dd HH:mm:ss.%06d |
DATE |
String | "2000-01-01" |
Format: yyyy-MM-dd |
TIME |
String | "23:59:59" |
Format: yyyy-MM-dd |
YEAR |
Integer | 1970 |
- |
VARCHAR /JSON /TINYTEXT /MEDIUMTEXT /LONGTEXT /TEXT /CHAR |
String | "test" |
UTF-8 encoded |
VARBINARY /TINYBLOB /MEDIUMBLOB /LONGBLOB /BLOB /BINARY |
String | "6Zi/5pav" or "e998bfe696af" |
Base64 or hex encoded |
BIT |
Integer | 81 |
- |
DECIMAL |
String | "129012.1230000" |
- |
ENUM |
String | "a" |
- |
SET |
String | "a,b" |
- |