sql: COPY CSV doesn't support hex encoding for bytea #69640
Labels
A-sql-pgwire
pgwire protocol issues.
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
O-community
Originated from the community
T-sql-foundations
SQL Foundations Team (formerly SQL Schema + SQL Sessions)
X-blathers-triaged
blathers was able to find an owner
Describe the problem
Following up on #68804, CockroachDB does not support hex-encoded byte array data when attempting to `COPY FROM STDIN WITH CSV`. As part of #68804, octal/escape-encoded byte arrays were fixed, which provides a valid workaround, but escape encoding is very inefficient in the amount of data sent over the wire. This can greatly reduce ingest performance if bytea columns are frequent or large.

There was some discussion about how hex encoding isn't supported because the SQL layer does not support the PostgreSQL byte array literal syntax (#26128). I would like to point out that `IMPORT INTO CSV DATA` already supports the `\x` hex-encoded format for CSVs. Also, since `COPY` doesn't support the `x'abc'` or the `\xabc` syntax, it doesn't face the SQL layer's difficult problem of having to support both without breaking compatibility.

The fact that other areas of the system already support the `\x` syntax suggests that `COPY` could also be enhanced to do the necessary translation from `\x` to the format that the SQL layer supports.

To Reproduce
Only the rows that came from `IMPORT` are inserted with the correct bytea data.

Expected behavior
Hex encoded bytea data supported via COPY CSV.
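
As a rough illustration of the requested behavior, the sketch below decodes a single CSV bytea field in either the `\x` hex format or the escape format, following PostgreSQL's rule that a leading `\x` selects hex. It is written in Python for brevity, `decode_bytea_field` is a hypothetical helper, and none of this is CockroachDB's actual `COPY` code.

```python
def decode_bytea_field(field: str) -> bytes:
    """Decode one CSV bytea field given in hex or escape format.

    Hypothetical helper for illustration; not CockroachDB's implementation.
    """
    if field.startswith("\\x"):
        # Hex format: the prefix \x followed by two hex digits per byte.
        return bytes.fromhex(field[2:])

    # Escape format: \\ is a literal backslash, \nnn is an octal byte value,
    # and any other character stands for itself.
    out = bytearray()
    i = 0
    while i < len(field):
        if field[i] != "\\":
            out += field[i].encode("utf-8")
            i += 1
        elif field.startswith("\\\\", i):
            out.append(0x5C)
            i += 2
        else:
            digits = field[i + 1 : i + 4]
            if len(digits) != 3 or not all(d in "01234567" for d in digits):
                raise ValueError(f"invalid escape sequence at offset {i}")
            value = int(digits, 8)
            if value > 255:
                raise ValueError(f"octal escape out of range at offset {i}")
            out.append(value)
            i += 4
    return bytes(out)


# The same three bytes (0x00, 0xff, backslash) in both encodings.
hex_value = decode_bytea_field("\\x00ff5c")
escape_value = decode_bytea_field("\\000\\377\\\\")
print(hex_value == escape_value)
```

Because `\x` is not a valid sequence in the escape format, checking the prefix is enough to tell the two formats apart.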
Additional context
Unable to use existing dumps that have hex encoding. Reduced ingest performance due to less efficient octal encoding.
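
To put a rough number on the overhead, the sketch below encodes the same payload in both formats and compares the lengths. It assumes PostgreSQL's conventions (hex is `\x` plus two digits per byte; escape writes printable ASCII as-is, doubles backslashes, and uses `\nnn` octal for everything else) and ignores any additional CSV quoting. The function names are illustrative only.

```python
def encode_hex(data: bytes) -> str:
    # Hex format: fixed cost of two characters per byte plus the \x prefix.
    return "\\x" + data.hex()


def encode_escape(data: bytes) -> str:
    # Escape format: cost depends on the byte value.
    parts = []
    for b in data:
        if b == 0x5C:
            parts.append("\\\\")         # backslash is doubled: 2 characters
        elif 0x20 <= b <= 0x7E:
            parts.append(chr(b))         # printable ASCII: 1 character
        else:
            parts.append(f"\\{b:03o}")   # anything else: 4 characters
    return "".join(parts)


# One of every possible byte value, as a stand-in for binary data.
payload = bytes(range(256))
hex_len = len(encode_hex(payload))
escape_len = len(encode_escape(payload))
print(hex_len, escape_len)
```

For this payload the hex form is 514 characters and the escape form is 740. Data made up mostly of non-printable bytes approaches 4 characters per byte in escape format, versus a constant 2 in hex.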
Jira issue: CRDB-9700