
WRITE_TRUNCATE appending to table #2326

Closed
DannyLee12 opened this issue Sep 16, 2016 · 5 comments
Assignees
Labels
api: bigquery Issues related to the BigQuery API. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@DannyLee12

DannyLee12 commented Sep 16, 2016

Using the templates found here and running the following commands:

from datetime import datetime

job = client.load_table_from_storage(
    'load-from-storage' + datetime.now().strftime('%Y%m%d%H%M'),
    table, gsbucket)
job.skip_leading_rows = 0
job.writeDisposition = 'WRITE_TRUNCATE'
job.field_delimiter = ','
job.begin()

This piece of code appends to my table instead of overwriting it, as the docs say it should:

WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data.

I know this because if I use:

table.reload()
print(table.num_rows)

after the job, the row count has increased by 2 million, which is the size of the table.

I use a workaround as follows:

if disposition == 'WRITE_TRUNCATE':
    schema = table.schema
    while table.exists():
        table.delete()
    table = dataset.table(table_name, schema=schema)
    table.create()

This seems to work fine. The while loop is just me making certain that the table is deleted, though it also worked when I tested without it.
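The same recreate-before-load pattern can be exercised without touching BigQuery. A hedged sketch, using a hypothetical in-memory `FakeTable` (not the client library's class) to show the control flow:

```python
# Sketch of the delete-and-recreate workaround using a fake in-memory
# Table, so the control flow can be run without a BigQuery connection.
class FakeTable:
    """Hypothetical stand-in for the library's Table object."""
    def __init__(self, schema):
        self.schema = schema
        self._exists = True

    def exists(self):
        return self._exists

    def delete(self):
        self._exists = False

    def create(self):
        self._exists = True

table = FakeTable(schema=['name', 'rows'])
schema = table.schema
while table.exists():        # belt-and-braces check from the report
    table.delete()
table = FakeTable(schema=schema)
table.create()
print(table.exists())  # True -- a fresh, empty table with the same schema
```
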

Anyway, the issue is that WRITE_TRUNCATE isn't doing what it says it does in the docs.

@tseaver tseaver added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. api: bigquery Issues related to the BigQuery API. backend labels Sep 16, 2016
@tseaver tseaver self-assigned this Sep 16, 2016
@tseaver
Contributor

tseaver commented Sep 16, 2016

@DannyLee12 in #2327 I tried to reproduce this issue, creating a new system test which loads the same table twice, with write_disposition = 'WRITE_TRUNCATE' on the second run. It works as expected: the rows fetched from the table after the second run aren't doubled.

Can you figure out what differs between your case and that new test?

@DannyLee12
Author

@tseaver I see you are using job.write_disposition while I have been using job.writeDisposition. Switching to the snake_case name fixed the issue; thanks for the test.

This is a quote from these documents:

Schema update options are supported in two cases: when writeDisposition is WRITE_APPEND; when writeDisposition is WRITE_TRUNCATE and the destination table is a partition of a table, specified by partition decorators. For normal tables, WRITE_TRUNCATE will always overwrite the schema.

Emphasis mine. Is it possible to update the docs referenced? Thanks again.

@daspecster
Contributor

The documentation you referenced is more general, largely language-agnostic documentation describing the architecture of the service.

I think this might be more helpful for you.

@tseaver
Contributor

tseaver commented Sep 19, 2016

@DannyLee12 those docs describe the field names the back-end requires in its JSON payloads. The property names we expose in google-cloud-python are all PEP 8-conformant, so we translate the camelCasedNames the API uses into names_with_underscores. As @daspecster notes, you need to look at this library's docs to see how things are spelled, and use the back-end docs only for concepts.
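A minimal sketch of both points, using a hypothetical LoadJob stand-in (not the real google-cloud-python class): the snake_case property names map mechanically to the API's camelCase keys, and a misspelled camelCase assignment on a plain Python object is silently accepted instead of raising an error, which is why the original typo went unnoticed:

```python
def snake_to_camel(name):
    """Map a names_with_underscores property to the camelCasedName
    the BigQuery JSON payload expects."""
    head, *rest = name.split('_')
    return head + ''.join(part.capitalize() for part in rest)

class LoadJob:
    """Hypothetical stand-in for the client library's job object."""
    def __init__(self):
        self.write_disposition = None  # the real, PEP 8-style property

job = LoadJob()
job.writeDisposition = 'WRITE_TRUNCATE'  # typo: silently creates an unused attribute

print(snake_to_camel('write_disposition'))  # writeDisposition
print(job.write_disposition)                # None -- the real setting was never changed
```
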

FWIW: @jonparrott and his team are working on exposing correct snippets for each API's language wrappers in the back-end docs. Those snippets will be tested, and will therefore match this library.

@DannyLee12
Author

@tseaver @daspecster Thanks guys, as with everything, it's obvious once you know. Thanks again, really appreciate the assistance.
Cheers.
