Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug/ch155055/emily mo inconsistent to carto error #1734

Merged

Conversation

Mmoncadaisla
Copy link
Contributor

Context

In this small PR from Support we aim to avoid hitting the Batch API 16 kb payload limit when using the _truncate_and_drop_add_columns function, which is called from to_carto (through copy_from) with if_exists='replace' collision strategy if the table structure changes used and the table has a considerable amount of columns.

Further context can be found here: https://app.clubhouse.io/cartoteam/story/155055/emily-mo-inconsistent-to-carto-error

Proposed solution

  1. Create a temporary PostgreSQL user defined function through a standard SQL API call with the required changes
  2. Apply this function in the transaction (after the TRUNCATE operation and before the cartodbfication process)
  3. Drop the temporary function

Example output with the current logic

CartoException: ['Your payload is too large: 18745 bytes. Max size allowed is 16384 bytes (16kb). Are you trying to import data?. Please, check out import api http://docs.cartodb.com/cartodb-platform/import-api/']

Example output with the changes implemented

Success! Data uploaded to table "test_to_carto_truncate_drop_add_" correctly
'test_to_carto_truncate_drop_add_'

We have also verified that the temporary function is dropped as expected.

PR changes

Relevant

  • io/managers/context_manager.py where the logic referred above is implemented. Add helper functions for query definition, add functions to create and drop UDFs and perform necessary changes in the _truncate_and_drop_add_columns function

Minor

  • utils/utils.py add a function create_tmp_name to create temporary-like names using uuid
  • /data/services/service.py adapt the _new_temporary_table_name method to make use of the create_tmp_name function

@shortcut-integration
Copy link

This pull request has been linked to Clubhouse Story #155055: [emily-mo] Inconsistent to_carto error.

@Mmoncadaisla Mmoncadaisla self-assigned this Jun 7, 2021
@Mmoncadaisla Mmoncadaisla requested a review from Jesus89 June 7, 2021 17:18
@Jesus89 Jesus89 requested a review from jgoizueta June 9, 2021 10:39
Copy link
Member

@Jesus89 Jesus89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job. It is a very ingenious solution :)

However, I would check the payload size before going that way because it requires two extra queries. So for every normal situation, it would work the same (with no extra requests) but then the len(query) > X this system can be done.

Regarding the X, the length of the query and the payload of the request are very similar. but instead of using "16384" (which is the limit of the Batch API) we could use 12000 just to make sure there are no false negatives.

@Mmoncadaisla Mmoncadaisla requested a review from Jesus89 June 9, 2021 13:54
Copy link
Contributor

@jgoizueta jgoizueta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice job!

Copy link
Member

@Jesus89 Jesus89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I have added a comment to improve readability

@jgoizueta jgoizueta assigned jgoizueta and unassigned Mmoncadaisla Jun 9, 2021
@Mmoncadaisla
Copy link
Contributor Author

Performed different Q&A tests on both an enterprise and a free plan accounts and it seems to be working as expected. Thank you a lot for the great CR @Jesus89 @jgoizueta ❤️

@Mmoncadaisla Mmoncadaisla merged commit a55e248 into develop Jun 10, 2021
@Jesus89 Jesus89 deleted the bug/ch155055/emily-mo-inconsistent-to-carto-error branch June 10, 2021 12:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants