Always set destination table in BigQuery query config in Feast Batch Serving so it can handle large results #392

davidheryanto
Collaborator

This pull request updates the BigQuery query configuration in Feast Batch Serving so that query results are always written to an explicit destination table.

BigQuery has a default maximum response size of 10 GB compressed when query results are written to a temporary table managed by BigQuery. Providing an explicit destination table avoids this limit.

These destination tables are only intermediate tables that Feast Batch Serving uses to build the final feature output, so by default Feast sets them to expire after 1 day (BigQuery deletes them automatically when they expire).
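For readers who want to see the shape of the change, here is a minimal sketch of it expressed as BigQuery REST request bodies, not the actual Feast Serving code (which is Java). The field names `destinationTable`, `useLegacySql` and `expirationTime` come from the BigQuery v2 API; the function names, the `feast_tmp_` table prefix and the example project and dataset values are hypothetical and only serve the illustration.

```python
import time
import uuid

ONE_DAY_MS = 24 * 60 * 60 * 1000  # 1 day, the default expiry described above


def build_query_job_body(sql, project_id, dataset_id, table_id=None):
    """Build a jobs.insert request body whose results go to an explicit table."""
    if table_id is None:
        # Hypothetical naming scheme for the intermediate table.
        table_id = "feast_tmp_" + uuid.uuid4().hex
    return {
        "configuration": {
            "query": {
                "query": sql,
                "useLegacySql": False,  # standard SQL, as stated in this PR
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
            }
        }
    }


def build_expiration_patch(now_ms=None, ttl_ms=ONE_DAY_MS):
    """Build a tables.patch body that makes the table expire after ttl_ms."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # The API represents expirationTime as epoch milliseconds in a string.
    return {"expirationTime": str(now_ms + ttl_ms)}


if __name__ == "__main__":
    body = build_query_job_body("SELECT 1", "my-project", "my_dataset")
    print(body["configuration"]["query"]["destinationTable"])
    print(build_expiration_patch())
```

Setting the expiry on each intermediate table, rather than relying on manual cleanup, lets BigQuery delete the tables on its own once the final output has been produced.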

@feast-ci-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: davidheryanto

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@feast-dev feast-dev deleted a comment from feast-ci-bot Dec 25, 2019
@woop
Member

woop commented Dec 25, 2019

Are you sure you shouldn't configure the allow large results flag in the job options?

https://developers.google.com/resources/api-libraries/documentation/bigquery/v2/java/latest/com/google/api/services/bigquery/model/JobConfigurationQuery.html#setAllowLargeResults-java.lang.Boolean-

Output rows may not have the same order as requested entity rows
@davidheryanto
Collaborator Author

davidheryanto commented Dec 25, 2019

Are you sure you shouldn't configure the allow large results flag in the job options?

We only need to set the allow large results flag when using legacy SQL, which we do not use.
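To make the distinction concrete, here is a small sketch of how the query options could differ between the two dialects. It assumes, per the BigQuery v2 API documentation, that `allowLargeResults` only has an effect for legacy SQL and that a destination table is needed in both cases for large results; the helper name `build_query_options` is hypothetical.

```python
def build_query_options(sql, destination_table, use_legacy_sql=False):
    """Build the `query` section of a BigQuery job configuration."""
    options = {
        "query": sql,
        "useLegacySql": use_legacy_sql,
        # An explicit destination table is set for both dialects.
        "destinationTable": destination_table,
    }
    if use_legacy_sql:
        # Only legacy SQL needs the flag; it is left out for standard SQL.
        options["allowLargeResults"] = True
    return options


if __name__ == "__main__":
    table = {"projectId": "my-project", "datasetId": "my_dataset", "tableId": "t"}
    print(build_query_options("SELECT 1", table))
    print(build_query_options("SELECT 1", table, use_legacy_sql=True))
```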

@woop
Member

woop commented Dec 26, 2019

/lgtm

@feast-ci-bot feast-ci-bot merged commit 2fcddaa into feast-dev:v0.3-branch Dec 26, 2019
zhilingc pushed a commit that referenced this pull request Dec 27, 2019
feast-ci-bot pushed a commit that referenced this pull request Dec 27, 2019
* Implement project namespacing (without auth)
* Update Protos, Java SDK, Golang SDK to support namespacing
* Fixed Python SDK to support project namespacing protos
* Add integration with projects, update code to be compliant with new protos
* Move name, version and project back to spec
* Update Feast Core and Feast Ingestion to support project namespacing
* Update Core and Ingestion based on refactored FeatureSet proto
* Remove entity dataset validation
* Register feature sets first to speed up tests

* Apply PR #392

* Apply spotless

* Order test output

Co-authored-by: Chen Zhiling <chnzhlng@gmail.com>