feat: Use string as a substitute for unregistered types during schema inference #3646
… inference Signed-off-by: phil.park <bakjeeone@hotmail.com>
/assign @zhilingc
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: felixwang9817, phil-park
# [0.32.0](v0.31.0...v0.32.0) (2023-07-17)

### Bug Fixes

* Added generic Feature store Creation for CLI ([#3618](#3618)) ([bf740d2](bf740d2))
* Broken non-root path with projects-list.json ([#3665](#3665)) ([4861af0](4861af0))
* Clean up snowflake to_spark_df() ([#3607](#3607)) ([e8e643e](e8e643e))
* Entityless fv breaks with `KeyError: __dummy` applying feature_store.plan() on python ([#3640](#3640)) ([ef4ef32](ef4ef32))
* Fix scan datasize to 0 for inference schema ([#3628](#3628)) ([c3dd74e](c3dd74e))
* Fix timestamp consistency in push api ([#3614](#3614)) ([9b227d7](9b227d7))
* For SQL registry, increase max data_source_name length to 255 ([#3630](#3630)) ([478caec](478caec))
* Implements connection pool for postgres online store ([#3633](#3633)) ([059509a](059509a))
* Manage redis pipe's context ([#3655](#3655)) ([48e0971](48e0971))
* Missing Catalog argument in athena connector ([#3661](#3661)) ([f6d3caf](f6d3caf))
* Optimize bytes processed when retrieving entity df schema to 0 ([#3680](#3680)) ([1c01035](1c01035))

### Features

* Add gunicorn for serve with multiprocess ([#3636](#3636)) ([4de7faf](4de7faf))
* Use string as a substitute for unregistered types during schema inference ([#3646](#3646)) ([c474ccd](c474ccd))
* ci: Add bigtable cleanup script Signed-off-by: Danny C <d.chiao@gmail.com> * fix: Missing Catalog argument in athena connector (feast-dev#3661) update Catalog argument in athena connector Signed-off-by: Gyumin Lee <t1100394@T1100394PM01.local> Co-authored-by: Gyumin Lee <t1100394@T1100394PM01.local> * ci: Disable flaky lambda materialization test Signed-off-by: Danny C <d.chiao@gmail.com> * fix: Broken non-root path with projects-list.json (feast-dev#3665) ensure correct precedence with the two operators Signed-off-by: Ben Fletcher <ben.fletcher@ft.com> * fix: Manage redis pipe's context (feast-dev#3655) Signed-off-by: Jiwon Park <bakjeeone@hotmail.com> * chore: Bump tough-cookie from 4.0.0 to 4.1.3 in /sdk/python/feast/ui (feast-dev#3677) Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.0.0 to 4.1.3. - [Release notes](https://github.com/salesforce/tough-cookie/releases) - [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md) - [Commits](salesforce/tough-cookie@v4.0.0...v4.1.3) --- updated-dependencies: - dependency-name: tough-cookie dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore: Bump tough-cookie from 4.0.0 to 4.1.3 in /ui (feast-dev#3676) Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.0.0 to 4.1.3. - [Release notes](https://github.com/salesforce/tough-cookie/releases) - [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md) - [Commits](salesforce/tough-cookie@v4.0.0...v4.1.3) --- updated-dependencies: - dependency-name: tough-cookie dependency-type: indirect ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * fix: For SQL registry, increase max data_source_name length to 255 (feast-dev#3630) * sql.py data_sources.data_source_name String(255) Extend the limit of the data_source_name field from 50 to 255. Signed-off-by: Ross Donnachie <code@radonn.co.za> * fix: Optimize bytes processed when retrieving entity df schema to 0 (feast-dev#3680) feat: Optimize bytes processed when retrieving entity df schema to 0 Signed-off-by: Hai Nguyen <quanghai.ng1512@gmail.com> * fix: Entityless fv breaks with `KeyError: __dummy` applying feature_store.plan() on python (feast-dev#3640) * fix! KeyError: __dummy on entityless fv Signed-off-by: williamfoschiera <william.foschiera@buser.com.br> * fix! join_keys typing. Signed-off-by: williamfoschiera <william.foschiera@buser.com.br> --------- Signed-off-by: williamfoschiera <william.foschiera@buser.com.br> Co-authored-by: williamfoschiera <william.foschiera@buser.com.br> * chore: Bump protobufjs from 7.1.1 to 7.2.4 in /ui (feast-dev#3674) Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.1.1 to 7.2.4. - [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md) - [Commits](protobufjs/protobuf.js@protobufjs-v7.1.1...protobufjs-v7.2.4) --- updated-dependencies: - dependency-name: protobufjs dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore: Bump protobufjs from 7.1.2 to 7.2.4 in /sdk/python/feast/ui (feast-dev#3675) Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.1.2 to 7.2.4. 
- [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md) - [Commits](protobufjs/protobuf.js@protobufjs-v7.1.2...protobufjs-v7.2.4) --- updated-dependencies: - dependency-name: protobufjs dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore: Bump semver from 6.3.0 to 6.3.1 in /ui (feast-dev#3678) Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1. - [Release notes](https://github.com/npm/node-semver/releases) - [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md) - [Commits](npm/node-semver@v6.3.0...v6.3.1) --- updated-dependencies: - dependency-name: semver dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore: Bump semver from 6.3.0 to 6.3.1 in /sdk/python/feast/ui (feast-dev#3679) Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1. - [Release notes](https://github.com/npm/node-semver/releases) - [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md) - [Commits](npm/node-semver@v6.3.0...v6.3.1) --- updated-dependencies: - dependency-name: semver dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore: Bump google.golang.org/grpc from 1.47.0 to 1.53.0 (feast-dev#3670) Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.47.0 to 1.53.0. - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](grpc/grpc-go@v1.47.0...v1.53.0) --- updated-dependencies: - dependency-name: google.golang.org/grpc dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(release): release 0.32.0 # [0.32.0](feast-dev/feast@v0.31.0...v0.32.0) (2023-07-17) ### Bug Fixes * Added generic Feature store Creation for CLI ([feast-dev#3618](feast-dev#3618)) ([bf740d2](feast-dev@bf740d2)) * Broken non-root path with projects-list.json ([feast-dev#3665](feast-dev#3665)) ([4861af0](feast-dev@4861af0)) * Clean up snowflake to_spark_df() ([feast-dev#3607](feast-dev#3607)) ([e8e643e](feast-dev@e8e643e)) * Entityless fv breaks with `KeyError: __dummy` applying feature_store.plan() on python ([feast-dev#3640](feast-dev#3640)) ([ef4ef32](feast-dev@ef4ef32)) * Fix scan datasize to 0 for inference schema ([feast-dev#3628](feast-dev#3628)) ([c3dd74e](feast-dev@c3dd74e)) * Fix timestamp consistency in push api ([feast-dev#3614](feast-dev#3614)) ([9b227d7](feast-dev@9b227d7)) * For SQL registry, increase max data_source_name length to 255 ([feast-dev#3630](feast-dev#3630)) ([478caec](feast-dev@478caec)) * Implements connection pool for postgres online store ([feast-dev#3633](feast-dev#3633)) ([059509a](feast-dev@059509a)) * Manage redis pipe's context ([feast-dev#3655](feast-dev#3655)) ([48e0971](feast-dev@48e0971)) * Missing Catalog argument in athena connector ([feast-dev#3661](feast-dev#3661)) ([f6d3caf](feast-dev@f6d3caf)) * Optimize bytes processed when retrieving entity df schema to 0 ([feast-dev#3680](feast-dev#3680)) ([1c01035](feast-dev@1c01035)) ### Features * Add gunicorn for serve with multiprocess ([feast-dev#3636](feast-dev#3636)) ([4de7faf](feast-dev@4de7faf)) * Use string as a substitute for unregistered types during schema inference ([feast-dev#3646](feast-dev#3646)) ([c474ccd](feast-dev@c474ccd)) * fix: Redshift push ignores schema (feast-dev#3671) * Add fully-qualified-table-name Redshift prop Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * pre-commit Signed-off-by: Robin 
Neufeld <metavee@users.noreply.github.com> * Docstring Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Test fully_qualified_table_name Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Simplify logic Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * pre-commit Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * pre-commit Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Test offline_write_batch Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Bump to trigger CI Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * another bump for ci Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> --------- Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * fix: Add aws-sts dependency in java sdk so that S3 client acquires IRSA role (feast-dev#3696) Add aws-sts dependency in java sdk Signed-off-by: harmeet-singh-discovery <harmeet_singh@discovery.com> * Adding initial update changes * Added formatting changes * Revert "Merge branch 'feast-dev:master' into msudhir/add-vector-update-functionality" This reverts commit 8487678, reversing changes made to 0578b9b. 
* Added more tests and functionality * updating tests * updated functionality and added more tests * correcting a test case * Making formatting corrections and changeing log * Improved tests and added functionality to convert feast schema to milvus readable schema * Added PR Review comments * Fixed failing test --------- Signed-off-by: Danny C <d.chiao@gmail.com> Signed-off-by: Gyumin Lee <t1100394@T1100394PM01.local> Signed-off-by: Ben Fletcher <ben.fletcher@ft.com> Signed-off-by: Jiwon Park <bakjeeone@hotmail.com> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Ross Donnachie <code@radonn.co.za> Signed-off-by: Hai Nguyen <quanghai.ng1512@gmail.com> Signed-off-by: williamfoschiera <william.foschiera@buser.com.br> Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> Signed-off-by: harmeet-singh-discovery <harmeet_singh@discovery.com> Co-authored-by: Danny C <d.chiao@gmail.com> Co-authored-by: 이규민 <32768535+GyuminJack@users.noreply.github.com> Co-authored-by: Gyumin Lee <t1100394@T1100394PM01.local> Co-authored-by: Ben Fletcher <bjfletcher@gmail.com> Co-authored-by: Jiwon Park <bakjeeone@hotmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Ross Donnachie <code@radonn.co.za> Co-authored-by: Harry <quanghai.ng1512@gmail.com> Co-authored-by: William Foschiera <wfoschiera@gmail.com> Co-authored-by: williamfoschiera <william.foschiera@buser.com.br> Co-authored-by: feast-ci-bot <feast-ci-bot@willem.co> Co-authored-by: Robin Neufeld <metavee@users.noreply.github.com> Co-authored-by: harmeet-singh-discovery <95894926+harmeet-singh-discovery@users.noreply.github.com> Co-authored-by: Manisha Sudhir <msudhir@expediagroup.com>
… inference (feast-dev#3646) Signed-off-by: phil.park <bakjeeone@hotmail.com> Signed-off-by: zerafachris PERSONAL <zerafachris@gmail.com>
What this PR does / why we need it:
Some column types exist in BigQuery but not yet in Feast's type map (e.g. DATE). Most of these can be used as strings.
Currently, attempting schema inference on such a column fails with an error saying that there is no such type.
To make schema inference more widely usable, it seems reasonable to infer these columns as string type.
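
The fallback described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual diff in this PR: the `ValueType` enum, the `BQ_TYPE_MAP` dictionary, and the `infer_value_type` function are hypothetical stand-ins, and the map covers only a small illustrative subset of BigQuery types.

```python
from enum import Enum


class ValueType(Enum):
    """Hypothetical stand-in for a feature store's value-type enum."""
    STRING = 1
    INT64 = 2
    DOUBLE = 3
    BOOL = 4
    UNIX_TIMESTAMP = 5


# Illustrative subset of a BigQuery-type-name -> ValueType map (assumed, not exhaustive).
BQ_TYPE_MAP = {
    "STRING": ValueType.STRING,
    "INT64": ValueType.INT64,
    "INTEGER": ValueType.INT64,
    "FLOAT64": ValueType.DOUBLE,
    "FLOAT": ValueType.DOUBLE,
    "BOOL": ValueType.BOOL,
    "BOOLEAN": ValueType.BOOL,
    "TIMESTAMP": ValueType.UNIX_TIMESTAMP,
}


def infer_value_type(bq_type: str) -> ValueType:
    """Map a BigQuery type name to a value type.

    Types missing from the map (e.g. DATE) fall back to STRING
    instead of raising an error.
    """
    return BQ_TYPE_MAP.get(bq_type.upper(), ValueType.STRING)


if __name__ == "__main__":
    for name in ("INT64", "DATE", "TIMESTAMP"):
        print(name, "->", infer_value_type(name).name)
```

The trade-off of a string fallback is that inference no longer fails on unregistered types, but the resulting column loses its native semantics (a DATE becomes text), so consumers may need to parse it themselves.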
Which issue(s) this PR fixes:
Fixes #