Releases: feathr-ai/feathr
v1.0.0
We are excited to announce that Feathr 1.0.0 has been released. For details, see https://feathr-ai.github.io/feathr/release-announcements/v1.0.0.html
What's Changed
- Clean up some links that's referring to LinkedIn by @xiaoyongzhu in #872
- Insert test coverage check for python client into github pipeline by @enya-yx in #862
- Create docs on how to update Feathr client and registry, and how to pass credentials by @xiaoyongzhu in #818
- Add a custom pytest marker config to pyproject.toml by @loomlike in #786
- fix broken link by @xiaoyongzhu in #874
- Separate out snowflake source by @aabbasi-hbo in #836
- Decouple build feature code by @xiaoyongzhu in #838
- Refining example, add utilities, and fix xdist test error by @loomlike in #794
- Add 'format' arg to get_result_df by @loomlike in #885
- Fix test failure by @jaymo001 in #904
- UI add feature the deletion for projects/features/dataSource by @Fendoe in #909
- Fixing Bugs reported during oncall by @jainr in #908
- Ignore 'registry_utils' in test coverage by @enya-yx in #907
- Update azure_resource_provision.json by @xiaoyongzhu in #912
- Exclude pegasus jars and release version by @rakeshkashyap123 in #913
- Exclude pegasus data files explicitly by @rakeshkashyap123 in #916
- Fix auto-tz casting bug by @bozhonghu in #905
- Support printing features and returning keys when getting features from registry by @enya-yx in #886
- Adding Continuous deployment == ON flag to webapp settings by @jainr in #927
- Add use_env_var flag to client by @loomlike in #923
- Format docs and add tech talks by @xiaoyongzhu in #931
- Add is_synapse() by @loomlike in #929
- publish fat jar in maven update action by @Yuqing-cat in #935
- Add Spark Configuration Not Applied content by @Yuqing-cat in #928
- Support get online features by composite keys by @enya-yx in #919
- React best practice implementations in ui code by @blrchen in #938
- fix numpy version conflict w/ pyspark by @loomlike in #940
- Add presentation for Feathr community talk by @xiaoyongzhu in #941
- Add FIRST aggregation for look up feature by @jaymo001 in #917
- Disable auth if Feathr is deployed with RBAC set to off by @Fendoe in #925
- Update sample notebooks (fraud detection and recommendation examples) to use latest feathr api by @loomlike in #921
- Windoze/spark sql source by @windoze in #839
- Updates on docs to remove out-of-date contents by @blrchen in #950
- Lock python deps version for registry projects by @blrchen in #946
- Add pytest cases and check test coverage for sql-registry and purview… by @enya-yx in #937
- Fix delete entity function bug for #852 by @Yuqing-cat in #952
- Remove hadoop dependency by @rakeshkashyap123 in #949
- Create new feature form by @Fendoe in #936
- set purview name environment variable in workflow by @enya-yx in #956
- Bug fix - Fraud detection sample notebook chart error. by @loomlike in #948
- Update registry start script to support REACT_APP_ENABLE_RBAC==false cases by @Yuqing-cat in #954
- Improve error message for checking feature keys by @enya-yx in #959
- Changed !pip to %pip by @ahlag in #963
- Fix registry to support Source with types other than HDFS by @windoze in #965
- Create Datasource UI and update create feature form by @Fendoe in #969
- Bump version to 0.10.4-rc1 for rc bug bash by @Yuqing-cat in #971
- Registry client support SparkSqlSource by @windoze in #970
- [BUG] zsh: no matches found: feathr[notebook] during pip install feathr[notebook] by @ahlag in #961
- Add a Sandbox for Feathr by @xiaoyongzhu in #966
- Add Bacpac creation doc by @xiaoyongzhu in #957
- Print out logs for local spark if the job is failed by @xiaoyongzhu in #973
- Fix maven publish workflow always skip Gradle publish by @Yuqing-cat in #974
- Lock python deps versions in sql registry by @blrchen in #977
- Updating automated maven publish workflow to work with gradle creds by @jainr in #979
- Fixing Maven Automated Publish by @jainr in #983
- Fix config generation by @windoze in #986
- Fix environment variable read issues and potential bugs by @xiaoyongzhu in #980
- Add docs for Sandbox by @xiaoyongzhu in #981
- Fix local submission issues. by @xiaoyongzhu in #988
- Fix Dimension Type field is required by @Fendoe in #989
- Publish Fat Jar to Blob Storage Steps Decouple and Fix by @Yuqing-cat in #990
- Fix duplicated source check by @windoze in #995
- Adding azure-cosmos-package to databricks spark config by @jainr in #972
- Enable docker workflow to be triggered manually. by @windoze in #997
- NYC sample notebook feature re-name by @loomlike in #991
- Add docs about registry server test and label by @enya-yx in #993
- Revert "Adding azure-cosmos-package to databricks spark config" by @blrchen in #1002
- Update fraud detection sample notebook by @loomlike in #984
- Add additional maven repo url to databricks submission param by @Yuqing-cat in #1001
- Hard code json-schema dependency with comment as hot fix by @Yuqing-cat in #1004
- Remove use_env_vars arg from FeathrCient by @loomlike in #999
- Resolve pyspark / numpy conflicts by @loomlike in #992
- Fix Hadoop S3 Key settings by @aabbasi-hbo in #1007
- Add retry when data loading fails by @jaymo001 in #930
- Fixing documentation error in Synapse sample notebook by @jainr in #1014
- Adding CosmosDb as supported Online Store from deployment template by @jainr in #939
- Adding documentation for azure.cosmos.spark package and managing spark libraries by @jainr in #1013
- Adding dummy entries to feathr_config.yaml file for additional library entries by @jainr in #1012
- Adding guidance for picking optimal SKU for Azure SQL DB by @jainr in #1015
- Refactor config helper to be cleaner by @loomlike in #1011
- Support external/alien MVEL UDF by @jaymo001 in #1018
- Updating various sections of documentation based on customer feedback. by @jainr in #1016
- Revert "Add retry when data loading fails (#930)" by @jaymo001 in #1021
- Skip features optionally if feature data is missing by @rakeshkashyap123 in #1019
- Fix bug when SWA feature datapath does not end in daily/hourly by @rakeshkashyap123 in #1027
- Skip Anchored and derived features if the feature data is unavailable by @rakeshkashyap123 in #1026
- Upgrade log4j from 1.x to 2.x by @Yuqing-cat in #1031
- Add optional retry mechanism if data loading fails by @rakeshkashyap123 in #1034
- Auto Delete Empty Anchors and Sources in SQL Registry...
v1.0.0-rc4
What's Changed
- Clean up some links that's referring to LinkedIn by @xiaoyongzhu in #872
- Insert test coverage check for python client into github pipeline by @enya-yx in #862
- Create docs on how to update Feathr client and registry, and how to pass credentials by @xiaoyongzhu in #818
- Add a custom pytest marker config to pyproject.toml by @loomlike in #786
- fix broken link by @xiaoyongzhu in #874
- Separate out snowflake source by @aabbasi-hbo in #836
- Decouple build feature code by @xiaoyongzhu in #838
- Refining example, add utilities, and fix xdist test error by @loomlike in #794
- Add 'format' arg to get_result_df by @loomlike in #885
- Fix test failure by @jaymo001 in #904
- UI add feature the deletion for projects/features/dataSource by @Fendoe in #909
- Fixing Bugs reported during oncall by @jainr in #908
- Ignore 'registry_utils' in test coverage by @enya-yx in #907
- Update azure_resource_provision.json by @xiaoyongzhu in #912
- Exclude pegasus jars and release version by @rakeshkashyap123 in #913
- Exclude pegasus data files explicitly by @rakeshkashyap123 in #916
- Fix auto-tz casting bug by @bozhonghu in #905
- Support printing features and returning keys when getting features from registry by @enya-yx in #886
- Adding Continuous deployment == ON flag to webapp settings by @jainr in #927
- Add use_env_var flag to client by @loomlike in #923
- Format docs and add tech talks by @xiaoyongzhu in #931
- Add is_synapse() by @loomlike in #929
- publish fat jar in maven update action by @Yuqing-cat in #935
- Add Spark Configuration Not Applied content by @Yuqing-cat in #928
- Support get online features by composite keys by @enya-yx in #919
- React best practice implementations in ui code by @blrchen in #938
- fix numpy version conflict w/ pyspark by @loomlike in #940
- Add presentation for Feathr community talk by @xiaoyongzhu in #941
- Add FIRST aggregation for look up feature by @jaymo001 in #917
- Disable auth if Feathr is deployed with RBAC set to off by @Fendoe in #925
- Update sample notebooks (fraud detection and recommendation examples) to use latest feathr api by @loomlike in #921
- Windoze/spark sql source by @windoze in #839
- Updates on docs to remove out-of-date contents by @blrchen in #950
- Lock python deps version for registry projects by @blrchen in #946
- Add pytest cases and check test coverage for sql-registry and purview… by @enya-yx in #937
- Fix delete entity function bug for #852 by @Yuqing-cat in #952
- Remove hadoop dependency by @rakeshkashyap123 in #949
- Create new feature form by @Fendoe in #936
- set purview name environment variable in workflow by @enya-yx in #956
- Bug fix - Fraud detection sample notebook chart error. by @loomlike in #948
- Update registry start script to support REACT_APP_ENABLE_RBAC==false cases by @Yuqing-cat in #954
- Improve error message for checking feature keys by @enya-yx in #959
- Changed !pip to %pip by @ahlag in #963
- Fix registry to support Source with types other than HDFS by @windoze in #965
- Create Datasource UI and update create feature form by @Fendoe in #969
- Bump version to 0.10.4-rc1 for rc bug bash by @Yuqing-cat in #971
- Registry client support SparkSqlSource by @windoze in #970
- [BUG] zsh: no matches found: feathr[notebook] during pip install feathr[notebook] by @ahlag in #961
- Add a Sandbox for Feathr by @xiaoyongzhu in #966
- Add Bacpac creation doc by @xiaoyongzhu in #957
- Print out logs for local spark if the job is failed by @xiaoyongzhu in #973
- Fix maven publish workflow always skip Gradle publish by @Yuqing-cat in #974
- Lock python deps versions in sql registry by @blrchen in #977
- Updating automated maven publish workflow to work with gradle creds by @jainr in #979
- Fixing Maven Automated Publish by @jainr in #983
- Fix config generation by @windoze in #986
- Fix environment variable read issues and potential bugs by @xiaoyongzhu in #980
- Add docs for Sandbox by @xiaoyongzhu in #981
- Fix local submission issues. by @xiaoyongzhu in #988
- Fix Dimension Type field is required by @Fendoe in #989
- Publish Fat Jar to Blob Storage Steps Decouple and Fix by @Yuqing-cat in #990
- Fix duplicated source check by @windoze in #995
- Adding azure-cosmos-package to databricks spark config by @jainr in #972
- Enable docker workflow to be triggered manually. by @windoze in #997
- NYC sample notebook feature re-name by @loomlike in #991
- Add docs about registry server test and label by @enya-yx in #993
- Revert "Adding azure-cosmos-package to databricks spark config" by @blrchen in #1002
- Update fraud detection sample notebook by @loomlike in #984
- Add additional maven repo url to databricks submission param by @Yuqing-cat in #1001
- Hard code json-schema dependency with comment as hot fix by @Yuqing-cat in #1004
- Remove use_env_vars arg from FeathrCient by @loomlike in #999
- Resolve pyspark / numpy conflicts by @loomlike in #992
- Fix Hadoop S3 Key settings by @aabbasi-hbo in #1007
- Add retry when data loading fails by @jaymo001 in #930
- Fixing documentation error in Synapse sample notebook by @jainr in #1014
- Adding CosmosDb as supported Online Store from deployment template by @jainr in #939
- Adding documentation for azure.cosmos.spark package and managing spark libraries by @jainr in #1013
- Adding dummy entries to feathr_config.yaml file for additional library entries by @jainr in #1012
- Adding guidance for picking optimal SKU for Azure SQL DB by @jainr in #1015
- Refactor config helper to be cleaner by @loomlike in #1011
- Support external/alien MVEL UDF by @jaymo001 in #1018
- Updating various sections of documentation based on customer feedback. by @jainr in #1016
- Revert "Add retry when data loading fails (#930)" by @jaymo001 in #1021
- Skip features optionally if feature data is missing by @rakeshkashyap123 in #1019
- Fix bug when SWA feature datapath does not end in daily/hourly by @rakeshkashyap123 in #1027
- Skip Anchored and derived features if the feature data is unavailable by @rakeshkashyap123 in #1026
- Upgrade log4j from 1.x to 2.x by @Yuqing-cat in #1031
- Add optional retry mechanism if data loading fails by @rakeshkashyap123 in #1034
- Auto Delete Empty Anchors and Sources in SQL Registry by @Yuqing-cat in #1023
- Exclude test_feathr_materialize_to_cosmosdb from ci Databricks test by @blrchen in ...
v0.10.4-rc3
What's Changed
- Clean up some links that's referring to LinkedIn by @xiaoyongzhu in #872
- Insert test coverage check for python client into github pipeline by @enya-yx in #862
- Create docs on how to update Feathr client and registry, and how to pass credentials by @xiaoyongzhu in #818
- Add a custom pytest marker config to pyproject.toml by @loomlike in #786
- fix broken link by @xiaoyongzhu in #874
- Separate out snowflake source by @aabbasi-hbo in #836
- Decouple build feature code by @xiaoyongzhu in #838
- Refining example, add utilities, and fix xdist test error by @loomlike in #794
- Add 'format' arg to get_result_df by @loomlike in #885
- Fix test failure by @jaymo001 in #904
- UI add feature the deletion for projects/features/dataSource by @Fendoe in #909
- Fixing Bugs reported during oncall by @jainr in #908
- Ignore 'registry_utils' in test coverage by @enya-yx in #907
- Update azure_resource_provision.json by @xiaoyongzhu in #912
- Exclude pegasus jars and release version by @rakeshkashyap123 in #913
- Exclude pegasus data files explicitly by @rakeshkashyap123 in #916
- Fix auto-tz casting bug by @bozhonghu in #905
- Support printing features and returning keys when getting features from registry by @enya-yx in #886
- Adding Continuous deployment == ON flag to webapp settings by @jainr in #927
- Add use_env_var flag to client by @loomlike in #923
- Format docs and add tech talks by @xiaoyongzhu in #931
- Add is_synapse() by @loomlike in #929
- publish fat jar in maven update action by @Yuqing-cat in #935
- Add Spark Configuration Not Applied content by @Yuqing-cat in #928
- Support get online features by composite keys by @enya-yx in #919
- React best practice implementations in ui code by @blrchen in #938
- fix numpy version conflict w/ pyspark by @loomlike in #940
- Add presentation for Feathr community talk by @xiaoyongzhu in #941
- Add FIRST aggregation for look up feature by @jaymo001 in #917
- Disable auth if Feathr is deployed with RBAC set to off by @Fendoe in #925
- Update sample notebooks (fraud detection and recommendation examples) to use latest feathr api by @loomlike in #921
- Windoze/spark sql source by @windoze in #839
- Updates on docs to remove out-of-date contents by @blrchen in #950
- Lock python deps version for registry projects by @blrchen in #946
- Add pytest cases and check test coverage for sql-registry and purview… by @enya-yx in #937
- Fix delete entity function bug for #852 by @Yuqing-cat in #952
- Remove hadoop dependency by @rakeshkashyap123 in #949
- Create new feature form by @Fendoe in #936
- set purview name environment variable in workflow by @enya-yx in #956
- Bug fix - Fraud detection sample notebook chart error. by @loomlike in #948
- Update registry start script to support REACT_APP_ENABLE_RBAC==false cases by @Yuqing-cat in #954
- Improve error message for checking feature keys by @enya-yx in #959
- Changed !pip to %pip by @ahlag in #963
- Fix registry to support Source with types other than HDFS by @windoze in #965
- Create Datasource UI and update create feature form by @Fendoe in #969
- Bump version to 0.10.4-rc1 for rc bug bash by @Yuqing-cat in #971
- Registry client support SparkSqlSource by @windoze in #970
- [BUG] zsh: no matches found: feathr[notebook] during pip install feathr[notebook] by @ahlag in #961
- Add a Sandbox for Feathr by @xiaoyongzhu in #966
- Add Bacpac creation doc by @xiaoyongzhu in #957
- Print out logs for local spark if the job is failed by @xiaoyongzhu in #973
- Fix maven publish workflow always skip Gradle publish by @Yuqing-cat in #974
Full Changelog: v0.9.0...v0.10.4-rc3
0.10.4-rc2
What's Changed
- Clean up some links that's referring to LinkedIn by @xiaoyongzhu in #872
- Insert test coverage check for python client into github pipeline by @enya-yx in #862
- Create docs on how to update Feathr client and registry, and how to pass credentials by @xiaoyongzhu in #818
- Add a custom pytest marker config to pyproject.toml by @loomlike in #786
- fix broken link by @xiaoyongzhu in #874
- Separate out snowflake source by @aabbasi-hbo in #836
- Decouple build feature code by @xiaoyongzhu in #838
- Refining example, add utilities, and fix xdist test error by @loomlike in #794
- Add 'format' arg to get_result_df by @loomlike in #885
- Fix test failure by @jaymo001 in #904
- UI add feature the deletion for projects/features/dataSource by @Fendoe in #909
- Fixing Bugs reported during oncall by @jainr in #908
- Ignore 'registry_utils' in test coverage by @enya-yx in #907
- Update azure_resource_provision.json by @xiaoyongzhu in #912
- Exclude pegasus jars and release version by @rakeshkashyap123 in #913
- Exclude pegasus data files explicitly by @rakeshkashyap123 in #916
- Fix auto-tz casting bug by @bozhonghu in #905
- Support printing features and returning keys when getting features from registry by @enya-yx in #886
- Adding Continuous deployment == ON flag to webapp settings by @jainr in #927
- Add use_env_var flag to client by @loomlike in #923
- Format docs and add tech talks by @xiaoyongzhu in #931
- Add is_synapse() by @loomlike in #929
- publish fat jar in maven update action by @Yuqing-cat in #935
- Add Spark Configuration Not Applied content by @Yuqing-cat in #928
- Support get online features by composite keys by @enya-yx in #919
- React best practice implementations in ui code by @blrchen in #938
- fix numpy version conflict w/ pyspark by @loomlike in #940
- Add presentation for Feathr community talk by @xiaoyongzhu in #941
- Add FIRST aggregation for look up feature by @jaymo001 in #917
- Disable auth if Feathr is deployed with RBAC set to off by @Fendoe in #925
- Update sample notebooks (fraud detection and recommendation examples) to use latest feathr api by @loomlike in #921
- Windoze/spark sql source by @windoze in #839
- Updates on docs to remove out-of-date contents by @blrchen in #950
- Lock python deps version for registry projects by @blrchen in #946
- Add pytest cases and check test coverage for sql-registry and purview… by @enya-yx in #937
- Fix delete entity function bug for #852 by @Yuqing-cat in #952
- Remove hadoop dependency by @rakeshkashyap123 in #949
- Create new feature form by @Fendoe in #936
- set purview name environment variable in workflow by @enya-yx in #956
- Bug fix - Fraud detection sample notebook chart error. by @loomlike in #948
- Update registry start script to support REACT_APP_ENABLE_RBAC==false cases by @Yuqing-cat in #954
- Improve error message for checking feature keys by @enya-yx in #959
- Changed !pip to %pip by @ahlag in #963
- Fix registry to support Source with types other than HDFS by @windoze in #965
- Create Datasource UI and update create feature form by @Fendoe in #969
- Bump version to 0.10.4-rc1 for rc bug bash by @Yuqing-cat in #971
- Registry client support SparkSqlSource by @windoze in #970
- [BUG] zsh: no matches found: feathr[notebook] during pip install feathr[notebook] by @ahlag in #961
- Add a Sandbox for Feathr by @xiaoyongzhu in #966
- Add Bacpac creation doc by @xiaoyongzhu in #957
Full Changelog: v0.9.0...v0.10.4-rc2
0.10.4-rc1
What's Changed
- Clean up some links that's referring to LinkedIn by @xiaoyongzhu in #872
- Insert test coverage check for python client into github pipeline by @enya-yx in #862
- Create docs on how to update Feathr client and registry, and how to pass credentials by @xiaoyongzhu in #818
- Add a custom pytest marker config to pyproject.toml by @loomlike in #786
- fix broken link by @xiaoyongzhu in #874
- Separate out snowflake source by @aabbasi-hbo in #836
- Decouple build feature code by @xiaoyongzhu in #838
- Refining example, add utilities, and fix xdist test error by @loomlike in #794
- Add 'format' arg to get_result_df by @loomlike in #885
- Fix test failure by @jaymo001 in #904
- UI add feature the deletion for projects/features/dataSource by @Fendoe in #909
- Fixing Bugs reported during oncall by @jainr in #908
- Ignore 'registry_utils' in test coverage by @enya-yx in #907
- Update azure_resource_provision.json by @xiaoyongzhu in #912
- Exclude pegasus jars and release version by @rakeshkashyap123 in #913
- Exclude pegasus data files explicitly by @rakeshkashyap123 in #916
- Fix auto-tz casting bug by @bozhonghu in #905
- Support printing features and returning keys when getting features from registry by @enya-yx in #886
- Adding Continuous deployment == ON flag to webapp settings by @jainr in #927
- Add use_env_var flag to client by @loomlike in #923
- Format docs and add tech talks by @xiaoyongzhu in #931
- Add is_synapse() by @loomlike in #929
- publish fat jar in maven update action by @Yuqing-cat in #935
- Add Spark Configuration Not Applied content by @Yuqing-cat in #928
- Support get online features by composite keys by @enya-yx in #919
- React best practice implementations in ui code by @blrchen in #938
- fix numpy version conflict w/ pyspark by @loomlike in #940
- Add presentation for Feathr community talk by @xiaoyongzhu in #941
- Add FIRST aggregation for look up feature by @jaymo001 in #917
- Disable auth if Feathr is deployed with RBAC set to off by @Fendoe in #925
- Update sample notebooks (fraud detection and recommendation examples) to use latest feathr api by @loomlike in #921
- Windoze/spark sql source by @windoze in #839
- Updates on docs to remove out-of-date contents by @blrchen in #950
- Lock python deps version for registry projects by @blrchen in #946
- Add pytest cases and check test coverage for sql-registry and purview… by @enya-yx in #937
- Fix delete entity function bug for #852 by @Yuqing-cat in #952
- Remove hadoop dependency by @rakeshkashyap123 in #949
- Create new feature form by @Fendoe in #936
- set purview name environment variable in workflow by @enya-yx in #956
- Bug fix - Fraud detection sample notebook chart error. by @loomlike in #948
- Update registry start script to support REACT_APP_ENABLE_RBAC==false cases by @Yuqing-cat in #954
- Improve error message for checking feature keys by @enya-yx in #959
- Changed !pip to %pip by @ahlag in #963
- Fix registry to support Source with types other than HDFS by @windoze in #965
- Create Datasource UI and update create feature form by @Fendoe in #969
- Bump version to 0.10.4-rc1 for rc bug bash by @Yuqing-cat in #971
- Registry client support SparkSqlSource by @windoze in #970
- [BUG] zsh: no matches found: feathr[notebook] during pip install feathr[notebook] by @ahlag in #961
- Add a Sandbox for Feathr by @xiaoyongzhu in #966
Full Changelog: v0.9.0...v0.10.4-rc1
Feathr Online Transform (alpha)
Overview
As of v0.10.0, the Feathr team is happy to introduce Online Transform (alpha). You might want to try Online Transform in the following scenarios:
- The source data for a transformation is only available at inference time.
- Pre-computing features with offline transform might waste storage and compute resources.
- You want to decouple featurization work from the upstream online system.
- You want to write Python programs that use the online transformation functions.
Note: Use the alpha build for evaluation only. The alpha build is intended for early access and feedback; it might not be as stable as the GA build and might change at any time before GA without notice. Use the alpha build at your own risk.
Running Feathr Online Transform locally
Feathr Online Transform can be run locally with Docker. See the README for details.
Running Feathr Online Transform on Azure
To deploy Feathr Online Transform to Azure, you first need an AKS cluster set up on Azure. Then follow the README to deploy the Helm chart to the AKS cluster.
Using Python library
Feathr Online Transform also has a Python library for further development. See the project site for details.
Roadmap
- Define transformation once for both online and offline consumption.
- Registry integration
- Java Library
Additional Note
The source code for Online Transform (alpha) is currently hosted under personal accounts. The Feathr team is working on moving it to the official repo, but this might take some time.
v0.9.0
Breaking Changes
We have changed the execution engine for derived features to Spark SQL, which may be a breaking change for users who are not running the up-to-date sample notebooks. Specifically, they might see this failure:
Preprocessed DataFrames are:
{'feature_user_age,feature_user_gift_card_balance,feature_user_has_valid_credit_card,feature_user_tax_rate': JavaObject id=o243}
Traceback (most recent call last):
  File "feathr_pyspark_driver.py", line 107, in <module>
    submit_spark_job(feature_names_funcs)
  File "feathr_pyspark_driver.py", line 85, in submit_spark_job
    py4j_feature_job.mainWithPreprocessedDataFrame(job_param_java_array, new_preprocessed_df_map)
  File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/py4j/java_gateway.py", line 1304, in __call__
    return_value = get_return_value(
  File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco
pyspark.sql.utils.AnalysisException: Undefined function: 'toBoolean'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.; line 1 pos 84
)
Users should change:
feature_user_purchasing_power = DerivedFeature(name="feature_user_purchasing_power",
                                               key=user_id,
                                               feature_type=FLOAT,
                                               input_features=[
                                                   feature_user_gift_card_balance, feature_user_has_valid_credit_card],
                                               transform="feature_user_gift_card_balance + if_else(toBoolean(feature_user_has_valid_credit_card), 100, 0)")
to
feature_user_purchasing_power = DerivedFeature(name="feature_user_purchasing_power",
                                               key=user_id,
                                               feature_type=FLOAT,
                                               input_features=[
                                                   feature_user_gift_card_balance, feature_user_has_valid_credit_card],
                                               transform="feature_user_gift_card_balance + if(boolean(feature_user_has_valid_credit_card), 100, 0)")
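For projects with many derived features, the same rename can be applied to the transform strings in bulk. The following is a minimal sketch, not part of Feathr: the helper name migrate_transform and the mapping table are hypothetical, and only the two renames shown in this release note (if_else to if, toBoolean to boolean) are covered. It does a plain textual rewrite of function-call names, so it would also touch matching text inside string literals, and other legacy functions may need their own mappings.

```python
import re

# Hypothetical mapping from legacy function names to Spark SQL equivalents.
# Only these two pairs are taken from the release note; extend with care.
LEGACY_TO_SPARK_SQL = {
    "if_else": "if",
    "toBoolean": "boolean",
}

# Match a legacy name only when it starts at a word boundary and is
# immediately used as a function call (followed by an opening parenthesis).
_CALL_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in LEGACY_TO_SPARK_SQL) + r")\s*\("
)


def migrate_transform(expr: str) -> str:
    """Rewrite legacy function calls in a derived-feature transform string."""
    return _CALL_PATTERN.sub(
        lambda match: LEGACY_TO_SPARK_SQL[match.group(1)] + "(", expr
    )


old_transform = (
    "feature_user_gift_card_balance + "
    "if_else(toBoolean(feature_user_has_valid_credit_card), 100, 0)"
)
new_transform = migrate_transform(old_transform)
print(new_transform)
```

The result can then be passed as the transform argument of DerivedFeature. Applying the helper to an already migrated string leaves it unchanged, so it is safe to run more than once.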
What's Changed
- Fix a feature type bug by @jaymo001 in #701
- Fix wheel building problem in Windows by @xiaoyongzhu in #702
- Fix Purview+RBAC registry web app issue by @Yuqing-cat in #700
- Remove hard coded resources in docs by @enya-yx in #696
- Add e2e test for purview registry and rbac registry by @blrchen in #689
- Update tests use runtime jar from maven for spark submission to cover Databricks by @blrchen in #706
- Enhance databricks submission error message by @enya-yx in #710
- Enhance purview registry error messages by @blrchen in #709
- [WIP] hot fix databricks es dependency issue by @Yuqing-cat in #713
- Fix materialize to sql e2e test failure by @blrchen in #717
- Add Data Models in Feathr by @hyingyang-linkedin in #659
- Revert "Enhance purview registry error messages (#709)" by @blrchen in #720
- Improve Avro GenericRecord and SpecificRecord based row-level extractor performance by @jaymo001 in #723
- Fix lookup feature missing issue when converting feature definition to HOCON files by @jaymo001 in #732
- Fix function string parsing by @loomlike in #725
- Apply a same credential within each sample [ Docs ] by @enya-yx in #718
- Enable incremental for HDFS sink by @enya-yx in #695
- #492 fix, fail only if different sources have same name by @windoze in #733
- Remove unused credentials and deprecated purview settings by @enya-yx in #708
- Revoke adb token submitted by mistaken by @blrchen in #730
- Fix synapse errors not print out issue by @enya-yx in #734
- Spark config passing bug fix for local spark submission by @loomlike in #729
- Fix direct purview client missing transformation by @YihuiGuo in #736
- Support SQL expression in derived feature transformation by @jaymo001 in #731
- Support SWA with groupBy to 1d tensor conversion by @jaymo001 in #748
- Rijai/armfix by @jainr in #742
- bump version to 0.8.2 by @Yuqing-cat in #722
- Added latest deltalake version by @ahlag in #735
- Fix #474 Disable local mode by @windoze in #738
- Allow recreating entities for PurView registry by @windoze in #691
- Adding DevSkim linter to Github actions by @jainr in #657
- Fix icons in UI cannot auto scale (#737) by @Fendoe in #744
- Expose 'timePartitionPattern' in Python API [ WIP ] by @enya-yx in #714
- Setting up component governance pipeline by @jainr in #655
- Add docs to explain on feature materialization behavior by @xiaoyongzhu in #688
- Fix protobuf version by @enya-yx in #711
- Add some notes based on on-call issues by @enya-yx in #753
- Refine spark runtime error message by @Yuqing-cat in #755
- Serialization bug due to version incompatibility between azure-core and msrest by @jainr in #763
- Unify Python SDK Build Version and decouple Feathr Maven Version by @Yuqing-cat in #746
- Replace hard code string in notebook and align with others by @Yuqing-cat in #765
- Add flag to enable generation non-agg features by @windoze in #719
- roll back 0.8.2 version bump by @Yuqing-cat in #771
- Refactor Product Recommendation sample notebook by @jainr in #743
- Update role-management page in UI (#751) by @Fendoe in #764
- Create Feature less module in UI code and import alias by @Fendoe in #768
- Add extra dependencies to setup.py by @loomlike in #773
- Fix Windows compatibility issues by @xiaoyongzhu in #776
- UI: Replace logo icon by @Fendoe in #778
- Refine example notebooks by @loomlike in #756
- UI: Display version by @Fendoe in #779
- Add nightly Notification to PR Test GitHub Action by @Yuqing-cat in #783
- Fix broken links for #743 by @Yuqing-cat in #789
- Update notebook image links for github rendering by @loomlike in #787
- Revert 756 by @blrchen in #798
- remove unnecessary spark job from registry test by @Yuqing-cat in #790
- Revert "Expose 'timePartitionPattern' in Python API [ WIP ]" by @blrchen in #799
- Update CONTRIBUTING.md with committers information by @hangfei in #793
- Fix test_azure_spark_maven_e2e ci test error by @blrchen in #800
- Add failure warning and run link to daily notification by @Yuqing-cat in #802
- Minor documentation update to add info about maven automated workflow by @jainr in #795
- Fix doc dead links by @blrchen in #805
- Fix more dead links on docs by @blrchen in #807
- Improve UI experience and clean up ui code warnings by @Fendoe in #801
- Add release instructions for Release Candidate by @blrchen in #809
- Bump version to 0.9.0-rc1 by @blrchen in #810
- Fix bug in empty array dense tensor default value by @bozhonghu in #806
- Fix sql-based derived feature by @jaymo001 in #812
- Replacing webapp-deploy action with workflow-webhook action. by @jainr in #813
- Fix passthrough feature reference in sql-based derived feature by @jaymo001 in #815
- Revert databricks example notebook until fixing issues by @loomlike in #814
- Add retry logic for purview project-ids logic by @Yuqing-cat ...
v0.8.0
Highlighted Features
- UI: Add data source detail page by @ahlag in #620
- Add aerospike sink by @YihuiGuo in #632
- Local Spark Provider to submit feature join job in local env by @Yuqing-cat in #644
Improvements
- Updates on github PR/Issue templates by @blrchen in #642
- Adding documentation for maven publishing automation by @jainr in #646
- Add OSS Badge in README by @xiaoyongzhu in #649
- Fix broken doc links by @xiaoyongzhu in #658
- Added _scproxy necessary for MacOS by @ahlag in #651
- Add docs for consuming features in online environment by @xiaoyongzhu in #609
- Clean up after moving to LFAI by @xiaoyongzhu in #665
- Updating docker version in ARM template to use latest release tagged image by @jainr in #668
- Added prettier documentation by @ahlag in #672
- Remove reference to aerospike JAR in sbt by @YihuiGuo in #680
- Extend RBAC to support project id as input by @Yuqing-cat in #673
- Fixing issue with docker image on demo apps not getting updated by @jainr in #686
- Lock python dependency versions by @xiaoyongzhu in #690
- Apply 'aggregation_features' parameter to merge dataframes by @enya-yx in #667
- Fix data source detail page in rbac registry by @Yuqing-cat in #698
- Fix multi-keyed feature in anchor (direct purview) by @YihuiGuo in #676
- Fix path with #LATEST by @jaymo001 in #684
- Fix Feature value adaptor and UDF adaptor on Spark executors by @jaymo001 in #660
- Enhance SQL Registry Error Messages by @windoze in #674
Full Changelog: v0.7.2...v0.8.0
v0.7.2 Enhancement on supported stores and Web UI
Highlighted Features
- Generic Input/Output of DataFrames by @windoze in #475
- UDF plugin API by @d4ve in #507
- UI: Add home page and project list page by @blrchen in #595
Improvements
- Extend Access Control Management APIs to Project Admins by @Yuqing-cat in #535
- Handle http response 403 forbidden error by @enya-yx in #522
- Various documentation fix by @xiaoyongzhu in #477
- Updating instructions for publishing to maven. by @blee1234 in #542
- DataPathHandler Bug Fix by @blee1234 in #531
- Add management menu bar by @enya-yx in #528
- Improve product recommendation notebook by @hangfei in #532
- Extend Access Control Registry APIs to protect post ones by @Yuqing-cat in #551
- Move feathr ui live demo section to root README.md by @blrchen in #550
- Updates on registry endpoints related doc by @blrchen in #541
- Read existing custom tags from databricks config template by @esadler-hbo in #555
- Fix FeatureJoinJob parameters not correctly printed in spark job log by @blrchen in #553
- Clear Feathr UDF state and configuration template in work directory by @xiaoyongzhu in #557
- Quick update on rbac docs and input hint by @Yuqing-cat in #563
- Update documentation for Feathr UI, registry, and architecture. by @xiaoyongzhu in #534
- Add guidance for setting up aerospike local env by @YihuiGuo in #572
- Clarifies on the usage for BackfillTime. by @xiaoyongzhu in #568
- UI: Display feature key column in feature table by @blrchen in #569
- Optimize purview search logic by @enya-yx in #564
- Fixing Managed Identity Issue allowing connection between App Service and Azure Purview by @jainr in #579
- Create Azure machine learning related docs by @xiaoyongzhu in #574
- Add concept doc for rbac by @Yuqing-cat in #571
- Create code structure by @xiaoyongzhu in #573
- Enable always register types with optimized logic. by @YihuiGuo in #582
- Fix purview registry bug by @YihuiGuo in #583
- Misc issue fixes by @jaymo001 in #581
- Add delimiter option for reading CSV files for Feathr by @ahlag in #307
- Optimize purview query for getting features by @enya-yx in #584
- Remove extra dependencies to avoid dead loop of page re-rendering by @enya-yx in #588
- Add docs for input and output format and expected behaviors by @xiaoyongzhu in #575
- Update docs to allow Feathr use Spark UDFs by @xiaoyongzhu in #585
- Add COUNT_DISTINCT aggregation by @esadler-hbo in #594
- Support navigation back to feature list page by link by @enya-yx in #577
- Fix graph edge errors when calling SQL APIs by @enya-yx in #593
- Build docker image for feathr by @YihuiGuo in #554
- Fix source entity missing time window bug. by @YihuiGuo in #600
- Fix duplicate feature key bug by @YihuiGuo in #601
- Support 'enabled' configurations for offline stores by @enya-yx in #545
- Decode log error message by @enya-yx in #602
- Use maven based submission by default for samples by @xiaoyongzhu in #603
New Contributors
- @esadler-hbo made their first contribution in #555
Full Changelog: v0.6.0...v0.7.0
v0.6.0 Improved Deployment experience, containerized UI and backend in one simple container, support for Azure SQL as registry backend and RBAC
Highlighted Features:
- Containerized deployment with docker image of UI and registry backend by @blrchen in #379
- SQL backend Feature Registry Service by @windoze in #311
- Purview backend Feature Registry Service by @YihuiGuo in #404
- Feathr UI with Registry API backend by @blrchen in #303
- Fixed bugs and improved documentation to make deployment easier by @jainr in #398 #443
- Project level access control for registry APIs by @Yuqing-cat in #409
- Added a new fraud detection sample by @t-curiekim in #515
Improvements:
- Enable submitting Feathr jar from maven by @windoze in #211
- UI Style improvements by @blrchen and @t-curiekim in #427 #440 #491 #455
- Documentation update and removed duplicates by @xiaoyongzhu and @Yuqing-cat in #447 #403 #454 #383
- Set up eslint and prettier and included the checks in github action by @donegjookim and @blrchen in #483 #502
New Contributors
- @iemejia made their first contribution in #346
- @shivamsanju made their first contribution in #400
- @t-curiekim made their first contribution in #455
- @donegjookim made their first contribution in #479
- @enya0405 made their first contribution in #501
- @SangamSwadiK made their first contribution in #510
Full Changelog: v0.5.1...v0.6.0