examples/ibis and Ibis plugin #725
👍 Looks good to me!
- Reviewed the entire pull request up to 85d7554
- Looked at 471 lines of code in 6 files
- Took 49 seconds to review
More info
- Skipped 5 files when reviewing.
- Skipped posting 1 additional comment because it didn't meet the confidence threshold of 50%.
1. examples/ibis/model_training.py:1:
   - Assessed confidence: 100%
   - Grade: 0%
   - Comment: The PR lacks tests for the new functionality. Please add tests to ensure the new functionality works as expected.
   - Reasoning: The PR adds support for Ibis in the Hamilton library. It includes examples of how to use Hamilton with Ibis for feature engineering and machine learning model training. The code seems to follow good practices, and the logic seems sound. However, there are no tests included in the PR, which is a concern. The author should add tests to ensure the new functionality works as expected.
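To make the request for tests concrete, here is a minimal sketch of what a unit test for column extraction could look like, using only the standard library. Everything in it is an assumption for illustration: `FakeTable` and `get_column` are hypothetical stand-ins rather than the PR's code, and modelling the type-based registration with `functools.singledispatch` is a guess at the mechanism, not something this thread confirms.

```python
import unittest
from functools import singledispatch


class FakeTable:
    """Hypothetical stand-in for a table type: maps column names to values."""

    def __init__(self, columns):
        self.columns = dict(columns)


@singledispatch
def get_column(table, column_name):
    """Fallback for table types that have no registered extractor."""
    raise NotImplementedError(f"No column extractor for {type(table).__name__}")


@get_column.register(FakeTable)
def _get_column_fake(table, column_name):
    # Assumption: asking for a missing column should surface as a KeyError.
    return table.columns[column_name]


class GetColumnTest(unittest.TestCase):
    def test_extracts_existing_column(self):
        table = FakeTable({"age": [31, 45], "has_pet": [True, False]})
        self.assertEqual(get_column(table, "age"), [31, 45])

    def test_missing_column_raises(self):
        with self.assertRaises(KeyError):
            get_column(FakeTable({"age": [31]}), "height")

    def test_unregistered_type_raises(self):
        with self.assertRaises(NotImplementedError):
            get_column(object(), "age")


def run_tests():
    """Run the test case and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(GetColumnTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

A real test for the plugin would presumably build a small in-memory Ibis table instead of `FakeTable`; the three cases (existing column, missing column, unsupported type) are only one possible starting point.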
Workflow ID: wflow_7SBEevi6qQlLm96O
👍 Looks good to me!
- Performed an incremental review on 2dadf51
- Looked at 480 lines of code in 6 files
- Took 4 minutes and 53 seconds to review
More info
- Skipped 1 file when reviewing.
- Skipped posting 6 additional comments because they didn't meet the confidence threshold of 50%.
1. hamilton/plugins/ibis_extensions.py:69:
   - Assessed confidence: 10%
   - Comment: The `ibis_extensions.py` file is adding support for Ibis tables and columns in the Hamilton framework. It defines a new `SchemaValidatorIbis` class for schema validation of Ibis tables. The `get_column_ibis` function is registered to handle column extraction from Ibis tables. The code seems to follow the best practices and there are no apparent logical or performance issues.
   - Reasoning: Same as the comment above.
2. hamilton/function_modifiers/base.py:37:
   - Assessed confidence: 10%
   - Comment: The `base.py` file in the `function_modifiers` directory has been modified to include 'ibis' in the list of plugin modules. This change is necessary to enable the use of the Ibis plugin that has been added in this PR. The change is correctly implemented and there are no apparent logical or performance issues.
   - Reasoning: Same as the comment above.
3. examples/ibis/table_dataflow.py:48:
   - Assessed confidence: 10%
   - Comment: The `table_dataflow.py` file in the `ibis` examples directory provides an example of table-level feature engineering using Ibis and Hamilton. The `raw_table` function reads a CSV file into an Ibis table and renames the columns to snake_case. The `feature_table` function adds new feature columns to the table. The `feature_set` function selects feature columns and filters rows. The code seems to follow the best practices and there are no apparent logical or performance issues.
   - Reasoning: Same as the comment above.
4. examples/ibis/run.py:83:
   - Assessed confidence: 10%
   - Comment: The `run.py` file in the `ibis` examples directory is the main script to run the Ibis examples. It imports the necessary modules and defines the `main` function which builds the Hamilton driver with the appropriate dataflow components based on the command-line arguments. The `main` function also visualizes the execution of the driver and prints the keys of the result. The code seems to follow the best practices and there are no apparent logical or performance issues.
   - Reasoning: Same as the comment above.
5. examples/ibis/model_training.py:172:
   - Assessed confidence: 10%
   - Comment: The `model_training.py` file in the `ibis` examples directory provides an example of machine learning model training using Ibis and Hamilton. It defines several functions for model training, data preprocessing, and cross-validation. The `base_model__linear`, `base_model__random_forest`, and `base_model__boosting` functions define the base models for linear regression, random forest regression, and gradient boosting regression, respectively. The `preprocessing_recipe` function defines the preprocessing steps. The `data_split` function generates indices for train/validation splits. The `prepare_data` function splits the data and applies the preprocessing recipe. The `cross_validation_fold` function trains the model and makes predictions on the validation set. The `cross_validation_fold_collection` function collects the results from cross-validation folds. The `prediction_table` function creates a table with cross-validation predictions. The `store_predictions` function stores the cross-validation predictions table. The `train_full_model` function trains a model on the full dataset for inference. The code seems to follow the best practices and there are no apparent logical or performance issues.
   - Reasoning: Same as the comment above.
6. examples/ibis/column_dataflow.py:61:
   - Assessed confidence: 10%
   - Comment: The `column_dataflow.py` file in the `ibis` examples directory provides an example of column-level feature engineering using Ibis and Hamilton. The `raw_table` function reads a CSV file into an Ibis table and renames the columns to snake_case. The `has_children`, `has_pet`, and `is_summer_brazil` functions define new feature columns based on the existing columns. The `feature_table` function adds the new feature columns to the table. The `feature_set` function selects feature columns and filters rows. The code seems to follow the best practices and there are no apparent logical or performance issues.
   - Reasoning: Same as the comment above.
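Item 5 describes `data_split` as generating indices for train/validation splits. As a rough illustration of that idea only, the sketch below produces contiguous, unshuffled k-fold index pairs in plain Python. The name `kfold_indices` and its parameters are hypothetical; the PR's own implementation is not shown in this thread and may well rely on a library such as scikit-learn instead.

```python
def kfold_indices(n_rows, n_splits):
    """Yield (train_idx, val_idx) index lists for contiguous, unshuffled folds.

    When n_rows is not divisible by n_splits, the first (n_rows % n_splits)
    folds receive one extra validation row.
    """
    if not 2 <= n_splits <= n_rows:
        raise ValueError("n_splits must be between 2 and n_rows")
    base, extra = divmod(n_rows, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        # Validation rows are one contiguous block; training rows are the rest.
        val_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_rows))
        yield train_idx, val_idx
        start = stop


if __name__ == "__main__":
    for train_idx, val_idx in kfold_indices(n_rows=5, n_splits=2):
        print("train:", train_idx, "validation:", val_idx)
```

Each row index appears in exactly one validation fold, which is the property the later steps (per-fold training, then collecting predictions into one table) depend on.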
Workflow ID: wflow_Ne78jOX24ONR8E0K
Example on how to use Hamilton + Ibis for feature engineering and machine learning model training.
Changes
- `examples/ibis` directory with table-level and column-level feature engineering
- `hamilton.plugins.ibis_extensions` to support column-level operations on Ibis tables
- `ibis_extensions` also has a `SchemaValidatorIbis` that uses ibis `Schema().equals()` (docs)
How I tested this
- `hamilton.plugins.vaex_extensions`
Notes
- `docs/integrations` Ibis #722 depends on this PR and `hamilton.plugins.ibis_extensions`
Checklist
Summary:
This PR introduces an example of using Hamilton with Ibis for feature engineering and machine learning model training, adds a new plugin for supporting column-level operations on Ibis tables, and modifies `hamilton/function_modifiers/base.py` to include 'ibis' in the list of plugin modules.
Key points:
- Modifies `hamilton/function_modifiers/base.py` to include 'ibis' in the list of plugin modules.
- Adds an `examples/ibis` directory with scripts for table-level and column-level feature engineering.
- Adds `SchemaValidatorIbis` in `ibis_extensions` for schema validation.
Generated with ❤️ by ellipsis.dev
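For readers unfamiliar with schema validation, the sketch below shows the general idea behind a check like the one `SchemaValidatorIbis` performs: compare an expected schema with the actual one and report mismatches. It is a stand-in under stated assumptions, not the plugin's code. Schemas are modelled as plain `{column_name: dtype_name}` dicts, column order is ignored (Ibis's own schema equality may behave differently), and the names `schema_diff` and `schemas_equal` are hypothetical.

```python
def schema_diff(expected, actual):
    """Return human-readable differences between two {column: dtype} mappings."""
    problems = []
    for name, dtype in expected.items():
        if name not in actual:
            problems.append(f"missing column '{name}'")
        elif actual[name] != dtype:
            problems.append(
                f"column '{name}': expected {dtype}, got {actual[name]}"
            )
    for name in actual:
        if name not in expected:
            problems.append(f"unexpected column '{name}'")
    return problems


def schemas_equal(expected, actual):
    """True when the two schemas have the same columns with the same dtypes."""
    return not schema_diff(expected, actual)


if __name__ == "__main__":
    expected = {"age": "int64", "has_pet": "boolean"}
    actual = {"age": "float64", "has_children": "boolean"}
    for problem in schema_diff(expected, actual):
        print(problem)
```

Returning a list of differences rather than a bare boolean makes a failed validation easier to act on, since the message can name the offending columns.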