-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
split content of examples/ibis
into examples/ibisml
#741
Conversation
- Added a README with the basic instructions to run the example and additional resources links - removed the schema validator from `table_dataflow.py` because it's causing some unexpected bugs; currently investigating - the IbisML team is saying their API is under refactor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me!
- Reviewed the entire pull request up to 7ab2b5d
- Looked at
363
lines of code in7
files - Took 1 minute and 47 seconds to review
More info
- Skipped
4
files when reviewing. - Skipped posting
4
additional comments because they didn't meet confidence threshold of50%
.
1. examples/ibis/table_dataflow.py:25
:
- Assessed confidence :
50%
- Comment:
The schema validator has been removed from thefeature_set
function. Please ensure to add it back once the potential bug has been investigated. - Reasoning:
The PR description mentions that the schema validator was temporarily removed to investigate a potential bug. This is confirmed by the removal of the@check_output
decorator intable_dataflow.py
. This could potentially lead to issues if the output of thefeature_set
function does not match the expected schema. However, since this is a temporary measure and the author is aware of it, it's not a major concern.
2. examples/ibisml/model_training.py:102
:
- Assessed confidence :
50%
- Comment:
Thecross_validation_fold
function now returns the actual validation set instead of the indices. Please ensure that all parts of the codebase that use this function have been updated to handle the new return type. - Reasoning:
Thecross_validation_fold
function inmodel_training.py
has been modified to return the actual validation set instead of the indices. This is a logical change that should improve the function's usability. However, it's important to ensure that the rest of the codebase that uses this function has been updated to handle the new return type.
3. examples/ibisml/model_training.py:82
:
- Assessed confidence :
0%
- Comment:
Good job on ensuring that they_train
andy_val
outputs are converted to numpy arrays and reshaped. This ensures compatibility with the scikit-learn models used later in the code. - Reasoning:
Theprepare_data
function inmodel_training.py
has been updated to convert they_train
andy_val
outputs to numpy arrays and reshape them. This is likely done to ensure compatibility with the scikit-learn models used later in the code. This is a good practice as it ensures that the data is in the correct format for the model.
4. examples/ibisml/run.py:19
:
- Assessed confidence :
0%
- Comment:
Therun.py
script is well-structured and follows good practices such as using argparse for command-line arguments and using a main function to encapsulate the script's functionality. - Reasoning:
Therun.py
file in theibisml
directory has been added as part of this PR. It appears to be a script to run the machine learning model training process using the Hamilton library. The script seems to be well-structured and follows good practices such as using argparse for command-line arguments and using a main function to encapsulate the script's functionality.
Workflow ID: wflow_xwTqm5G8DH2oQfTM
Not what you expected? You can customize the content of the reviews using rules. Learn more here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me!
- Performed an incremental review on 26dc7c8
- Looked at
164
lines of code in3
files - Took 9 minutes and 25 seconds to review
More info
- Skipped
0
files when reviewing. - Skipped posting
1
additional comments because they didn't meet confidence threshold of50%
.
1. examples/ibisml/model_training.py:89
:
- Assessed confidence :
100%
- Grade:
0%
- Comment:
Thedata_split
parameter is not used in this function. If it's not needed, consider removing it to avoid confusion.
def cross_validation_fold(
X_train: pd.DataFrame,
X_val: pd.DataFrame,
y_train: pd.DataFrame,
y_val: pd.DataFrame,
base_model: BaseEstimator,
) -> dict:
"""Train model and make predictions on validation"""
model = clone(base_model)
model.fit(X_train, y_train)
y_val_pred = model.predict(X_val)
score = mean_squared_error(y_val, y_val_pred)
return dict(y_true=y_val, y_pred=y_val_pred, score=score)
- Reasoning:
The code seems to be well-structured and follows good practices. However, there is a minor issue in themodel_training.py
file. The functioncross_validation_fold
has a parameterdata_split
which is not used in the function. This could be a leftover from a previous version of the code and should be removed if it's not needed.
Workflow ID: wflow_4OCqSWQtPEhyUO0e
Not what you expected? You can customize the content of the reviews using rules. Learn more here.
Split the code from
examples/ibis
andexamples/ibisml
.This includes a few fixes:
Changes
How I tested this
Notes
Checklist
Summary:
This PR splits the
examples/ibis
intoexamples/ibisml
, removescolumn_dataflow.py
, makes changes tomodel_training.py
andrun.py
, fixes requirements, changes cross-validation dataflow structure, keeps only table-level dataflow for ibisml example, and temporarily removes schema validator.Key points:
examples/ibis
intoexamples/ibisml
column_dataflow.py
fromexamples/ibisml
model_training.py
andrun.py
Generated with ❤️ by ellipsis.dev