Jupyter Notebook for Data Loaders Example (Data loaders #1122) #1174
@eoagyen thanks! Some feedback: would you be able to use the %%cell_to_module magic for the notebook? It shouldn't be a big change: just remove the import cell, use %%cell_to_module, and add the right imports. For example, put something like this at the top of each cell that defines Hamilton functions:
%%cell_to_module load_data_csv --display
# Loading csv data
import pandas as pd
from hamilton.function_modifiers import load_from, does, extract_columns, parameterize, source, value
You can then remove the import of
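For readers unfamiliar with the magic, here is a rough stdlib-only sketch of the idea behind turning a cell's source into a named module that can be handed to a driver. It is not Hamilton's actual implementation; the helper module_from_source, the module name load_data_demo, and the function doubled are all hypothetical names used only for illustration.

```python
import sys
import types


def module_from_source(name: str, source: str) -> types.ModuleType:
    """Hypothetical helper: build an in-memory module from source text."""
    module = types.ModuleType(name)
    # Run the source inside the new module's namespace so its functions
    # become attributes of the module.
    exec(compile(source, f"<{name}>", "exec"), module.__dict__)
    # Register the module so a later `import load_data_demo` would find it.
    sys.modules[name] = module
    return module


# Stand-in for the body of a notebook cell (assumption: one plain function).
CELL_SOURCE = '''
def doubled(raw_value: int) -> int:
    """Hypothetical function that depends on an input named raw_value."""
    return raw_value * 2
'''

load_data_demo = module_from_source("load_data_demo", CELL_SOURCE)
print(load_data_demo.doubled(21))  # prints 42
```

Because the cell itself becomes the module, a separate cell that imports the functions from a .py file is no longer needed.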
"driver = hamilton.driver.Driver(\n",
" {\"db_path\": \"./test_data/database.duckdb\"}, load_data_duckdb, prep_data\n",
")\n",
"print(driver.execute(VARS))\n",
"duckdb_execution_graph = driver.visualize_execution(VARS)\n",
"display(duckdb_execution_graph)"
Can you use the newer builder pattern please? For example:
from hamilton import base
import hamilton.driver
driver = (
    hamilton.driver.Builder()
    .with_modules(load_data_duckdb, prep_data)
    .with_adapters(base.PandasDataFrameResult())
    .build()
)
Then, for the visualization, pass the inputs in as a dictionary:
duckdb_execution_graph = driver.visualize_execution(VARS, inputs={"db_path": "./test_data/database.duckdb"})
Thank you @skrawcz, I have implemented the suggested changes.
Implemented the newer builder pattern and added further documentation using markdown cells.
Thanks @eoagyen!
This PR adds a Jupyter Notebook for the data_loaders example to provide a less intimidating entry point for data scientists who prefer using notebooks over Python scripts. The notebook mirrors the functionality of the existing run.py file, enabling users to experiment and try things out interactively.
Changes
A Jupyter notebook was added for the data_loaders example. The notebook mirrors the functionality of run.py.
How I tested this
Tested the notebook by running all cells and verifying that the outputs match those from run.py. Ensured that all functions and processes work as expected within the notebook environment.
Notes
run.py.
Checklist