Pandas XML Reader and Writer Class #356

Merged: 5 commits, Sep 20, 2023

@JoJo10Smith JoJo10Smith commented Sep 18, 2023

This pull request adds XML read and write functionality for ticket #352.

Changes

This code adds support in Hamilton for reading and writing XML through the pandas XML functions:
read_xml: https://pandas.pydata.org/docs/reference/api/pandas.read_xml.html
to_xml: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_xml.html
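
For readers unfamiliar with the two pandas functions, here is a minimal round-trip sketch. It is not code from this PR: the `xml_round_trip` helper and the sample data are made up for illustration, and it passes `parser="etree"` so that it runs without lxml installed (pandas defaults to the lxml parser).

```python
from io import StringIO

import pandas as pd


def xml_round_trip(df: pd.DataFrame) -> pd.DataFrame:
    """Serialize a DataFrame to an XML string and parse it back (illustrative helper)."""
    # With no path given, to_xml returns the document as a string.
    # parser="etree" is an assumption here: it uses the standard library instead of lxml.
    xml_text = df.to_xml(index=False, parser="etree")
    # read_xml accepts a file path or a file-like object such as StringIO.
    return pd.read_xml(StringIO(xml_text), parser="etree")


# Made-up sample data for the sketch.
shapes = pd.DataFrame({"shape": ["square", "circle"], "sides": [4, 0]})
restored = xml_round_trip(shapes)
```

Each row becomes a `row` element under a `data` root by default, and `read_xml` maps those child elements back to columns.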

The changes to the files are as described below:

pandas_extensions.py: Added the functions to read and write XML files
my_script.py: Added an example of XML materialization, plus optional code that adds the Hamilton directory to the path
notebook.ipynb: Added an example of XML materialization, plus optional code that adds the Hamilton directory to the path
requirements-test.txt: Added the lxml module, which the default parser setting requires when reading XML files
test_pandas_extensions.py: Added test cases for PandasXmlReader and PandasXmlWriter, the classes for read_xml and to_xml respectively
test_load_from_data.xml: A simple XML file with example data to test functionality
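
As a rough illustration of the reader/writer pattern described in the list above, the sketch below wraps `read_xml` and `to_xml` in two small dataclasses and round-trips a frame through a temporary file. This is not the code added in this PR: the class names `XmlReaderSketch` and `XmlWriterSketch`, their fields, the metadata dictionaries, and the `round_trip` helper are all hypothetical, and `parser="etree"` is chosen only so the sketch runs without lxml.

```python
import dataclasses
import tempfile
from pathlib import Path
from typing import Any, Dict, Tuple

import pandas as pd


@dataclasses.dataclass
class XmlWriterSketch:
    """Hypothetical writer: forwards its fields to DataFrame.to_xml."""

    path: str
    index: bool = False
    parser: str = "etree"  # assumption: stdlib parser, so lxml is not needed

    def save_data(self, data: pd.DataFrame) -> Dict[str, Any]:
        data.to_xml(self.path, index=self.index, parser=self.parser)
        # Return simple metadata about what was written (made up for this sketch).
        return {"path": self.path, "rows": len(data)}


@dataclasses.dataclass
class XmlReaderSketch:
    """Hypothetical reader: forwards its fields to pandas.read_xml."""

    path: str
    xpath: str = "./*"  # pandas default: every child of the root is a row
    parser: str = "etree"

    def load_data(self) -> Tuple[pd.DataFrame, Dict[str, Any]]:
        df = pd.read_xml(self.path, xpath=self.xpath, parser=self.parser)
        return df, {"path": self.path, "rows": len(df)}


def round_trip(df: pd.DataFrame) -> Tuple[pd.DataFrame, Dict[str, Any]]:
    """Write df to a temporary XML file, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = str(Path(tmp) / "example.xml")
        XmlWriterSketch(path=path).save_data(df)
        return XmlReaderSketch(path=path).load_data()
```

A test in the spirit of the ones listed above could call `round_trip` on a small frame and compare the loaded records with the originals.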

How I tested this

I ran the pre-commit package, and all checks passed before committing. There are also the CircleCI tests. A screenshot from the most recent commit is below:

pre-commit:

image

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

…ding the lxml module to the requirements-test.txt file, adding unit tests to the test_pandas_extensions.py file, and an example file in XML format
@JoJo10Smith JoJo10Smith mentioned this pull request Sep 18, 2023
@JoJo10Smith
Contributor Author

@skrawcz I managed to get the my_script.py file to work, and it produced the following output. I'm still struggling with the Jupyter Notebook, but I have shown that the code works; I just need to add the example to the notebook.

image

@JoJo10Smith JoJo10Smith changed the title from "Pandas xml" to "Pandas XML Reader and Writer Class" Sep 18, 2023
@skrawcz
Collaborator

skrawcz commented Sep 18, 2023

> @skrawcz I managed to get the my_script.py file to work, and it produced the following output. I'm still struggling with the Jupyter Notebook, but I have shown that the code works; I just need to add the example to the notebook.
>
> image

Why don't you join the Hamilton Slack and ping me? Happy to find 15 mins this week to help.

…needed to update my pandas in jupyter to get it to work and added code to add the hamilton package to the path as an option for development
@skrawcz skrawcz merged commit 1565f21 into DAGWorks-Inc:main Sep 20, 2023