Extract XLSX logic + Add Data Dictionary + Add Artifacts #1263

Open: alexrichey wants to merge 8 commits into main from ar-new-metadata-tab

@alexrichey alexrichey commented Nov 21, 2024

FYI, I'm going to do a real writeup on this tomorrow, and add docstrings to make this more readable. If you get there beforehand, the commits are quite atomic and could be easily read that way. Also, there are two tests that we'd expect to fail atm, since they rely on changes in the metadata repo here

Contributor

I do think with formatted cells and rows, we're really talking about a Workbook or Sheet or Worksheet here more so than a doc

Contributor

I know part of the idea is to abstract from that a bit, but I don't think we can quite escape it

Contributor

Maybe it's a "formatted table"? Or a "tabular report"? The crux seems to be that these things are undoubtedly worksheet tables with some decoration/metadata, and a specific implementation then gets to decide exactly how to dump them out

Contributor Author

I like tabular report. However, I do think you could pretty easily render this as HTML, so while it's currently just Excel workbooks, I think that might be overly specific.
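
To make the "tabular report" idea from this thread concrete, here is a minimal sketch of a format-agnostic table with decoration, plus two renderers that each decide how to dump it out. This is not the code in this PR: `Cell`, `TabularReport`, `render_html`, and `render_text` are hypothetical names invented for illustration, and the sample row is made up. It uses only the standard library.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Cell:
    """A value plus decoration that a renderer may honor or ignore (hypothetical)."""
    value: str
    bold: bool = False


@dataclass
class TabularReport:
    """A table with a title and header, independent of any output format (hypothetical)."""
    title: str
    header: list[str]
    rows: list[list[Cell]] = field(default_factory=list)


def render_html(report: TabularReport) -> str:
    """One possible renderer: emit an HTML table, honoring the bold flag."""
    def td(cell: Cell) -> str:
        text = escape(cell.value)
        return f"<td><b>{text}</b></td>" if cell.bold else f"<td>{text}</td>"

    head = "".join(f"<th>{escape(h)}</th>" for h in report.header)
    body = "".join(
        "<tr>" + "".join(td(c) for c in row) + "</tr>" for row in report.rows
    )
    return (
        f"<table><caption>{escape(report.title)}</caption>"
        f"<tr>{head}</tr>{body}</table>"
    )


def render_text(report: TabularReport) -> str:
    """Another renderer: tab-separated text that ignores decoration entirely."""
    lines = ["\t".join(report.header)]
    lines += ["\t".join(c.value for c in row) for row in report.rows]
    return "\n".join(lines)


# Made-up sample data, for illustration only.
report = TabularReport(
    title="Data Dictionary",
    header=["column", "description"],
    rows=[[Cell("id", bold=True), Cell("Unique identifier & key")]],
)
print(render_html(report))
print(render_text(report))
```

Under this kind of split, an XLSX writer would be just one more function that takes a `TabularReport`, which is why the choice of Excel library matters less once the formatting decisions live in the abstract layer.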

@fvankrieken
Contributor

Not to muddy the waters but have you looked at xlsxwriter? Seems to have a bit more support for formatting. Though it seems like you've managed to make things work pretty well with openpyxl

codecov bot commented Dec 2, 2024

Codecov Report

Attention: Patch coverage is 89.75155% with 33 lines in your changes missing coverage. Please review.

Project coverage is 70.27%. Comparing base (65b2420) to head (ac0fcc4).
Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| dcpy/lifecycle/package/abstract_doc.py | 84.61% | 9 Missing and 7 partials ⚠️ |
| dcpy/lifecycle/package/xlsx_writer.py | 88.34% | 6 Missing and 6 partials ⚠️ |
| dcpy/lifecycle/package/assemble.py | 25.00% | 3 Missing ⚠️ |
| dcpy/lifecycle/package/_cli.py | 0.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1263      +/-   ##
==========================================
+ Coverage   69.67%   70.27%   +0.59%     
==========================================
  Files         111      115       +4     
  Lines        5913     6142     +229     
  Branches      659      700      +41     
==========================================
+ Hits         4120     4316     +196     
- Misses       1661     1682      +21     
- Partials      132      144      +12     


@alexrichey alexrichey force-pushed the ar-new-metadata-tab branch 4 times, most recently from f1e3a7a to b88fd84 Compare December 3, 2024 15:28
@alexrichey alexrichey marked this pull request as ready for review December 3, 2024 18:43
@alexrichey alexrichey changed the title Data Dictionary WIP for XLSX Extract XLSX logic + Add Data Dictionary + Add Artifacts Dec 3, 2024
@alexrichey alexrichey force-pushed the ar-new-metadata-tab branch 2 times, most recently from 8564331 to 2109c2c Compare December 3, 2024 22:06
@alexrichey
Contributor Author

> Not to muddy the waters but have you looked at xlsxwriter? Seems to have a bit more support for formatting. Though it seems like you've managed to make things work pretty well with openpyxl

@fvankrieken Yeah, I'd started with openpyxl because we were starting from an XLSX provided by OTI, and xlsxwriter can't open an existing file. This is less of a consideration now that we're effectively generating everything, though I think it's nice to enable a template XLSX that could be used as a starter. On the flip side, openpyxl hasn't had a commit in nine years (maybe it didn't need any). However, it'd be pretty trivial at this point to switch over, as almost all the logic now lives in abstract_doc.py
