generate styled dataset PDF #1075

damonmcc · 2024-08-14T19:02:49Z

related to #944, #561

This is a first pass at styling our data dictionary PDFs, not an attempt to use the exact styling that the Design fellows proposed in Figma.

changes

refactor tests related to generating PDF and XLSX data dictionaries
use weasyprint to generate styled PDFs via templated HTML and a CSS stylesheet
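
For readers unfamiliar with the HTML + CSS to PDF step, here is a minimal sketch of how the weasyprint command-line tool can be invoked from Python. It is not the code in this PR: `build_weasyprint_command` and the parameter names are hypothetical, and the only thing assumed about weasyprint is its documented CLI form `weasyprint [options] input output` with `--stylesheet` for an extra CSS file.

```python
import subprocess
from pathlib import Path
from typing import Optional


def build_weasyprint_command(
    html_path: Path, pdf_path: Path, stylesheet_path: Optional[Path] = None
) -> list[str]:
    """Assemble the argument list for the weasyprint CLI (hypothetical helper)."""
    command = ["weasyprint"]
    if stylesheet_path is not None:
        # --stylesheet (-s) applies an additional CSS file to the HTML input
        command += ["--stylesheet", str(stylesheet_path)]
    # Positional arguments: the HTML input first, then the PDF output
    command += [str(html_path), str(pdf_path)]
    return command


def generate_pdf_from_html(
    html_path: Path, pdf_path: Path, stylesheet_path: Optional[Path] = None
) -> Path:
    """Render html_path to pdf_path; requires weasyprint to be installed."""
    command = build_weasyprint_command(html_path, pdf_path, stylesheet_path)
    # check=True raises CalledProcessError if weasyprint exits with an error
    subprocess.run(command, check=True)
    return pdf_path
```

Keeping the command assembly in its own function means the argument order can be unit tested without weasyprint installed.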

old vs new PDF

Old

New

codecov · 2024-09-18T16:08:13Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.83%. Comparing base (6c3d2bb) to head (4bd6a8d).
Report is 7 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1075   +/-   ##
=======================================
  Coverage   68.83%   68.83%           
=======================================
  Files         108      108           
  Lines        5513     5513           
  Branches      810      809    -1     
=======================================
  Hits         3795     3795           
  Misses       1592     1592           
  Partials      126      126


alexrichey · 2024-10-02T17:20:34Z

dcpy/lifecycle/package/generate_metadata_assets.py



-def generate_pdf_from_html(output_html_path: Path, output_pdf_path: Path) -> Path:
+def generate_pdf_from_html(
+    output_html_path: Path,


change to input_html_path?

changed all of em to be shorter and consistent with write_oti_xlsx

alexrichey · 2024-10-02T17:20:51Z

dcpy/lifecycle/package/generate_metadata_assets.py

    subprocess.run(
        [
-            "pandoc",
+            "weasyprint",


alexrichey · 2024-10-02T17:22:46Z

dcpy/lifecycle/package/generate_metadata_assets.py



-def generate_pdf_from_html(output_html_path: Path, output_pdf_path: Path) -> Path:
+def generate_pdf_from_html(


Want to add a typer command for this?

was tempted to but I'd rather do that after considering whether this file generate_metadata_assets.py and oti_xlsx.py should even be separate files

if they're worth combining, that'd influence the existing/new typer commands

alexrichey · 2024-10-02T17:28:30Z

dcpy/test/lifecycle/package/test_generate_data_dictionary.py

+    yaml_file_path = TEST_METADATA_YAML_PATH
+    output_html_path = TEMP_DATA_PATH / "metadata.html"
+    output_pdf_path = TEMP_DATA_PATH / "metadata.pdf"
+    output_xlsx_path = TEMP_DATA_PATH / "my_data_dictionary.pdf"


hopefully still an xlsx after this refactor?
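
One way to guard against this kind of extension mismatch is a small test that checks each output path's suffix against its asset type. This is only a sketch: `data_dictionary_output_paths` is a hypothetical helper, and the `.xlsx` file name is an assumption about the intended value, not something taken from the PR.

```python
import unittest
from pathlib import Path


def data_dictionary_output_paths(output_dir: Path) -> dict[str, Path]:
    """Hypothetical helper: one output path per generated asset type."""
    return {
        "html": output_dir / "metadata.html",
        "pdf": output_dir / "metadata.pdf",
        # Assumed intended name; the diff above uses a .pdf suffix here
        "xlsx": output_dir / "my_data_dictionary.xlsx",
    }


class TestOutputPaths(unittest.TestCase):
    def test_suffix_matches_asset_type(self):
        paths = data_dictionary_output_paths(Path("output"))
        for asset_type, path in paths.items():
            # Each key doubles as the expected file extension
            self.assertEqual(path.suffix, f".{asset_type}")
```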

alexrichey

Good stuff! Happy to see weasyprint proven out. A few nits, but no need to re-request review if you get to them.

change in tests seem to have caused "indirect changes" in test coverage

damonmcc · 2024-10-02T17:58:39Z

dcpy/lifecycle/package/generate_metadata_assets.py

@@ -8,16 +8,21 @@
 DEFAULT_DATA_DICTIONARY_TEMPLATE_PATH = (
    RESOURCES_PATH / "data_dictionary_template.jinja"
 )
+DEFAULT_DATA_DICTIONARY_STYLESHEET_PATH = RESOURCES_PATH / "data_dictionary.css"


noting that I'm not sure if these DEFAULT_* variables are the best approach, just wanted to minimize changes in this PR

damonmcc force-pushed the dm-package-readme branch 7 times, most recently from 9ddbc03 to 32e4f18 on August 16, 2024 02:47

damonmcc force-pushed the dm-package-readme branch 6 times, most recently from 9879a4e to 684eb5c on August 26, 2024 15:24

alexrichey linked an issue Sep 4, 2024 that may be closed by this pull request

Opendata: Packaging - Data Dictionary / README generation from metadata #944

Closed

5 tasks

damonmcc force-pushed the dm-package-readme branch from 684eb5c to 34b969c on September 18, 2024 14:27

damonmcc removed a link to an issue Sep 18, 2024

Opendata: Packaging - Data Dictionary / README generation from metadata #944

Closed

5 tasks

damonmcc force-pushed the dm-package-readme branch 4 times, most recently from b498749 to f7bb6cc on September 18, 2024 16:03

damonmcc force-pushed the dm-package-readme branch 4 times, most recently from 27ae351 to 9c06034 on September 23, 2024 14:24

damonmcc force-pushed the dm-package-readme branch 3 times, most recently from c8292bf to a8d30a1 on October 1, 2024 23:38

damonmcc changed the title from "add readme to dataset packages" to "generate styled dataset PDF" Oct 2, 2024

damonmcc force-pushed the dm-package-readme branch from a8d30a1 to 0ecb76e on October 2, 2024 13:42

damonmcc force-pushed the dm-package-readme branch 2 times, most recently from e16f93f to da37bc1 on October 2, 2024 13:52

damonmcc added 2 commits October 2, 2024 12:39

use a conftest file in lifecycle.package tests

c481b07

use TestCase class for data dictionary tests

408f03b

damonmcc force-pushed the dm-package-readme branch from da37bc1 to 4378a19 on October 2, 2024 16:42

damonmcc marked this pull request as ready for review October 2, 2024 16:53

damonmcc assigned fvankrieken and alexrichey Oct 2, 2024

alexrichey reviewed Oct 2, 2024


dcpy/lifecycle/package/generate_metadata_assets.py

    subprocess.run(
        [
-            "pandoc",
+            "weasyprint",

💪

alexrichey reviewed Oct 2, 2024


alexrichey approved these changes Oct 2, 2024


damonmcc added 3 commits October 2, 2024 13:50

combine all data dictionary generation tests

443ad22

fix newly-untested lines

5045202

change in tests seem to have caused "indirect changes" in test coverage

add dcp logo for package pdfs

1f50c24

damonmcc force-pushed the dm-package-readme branch from 4378a19 to d56e5ee on October 2, 2024 17:50

damonmcc commented Oct 2, 2024


damonmcc added 2 commits October 2, 2024 14:00

use weasyprint to generate pdfs from html + css

2979224

better generate_html_from_yaml parameter names

4bd6a8d

damonmcc force-pushed the dm-package-readme branch from d56e5ee to 4bd6a8d on October 2, 2024 18:00

damonmcc merged commit 6322eb2 into main Oct 2, 2024
20 checks passed

damonmcc deleted the dm-package-readme branch October 2, 2024 18:17
