Fix path metadata #686

Merged
merged 2 commits into from
Feb 9, 2024

Conversation

elijahbenizzy
Collaborator

Fixes path metadata for s3/other URLs.

Changes

How I tested this

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

elijahbenizzy force-pushed the fix-path-metadata branch 2 times, most recently from 8345e45 to fc31a71 (February 8, 2024 21:12)
elijahbenizzy requested a review from skrawcz (February 8, 2024 21:14)
if isinstance(path, Path):
    path = str(path)
parsed = parse.urlparse(path)
size = 0
Collaborator

This should be None or -1; 0 could be a real value.

Collaborator Author

Fair, I think None is probably fine
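
To illustrate the point about 0 being a real value, here is a minimal sketch (not code from this PR) showing that an empty local file legitimately reports a size of 0, so 0 cannot also mean "size unknown". The variable names are hypothetical.

```python
import os
import tempfile

# Create an empty temporary file and measure it.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    empty_path = handle.name
try:
    real_size = os.path.getsize(empty_path)  # a genuine size of 0 bytes
finally:
    os.remove(empty_path)

# None is an unambiguous "size unknown" marker; it never collides with a real size.
unknown_size = None
size_is_known = unknown_size is not None
```

With None as the default, a caller can tell "empty file" apart from "metadata not available" with a simple `is None` check.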

Collaborator

@skrawcz skrawcz left a comment

can add test that checks scheme & notes

Pandas and other tools can read from remote locations such as s3. Our
get_file_metadata currently breaks on such paths. This fixes it by adding
default values plus a note saying that metadata is not supported. We will want
to add more metadata gatherer plugins later, but that is not necessary right now.
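
The behavior described in the commit message can be sketched roughly as follows. This is an illustration under assumptions, not the merged implementation: the function name `sketch_file_metadata`, the returned keys (`path`, `scheme`, `size`, `notes`), and the note wording are all hypothetical. Only the `isinstance`/`urlparse` opening and the idea of defaults plus a note come from this thread.

```python
import os
from pathlib import Path
from typing import Any, Dict, Union
from urllib import parse


def sketch_file_metadata(path: Union[str, Path]) -> Dict[str, Any]:
    """Hypothetical stand-in for get_file_metadata; key names are assumptions."""
    if isinstance(path, Path):
        path = str(path)
    parsed = parse.urlparse(path)
    size = None  # default: unknown (None, since 0 could be a real size)
    notes = ""
    if parsed.scheme in ("", "file"):
        # Local path: the filesystem can be queried directly.
        local_path = parsed.path if parsed.scheme == "file" else path
        try:
            size = os.path.getsize(local_path)
        except OSError:
            notes = "Could not read metadata for local path."
    else:
        # Remote URL (for example s3): keep the defaults and record why.
        notes = f"File metadata is not supported for scheme: {parsed.scheme}"
    return {"path": path, "scheme": parsed.scheme, "size": size, "notes": notes}
```

A test along the lines suggested in the review could then assert that `sketch_file_metadata("s3://bucket/data.csv")` reports the scheme `s3`, a size of None, and a non-empty note.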
elijahbenizzy merged commit 720e79b into main (Feb 9, 2024)
22 checks passed
elijahbenizzy deleted the fix-path-metadata branch (February 9, 2024 06:01)