[#5507] feat(python): Support Azure blob storage for GVFS python client #5538

Open
wants to merge 16 commits into
base: main

Conversation

yuqi1129
Contributor

What changes were proposed in this pull request?

Add support in the GVFS Python client for accessing Azure Blob Storage (ABS) filesets.

Why are the changes needed?

This is a follow-up PR to #5508.

Fix: # (issue)

Does this PR introduce any user-facing change?

N/A

How was this patch tested?

Integration tests (IT) were run locally.

@yuqi1129
Contributor Author

This PR is not ready for review until #5508 is merged.

@yuqi1129 self-assigned this on Nov 12, 2024
@yuqi1129 changed the title from "[#5507] feat(python): Support Azure block storage for Gravitino server and GVFS Java client" to "[#5507] feat(python): Support Azure blob storage for Gravitino server and GVFS Java client" on Nov 12, 2024
@yuqi1129 changed the title from "[#5507] feat(python): Support Azure blob storage for Gravitino server and GVFS Java client" to "[#5507] feat(python): Support Azure blob storage for GVFS python client" on Nov 12, 2024
@jerryshao
Contributor

@yuqi1129 can you please rebase this PR?

ops = infer_storage_options(storage_location)
if "username" not in ops or "host" not in ops or "path" not in ops:
    raise GravitinoRuntimeException(
        f"Storage location:{storage_location} doesn't support now. as the username,"
Contributor

"Username, host...are required"
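
To make the check under discussion concrete, here is a minimal sketch of validating that a storage location carries a username (the container), a host (the account endpoint), and a path. It is not the PR's code: it swaps in the standard library's `urllib.parse.urlsplit` for `infer_storage_options`, raises a plain `ValueError`, and the function name `parse_container_location` and the `abfss` example URL are assumptions made for illustration. The error wording follows the reviewer's suggestion that the message say which parts are required.

```python
from urllib.parse import urlsplit


def parse_container_location(storage_location: str) -> dict:
    """Split a 'scheme://container@account-host/path' location into its parts.

    Hypothetical helper: the name and the ValueError are illustrative only.
    """
    parts = urlsplit(storage_location)
    ops = {
        "username": parts.username,  # container name, the part before '@'
        "host": parts.hostname,      # storage account endpoint
        "path": parts.path,          # path inside the container
    }
    # Collect every missing component so the message names all of them.
    missing = [key for key, value in ops.items() if not value]
    if missing:
        raise ValueError(
            f"Storage location {storage_location!r} is not supported: "
            f"{', '.join(missing)} required."
        )
    return ops
```

A location without the `container@` part fails the check with a message listing `username` as the missing component, instead of a generic "not supported" error.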

actual_prefix = f"{StorageType.ABS.value}://{ops['username']}@{ops['host']}{ops['path']}"

# For ABS, the actual path should be the same as the virtual path is like
# 'wasbs//bucket1@xiaoyu123.blob.core.windows.net/test_gvfs_catalog6588/test_gvfs_schema/
Contributor

Why do we specify "wasbs" here? Also, what does this comment mean?

@@ -801,6 +818,12 @@ def _strip_storage_protocol(storage_type: StorageType, path: str):
if storage_type == StorageType.LOCAL:
    return path[len(f"{StorageType.LOCAL.value}:") :]

## We need to remove the protocol and host from the path for instance
# 'wsabs://container@account/path' to 'container/path'
Contributor

Why "wasbs" here?
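
For reference, the transformation the code comment describes ('container@account/path' to 'container/path') can be sketched independently of any particular scheme name. This is a minimal illustration under stated assumptions, not the PR's implementation: `strip_container_protocol` is a hypothetical name, parsing uses the standard library's `urllib.parse.urlsplit`, and the `abfss` scheme in the usage line is only an example value.

```python
from urllib.parse import urlsplit


def strip_container_protocol(path: str) -> str:
    """Turn 'scheme://container@account/path' into 'container/path'.

    Hypothetical helper for illustration; the scheme itself is ignored,
    so the same logic applies whichever protocol prefix is used.
    """
    parts = urlsplit(path)
    if not parts.username:
        # Without 'container@' there is nothing to build the result from.
        raise ValueError(f"No container found in path: {path!r}")
    # parts.path keeps its leading '/', so plain concatenation is enough.
    return f"{parts.username}{parts.path}"


print(strip_container_protocol("abfss://container@account/path"))
```

Because the scheme is discarded during parsing, the comment in the code only needs to name whichever protocol `StorageType.ABS` actually maps to.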


@unittest.skipUnless(azure_abs_is_prepared(), "Azure Blob Storage is not prepared.")
class TestGvfsWithABS(TestGvfsWithHDFS):
# Before running this test, please set the make sure aliyun-azure-x.jar has been
Contributor

Why aliyun?

)

self.check_mkdir(modified_dir, modified_actual_dir, fs)
# S3 only supports getting the `object` modify time, so the modified time will be None
Contributor

Please check the code and comments thoroughly to avoid copy-paste typos.
