Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve compatibility with pathlib by implementing more common methods #230

Merged
merged 16 commits into from
May 31, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions HISTORY.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,15 @@
# cloudpathlib Changelog

## v0.9.0 (UNRELEASED)
- Added `absolute` to `CloudPath` (does nothing as `CloudPath` is always absolute) ([PR #230](https://github.com/drivendataorg/cloudpathlib/pull/230))
- Added `resolve` to `CloudPath` (does nothing as `CloudPath` is resolved in advance) ([Issue #151](https://github.com/drivendataorg/cloudpathlib/issues/151), [PR #230](https://github.com/drivendataorg/cloudpathlib/pull/230))
- Added `relative_to` to `CloudPath` which returns a PosixPath ([Issue #149](https://github.com/drivendataorg/cloudpathlib/issues/149), [PR #230](https://github.com/drivendataorg/cloudpathlib/pull/230))
- Added `is_relative_to` to `CloudPath` ([Issue #149](https://github.com/drivendataorg/cloudpathlib/issues/149), [PR #230](https://github.com/drivendataorg/cloudpathlib/pull/230))
- Added `is_absolute` to `CloudPath` (always true as `CloudPath` is always absolute) ([PR #230](https://github.com/drivendataorg/cloudpathlib/pull/230))
- Accept and delegate `read_text` parameters to cached file ([PR #230](https://github.com/drivendataorg/cloudpathlib/pull/230))
- Add `exist_ok` parameter to `touch` ([PR #230](https://github.com/drivendataorg/cloudpathlib/pull/230))
- Add `missing_ok` parameter to `unlink`, which defaults to True. This diverges from pathlib to maintain backward compatibility ([PR #230](https://github.com/drivendataorg/cloudpathlib/pull/230))

## v0.8.0 (2022-05-19)

- Fixed pickling of `CloudPath` objects not working. ([Issue #223](https://github.com/drivendataorg/cloudpathlib/issues/223), [PR #224](https://github.com/drivendataorg/cloudpathlib/pull/224))
Expand Down
10 changes: 5 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -119,13 +119,16 @@ Most methods and properties from `pathlib.Path` are supported except for the one

| Methods + properties | `AzureBlobPath` | `S3Path` | `GSPath` |
|:-----------------------|:------------------|:-----------|:-----------|
| `absolute` | ✅ | ✅ | ✅ |
| `anchor` | ✅ | ✅ | ✅ |
| `as_uri` | ✅ | ✅ | ✅ |
| `drive` | ✅ | ✅ | ✅ |
| `exists` | ✅ | ✅ | ✅ |
| `glob` | ✅ | ✅ | ✅ |
| `is_absolute` | ✅ | ✅ | ✅ |
| `is_dir` | ✅ | ✅ | ✅ |
| `is_file` | ✅ | ✅ | ✅ |
| `is_relative_to` | ✅ | ✅ | ✅ |
| `iterdir` | ✅ | ✅ | ✅ |
| `joinpath` | ✅ | ✅ | ✅ |
| `match` | ✅ | ✅ | ✅ |
Expand All @@ -137,8 +140,10 @@ Most methods and properties from `pathlib.Path` are supported except for the one
| `parts` | ✅ | ✅ | ✅ |
| `read_bytes` | ✅ | ✅ | ✅ |
| `read_text` | ✅ | ✅ | ✅ |
| `relative_to` | ✅ | ✅ | ✅ |
| `rename` | ✅ | ✅ | ✅ |
| `replace` | ✅ | ✅ | ✅ |
| `resolve` | ✅ | ✅ | ✅ |
| `rglob` | ✅ | ✅ | ✅ |
| `rmdir` | ✅ | ✅ | ✅ |
| `samefile` | ✅ | ✅ | ✅ |
Expand All @@ -152,19 +157,16 @@ Most methods and properties from `pathlib.Path` are supported except for the one
| `with_suffix` | ✅ | ✅ | ✅ |
| `write_bytes` | ✅ | ✅ | ✅ |
| `write_text` | ✅ | ✅ | ✅ |
| `absolute` | ❌ | ❌ | ❌ |
| `as_posix` | ❌ | ❌ | ❌ |
| `chmod` | ❌ | ❌ | ❌ |
| `cwd` | ❌ | ❌ | ❌ |
| `expanduser` | ❌ | ❌ | ❌ |
| `group` | ❌ | ❌ | ❌ |
| `home` | ❌ | ❌ | ❌ |
| `is_absolute` | ❌ | ❌ | ❌ |
| `is_block_device` | ❌ | ❌ | ❌ |
| `is_char_device` | ❌ | ❌ | ❌ |
| `is_fifo` | ❌ | ❌ | ❌ |
| `is_mount` | ❌ | ❌ | ❌ |
| `is_relative_to` | ❌ | ❌ | ❌ |
| `is_reserved` | ❌ | ❌ | ❌ |
| `is_socket` | ❌ | ❌ | ❌ |
| `is_symlink` | ❌ | ❌ | ❌ |
Expand All @@ -173,8 +175,6 @@ Most methods and properties from `pathlib.Path` are supported except for the one
| `lstat` | ❌ | ❌ | ❌ |
| `owner` | ❌ | ❌ | ❌ |
| `readlink` | ❌ | ❌ | ❌ |
| `relative_to` | ❌ | ❌ | ❌ |
| `resolve` | ❌ | ❌ | ❌ |
| `root` | ❌ | ❌ | ❌ |
| `symlink_to` | ❌ | ❌ | ❌ |
| `with_stem` | ❌ | ❌ | ❌ |
Expand Down
11 changes: 8 additions & 3 deletions cloudpathlib/azure/azblobclient.py
Original file line number Diff line number Diff line change
Expand Up @@ -205,19 +205,24 @@ def _move_file(

return dst

def _remove(self, cloud_path: AzureBlobPath) -> None:
if self._is_file_or_dir(cloud_path) == "dir":
def _remove(self, cloud_path: AzureBlobPath, missing_ok: bool = True) -> None:
file_or_dir = self._is_file_or_dir(cloud_path)
if file_or_dir == "dir":
blobs = [
b.blob for b, is_dir in self._list_dir(cloud_path, recursive=True) if not is_dir
]
container_client = self.service_client.get_container_client(cloud_path.container)
container_client.delete_blobs(*blobs)
elif self._is_file_or_dir(cloud_path) == "file":
elif file_or_dir == "file":
blob = self.service_client.get_blob_client(
container=cloud_path.container, blob=cloud_path.blob
)

blob.delete_blob()
else:
# Does not exist
if not missing_ok:
raise FileNotFoundError(f"File does not exist: {cloud_path}")

def _upload_file(
self, local_path: Union[str, os.PathLike], cloud_path: AzureBlobPath
Expand Down
4 changes: 3 additions & 1 deletion cloudpathlib/azure/azblobpath.py
Original file line number Diff line number Diff line change
Expand Up @@ -47,8 +47,10 @@ def mkdir(self, parents=False, exist_ok=False):
# not possible to make empty directory on blob storage
pass

def touch(self):
def touch(self, exist_ok: bool = True):
if self.exists():
if not exist_ok:
raise FileExistsError(f"File exists: {self}")
self.client._move_file(self, self)
else:
tf = TemporaryDirectory()
Expand Down
2 changes: 1 addition & 1 deletion cloudpathlib/client.py
Original file line number Diff line number Diff line change
Expand Up @@ -100,7 +100,7 @@ def _move_file(
pass

@abc.abstractmethod
def _remove(self, path: BoundedCloudPath) -> None:
def _remove(self, path: BoundedCloudPath, missing_ok: bool = True) -> None:
"""Remove a file or folder from the server.

Parameters
Expand Down
48 changes: 36 additions & 12 deletions cloudpathlib/cloudpath.py
Original file line number Diff line number Diff line change
Expand Up @@ -254,15 +254,13 @@ def __ge__(self, other: Any) -> bool:
return self.parts >= other.parts

# ====================== NOT IMPLEMENTED ======================
# absolute - no cloud equivalent; all cloud paths are absolute already
# as_posix - no cloud equivalent; not needed since we assume url separator
# chmod - permission changing should be explicitly done per client with methods
# that make sense for the client permission options
# cwd - no cloud equivalent
# expanduser - no cloud equivalent
# group - should be implemented with client-specific permissions
# home - no cloud equivalent
# is_absolute - no cloud equivalent; all cloud paths are absolute already
# is_block_device - no cloud equivalent
# is_char_device - no cloud equivalent
# is_fifo - no cloud equivalent
Expand All @@ -273,8 +271,6 @@ def __ge__(self, other: Any) -> bool:
# lchmod - no cloud equivalent
# lstat - no cloud equivalent
# owner - no cloud equivalent
# relative to - cloud paths are absolute
# resolve - all cloud paths are absolute, so no resolving
# root - drive already has the bucket and anchor/prefix has the scheme, so nothing to store here
# symlink_to - no cloud equivalent

Expand Down Expand Up @@ -302,7 +298,7 @@ def mkdir(self, parents: bool = False, exist_ok: bool = False):
pass

@abc.abstractmethod
def touch(self):
def touch(self, exist_ok: bool = True):
"""Should be implemented using the client API to create and update modified time"""
pass

Expand Down Expand Up @@ -472,12 +468,13 @@ def samefile(self, other_path: "CloudPath") -> bool:
# all cloud paths are absolute and the paths are used for hash
return self == other_path

def unlink(self):
def unlink(self, missing_ok=True):
# Note: missing_ok defaults to False in pathlib, but changing the default now would be a breaking change.
pjbull marked this conversation as resolved.
Show resolved Hide resolved
if self.is_dir():
raise CloudPathIsADirectoryError(
f"Path {self} is a directory; call rmdir instead of unlink."
)
self.client._remove(self)
self.client._remove(self, missing_ok)

def write_bytes(self, data: bytes):
"""Open the file in bytes mode, write to it, and close the file.
Expand Down Expand Up @@ -535,20 +532,47 @@ def _dispatch_to_path(self, func, *args, **kwargs):
self._new_cloudpath(_resolve(p)) for p in path_version if _resolve(p) != p.root
)

# when pathlib something else, we probably just want that thing
# when pathlib returns something else, we probably just want that thing
# cases this should include: str, empty sequence, sequence of str, ...
else:
return path_version

def __truediv__(self, other):
if not isinstance(other, (str,)):
raise TypeError(f"Can only join path {repr(self)} with strings.")
if not isinstance(other, (str, PurePosixPath)):
raise TypeError(f"Can only join path {repr(self)} with strings or posix paths.")

return self._dispatch_to_path("__truediv__", other)

def joinpath(self, *args):
return self._dispatch_to_path("joinpath", *args)

def absolute(self):
return self

def is_absolute(self):
return True

def resolve(self, strict=False):
return self

def relative_to(self, other):
# We don't dispatch regularly since this never returns a cloud path (since it is relative, and cloud paths are
# absolute)
if not isinstance(other, CloudPath):
Gilthans marked this conversation as resolved.
Show resolved Hide resolved
raise ValueError(f"{self} is a cloud path, but {other} is not")
if self.cloud_prefix != other.cloud_prefix:
raise ValueError(
f"{self} is a {self.cloud_prefix} path, but {other} is a {other.cloud_prefix} path"
)
return self._path.relative_to(other._path)

def is_relative_to(self, other):
try:
self.relative_to(other)
return True
except ValueError:
return False

@property
def name(self):
return self._dispatch_to_path("name")
Expand Down Expand Up @@ -630,8 +654,8 @@ def stat(self):
def read_bytes(self):
return self._dispatch_to_local_cache_path("read_bytes")

def read_text(self):
return self._dispatch_to_local_cache_path("read_text")
def read_text(self, *args, **kwargs):
return self._dispatch_to_local_cache_path("read_text", *args, **kwargs)

# =========== public cloud methods, not in pathlib ===============
def download_to(self, destination: Union[str, os.PathLike]) -> Path:
Expand Down
11 changes: 8 additions & 3 deletions cloudpathlib/gs/gsclient.py
Original file line number Diff line number Diff line change
Expand Up @@ -196,17 +196,22 @@ def _move_file(self, src: GSPath, dst: GSPath, remove_src: bool = True) -> GSPat

return dst

def _remove(self, cloud_path: GSPath) -> None:
if self._is_file_or_dir(cloud_path) == "dir":
def _remove(self, cloud_path: GSPath, missing_ok: bool = True) -> None:
file_or_dir = self._is_file_or_dir(cloud_path)
if file_or_dir == "dir":
blobs = [
b.blob for b, is_dir in self._list_dir(cloud_path, recursive=True) if not is_dir
]
bucket = self.client.bucket(cloud_path.bucket)
for blob in blobs:
bucket.get_blob(blob).delete()
elif self._is_file_or_dir(cloud_path) == "file":
elif file_or_dir == "file":
bucket = self.client.bucket(cloud_path.bucket)
bucket.get_blob(cloud_path.blob).delete()
else:
# Does not exist
if not missing_ok:
raise FileNotFoundError(f"File does not exist: {cloud_path}")

def _upload_file(self, local_path: Union[str, os.PathLike], cloud_path: GSPath) -> GSPath:
bucket = self.client.bucket(cloud_path.bucket)
Expand Down
4 changes: 3 additions & 1 deletion cloudpathlib/gs/gspath.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,8 +42,10 @@ def mkdir(self, parents=False, exist_ok=False):
# not possible to make empty directory on cloud storage
pass

def touch(self):
def touch(self, exist_ok: bool = True):
if self.exists():
if not exist_ok:
raise FileExistsError(f"File exists: {self}")
self.client._move_file(self, self)
else:
tf = TemporaryDirectory()
Expand Down
9 changes: 7 additions & 2 deletions cloudpathlib/local/localclient.py
Original file line number Diff line number Diff line change
Expand Up @@ -96,8 +96,11 @@ def _move_file(
shutil.copy(self._cloud_path_to_local(src), self._cloud_path_to_local(dst))
return dst

def _remove(self, cloud_path: "LocalPath") -> None:
def _remove(self, cloud_path: "LocalPath", missing_ok: bool = True) -> None:
local_storage_path = self._cloud_path_to_local(cloud_path)
if not missing_ok and not local_storage_path.exists():
raise FileNotFoundError(f"File does not exist: {cloud_path}")

if local_storage_path.is_file():
local_storage_path.unlink()
elif local_storage_path.is_dir():
Expand All @@ -121,8 +124,10 @@ def _stat(self, cloud_path: "LocalPath") -> os.stat_result:
)
)

def _touch(self, cloud_path: "LocalPath") -> None:
def _touch(self, cloud_path: "LocalPath", exist_ok: bool = True) -> None:
local_storage_path = self._cloud_path_to_local(cloud_path)
if local_storage_path.exists() and not exist_ok:
raise FileExistsError(f"File exists: {cloud_path}")
local_storage_path.parent.mkdir(exist_ok=True, parents=True)
local_storage_path.touch()

Expand Down
4 changes: 2 additions & 2 deletions cloudpathlib/local/localpath.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,5 +28,5 @@ def stat(self):
)
return meta

def touch(self):
self.client._touch(self)
def touch(self, exist_ok: bool = True):
self.client._touch(self, exist_ok)
5 changes: 4 additions & 1 deletion cloudpathlib/s3/s3client.py
Original file line number Diff line number Diff line change
Expand Up @@ -226,7 +226,7 @@ def _move_file(self, src: S3Path, dst: S3Path, remove_src: bool = True) -> S3Pat
self._remove(src)
return dst

def _remove(self, cloud_path: S3Path) -> None:
def _remove(self, cloud_path: S3Path, missing_ok: bool = True) -> None:
try:
obj = self.s3.Object(cloud_path.bucket, cloud_path.key)

Expand All @@ -250,6 +250,9 @@ def _remove(self, cloud_path: S3Path) -> None:
# resp will be [], so no need to check success
if resp:
assert resp[0].get("ResponseMetadata").get("HTTPStatusCode") == 200
else:
if not missing_ok:
raise FileNotFoundError(f"File does not exist: {cloud_path}")

def _upload_file(self, local_path: Union[str, os.PathLike], cloud_path: S3Path) -> S3Path:
obj = self.s3.Object(cloud_path.bucket, cloud_path.key)
Expand Down
4 changes: 3 additions & 1 deletion cloudpathlib/s3/s3path.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,8 +42,10 @@ def mkdir(self, parents=False, exist_ok=False):
# not possible to make empty directory on s3
pass

def touch(self):
def touch(self, exist_ok: bool = True):
if self.exists():
if not exist_ok:
raise FileExistsError(f"File exists: {self}")
self.client._move_file(self, self)
else:
tf = TemporaryDirectory()
Expand Down
2 changes: 1 addition & 1 deletion requirements-dev.txt
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ mike
mkdocs>=1.2.2
mkdocs-jupyter
mkdocs-material>=7.2.6
mkdocstrings>=0.15
mkdocstrings[python]>=0.19.0
mypy
pandas
pillow
Expand Down
2 changes: 1 addition & 1 deletion setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -60,5 +60,5 @@ def load_requirements(path: Path):
"Source Code": "https://github.com/drivendataorg/cloudpathlib",
},
url="https://github.com/drivendataorg/cloudpathlib",
version="0.8.0",
version="0.9.0",
)
5 changes: 5 additions & 0 deletions tests/mock_clients/mock_s3.py
Original file line number Diff line number Diff line change
Expand Up @@ -179,10 +179,15 @@ def limit(self, n):
return self.s3_obj_paths[:n]

def delete(self):
any_deleted = False
for p in self.full_paths:
if Path(p).exists():
any_deleted = True
Path(p).unlink()
delete_empty_parents_up_to_root(Path(p), self.root)

if not any_deleted:
return []
return [{"ResponseMetadata": {"HTTPStatusCode": 200}}]


Expand Down
8 changes: 7 additions & 1 deletion tests/test_cloudpath_file_io.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,13 @@ def test_file_discovery(rig):
assert not p2.exists()
p2.touch()
assert p2.exists()
p2.unlink()
p2.touch(exist_ok=True)
with pytest.raises(FileExistsError):
p2.touch(exist_ok=False)
p2.unlink(missing_ok=False)
p2.unlink(missing_ok=True)
with pytest.raises(FileNotFoundError):
p2.unlink(missing_ok=False)

p3 = rig.create_cloud_path("dir_0/")
assert p3.exists()
Expand Down
Loading