Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: resolve relative datasample spec path #392

Merged
merged 7 commits into from
Nov 2, 2023
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added

- Paths are now resolved on DatasampleSpec objects. Which means that users can pass relative paths ([#392](https://github.com/Substra/substra/pull/392))

## [0.49.0](https://github.com/Substra/substra/releases/tag/0.49.0) - 2023-10-18

### Added
Expand Down
19 changes: 18 additions & 1 deletion substra/sdk/schemas.py
Original file line number Diff line number Diff line change
Expand Up @@ -147,14 +147,31 @@ def is_many(self):
return self.paths and len(self.paths) > 0

@pydantic.model_validator(mode="before")
def exclusive_paths(cls, values): # noqa: N805
def exclusive_paths(cls, values: typing.Any) -> typing.Any: # noqa: N805
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you remove the# noqa: N805?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nop, its because it does not like that the first arg is not "self" :/

Copy link
Contributor

@guilhem-barthes guilhem-barthes Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this method receive a class as first argument, pydantic recommends decorate with @classmethod for a proper type checking (model_validator)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice thx :)

"""Check that one and only one path(s) field is defined."""
if "paths" in values and "path" in values:
raise ValueError("'path' and 'paths' fields are exclusive.")
if "paths" not in values and "path" not in values:
raise ValueError("'path' or 'paths' field must be set.")
Comment on lines 153 to 156
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could they define both keys but with value None?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get what you're refering too. Could you provide an example plz ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we allow None as value on these fields, therefore if a user explicitely define only one of this key to None (e.g. DataSampleSpec(path=None, data_manager_keys=[]), the check "paths" not in values and "path" not in values will return False (as the key is present, but the value is None).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see. I added some tests, and it appears that this case is dealts with in the resolve path, as None will raise a TypeError. This is not an explicit solution but I added the tests to ensure that this behavior is taken into account. What do you think ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if TypeError is the clearest for the user, but maybe it is worth opening its own PR to review how we manage this edge case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep True... I rm the tests, and +1 to deal with this in a dedicated PR

return values

@pydantic.model_validator(mode="before")
def resolve_paths(cls, values: typing.Any) -> typing.Any: # noqa: N805
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you remove the # noqa: N805? It should be handled by the typing

"""Resolve given path is relative."""
if "paths" in values:
paths = []
for path in values["paths"]:
path = pathlib.Path(path)
paths.append(path.resolve())

values["paths"] = paths

elif "path" in values:
path = pathlib.Path(values["path"])
values["path"] = path.resolve()

Copy link

@jeandut jeandut Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else: values["path"]=path

return values

@contextlib.contextmanager
def build_request_kwargs(self, local):
# redefine kwargs builder to handle the local paths
Expand Down
20 changes: 20 additions & 0 deletions tests/sdk/test_schemas.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
import pathlib
import uuid

import pytest

from substra.sdk.schemas import DataSampleSpec


@pytest.mark.parametrize("path", [pathlib.Path() / "data", "./data", pathlib.Path().cwd() / "data"])
def test_datasample_spec_resolve_path(path):
datasample_spec = DataSampleSpec(path=path, data_manager_keys=[str(uuid.uuid4())])

assert datasample_spec.path == pathlib.Path().cwd() / "data"


def test_datasample_spec_resolve_paths():
paths = [pathlib.Path() / "data", "./data", pathlib.Path().cwd() / "data"]
datasample_spec = DataSampleSpec(paths=paths, data_manager_keys=[str(uuid.uuid4())])

assert all([path == pathlib.Path().cwd() / "data" for path in datasample_spec.paths])
Loading