Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support pathlib.Path values for dvcyaml argument in Live(). #842

Merged
merged 2 commits into from
Sep 19, 2024

Conversation

AbdullahMakhdoom
Copy link
Contributor

…cyaml` argument.

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

@@ -82,7 +82,7 @@ def __init__(
resume: bool = False,
report: Literal["md", "notebook", "html", None] = None,
save_dvc_exp: bool = True,
dvcyaml: Optional[str] = "dvc.yaml",
dvcyaml: Optional[str, Path] = "dvc.yaml",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
dvcyaml: Optional[str, Path] = "dvc.yaml",
dvcyaml: Optional[str, os.Pathlike[str]] = "dvc.yaml",

Use os.PathLike[str], which is more generic than Path.

Copy link
Contributor Author

@AbdullahMakhdoom AbdullahMakhdoom Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @skshetry , thanks for reviewing this.
So this PR fixes this issue, for dvcyaml argument to support path.Pathlib value. Will the use of os.PathLike[str] be more appropriate here instead of pathlib.Path?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AbdullahMakhdoom Yes, I think the suggestion from @skshetry is just a best practice for typing hinting paths. For example, a PurePath will fail against pathlib.Path but not against os.PathLike[str]. Everything else in the PR LGTM.

@AbdullahMakhdoom
Copy link
Contributor Author

Fixes Issue#769

@@ -265,10 +265,14 @@ def _init_dvc(self): # noqa: C901
self._include_untracked.append(self.dir)

def _init_dvc_file(self) -> str:
if isinstance(self._dvcyaml, Path):
self._dvcyaml = str(self._dvcyaml)
if isinstance(self._dvcyaml, str):
if os.path.basename(self._dvcyaml) == "dvc.yaml":
return self._dvcyaml
raise InvalidDvcyamlError
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we drop this line if we are adding the exception below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, makes sense.

@@ -220,12 +221,13 @@ def test_get_exp_name_duplicate(tmp_dir, mocked_dvc_repo, mocker, caplog):
assert msg in caplog.text


def test_test_mode(tmp_dir, monkeypatch, mocked_dvc_repo):
@pytest.mark.parametrize("dvcyaml_path", ["dvc.yaml", Path("dvcyaml/dvc.yaml")])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not getting back to you on this earlier, but I would vote to put this in a separate test in tests/test_make_dvcyaml.py. It's probably enough to add a parameter here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this makes sense.

@dberenbaum
Copy link
Collaborator

Thanks @AbdullahMakhdoom!

@shcheklein
Copy link
Member

@AbdullahMakhdoom it seems mypy failed, could you please take a look. Thanks!

@codecov-commenter
Copy link

codecov-commenter commented Sep 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.19%. Comparing base (602c053) to head (838d9d9).
Report is 33 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #842      +/-   ##
==========================================
- Coverage   95.47%   95.19%   -0.29%     
==========================================
  Files          57       55       -2     
  Lines        3889     3909      +20     
  Branches      353      350       -3     
==========================================
+ Hits         3713     3721       +8     
- Misses        124      138      +14     
+ Partials       52       50       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

return "dvc.yaml"
if self._dvcyaml is None:
return "dvc.yaml"
if isinstance(self._dvcyaml, bool):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, so is it something new? do we want to support bool as well?

we need to update docs then?

what happens if it's False (sounds like dvc.yaml should not be produced at all then?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On True/False/None, it defaults to "dvc.yaml".
I had to make this change to pass an already written test case, which testsTrue, False values to dvcyaml argument.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AbdullahMakhdoom could you please point me to that test?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shcheklein, you can read the docstring for dvcyaml.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, so:

                If `False`, DVCLive will not write to "dvc.yaml" (useful if you are
                tracking DVCLive metrics, plots, and parameters independently and
                want to avoid duplication).

it means we can't default to dvc.yaml - we should probably not running this function at all then if it's False.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shcheklein The code is maybe a bit of a mess here, but this sets self._dvcfile, which shouldn't matter for whether the dvc.yaml file is written. The code still checks self._dvcyaml to decide whether to write dvc.yaml here. This is tested here:

def test_make_dvcyaml_false(tmp_dir, mocker):
dvclive = Live("logs", dvcyaml=False)
dvclive.end()

It shouldn't matter much if self._dvcfile is set in this scenario, but the slight justification for still setting it to dvc.yaml is that if someone has set Live(dvcyaml=False) but then manually calls make_dvcyaml(), it will still write to dvc.yaml.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear why using False is necessary here, as setting dvcyaml to None appears to produce the same outcome. Both tests ensure that the dvc.yaml file is not created when dvcyaml is either False or None, indicating that these options are functionally equivalent in this context.


def test_make_dvcyaml_false(tmp_dir, mocker):
    dvclive = Live("logs", dvcyaml=False)
    dvclive.end()

    assert not os.path.exists("dvc.yaml")


def test_make_dvcyaml_none(tmp_dir, mocker):
    dvclive = Live("logs", dvcyaml=None)
    dvclive.end()

    assert not os.path.exists("dvc.yaml")
    

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, apologize for the delayed responses, I've been quite busy with my day job. Please let me know the best way to move forward, and I'll make sure to do my best.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem @AbdullahMakhdoom, thanks for keeping on top of this!

dvcyaml used to be a bool argument and support for bools was left to avoid making a breaking change. We should probably have dropped this during the last major release but missed it.

src/dvclive/live.py Outdated Show resolved Hide resolved
@shcheklein
Copy link
Member

@skshetry please approve and merge if everything looks good to you.

@skshetry
Copy link
Member

Apologies for multiple approval. GH bugged out. :(

if self._dvcyaml is None:
return "dvc.yaml"
if isinstance(self._dvcyaml, bool):
return "dvc.yaml"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we okay with returning dvc.yaml in case of dvcyaml=False?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is unrelated to this PR. We were already doing that.

Copy link
Contributor Author

@AbdullahMakhdoom AbdullahMakhdoom Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skshetry, as pointed out by @dberenbaum, "the slight justification for still setting it to dvc.yaml is that if someone has set Live(dvcyaml=False) but then manually calls make_dvcyaml(), it will still write to dvc.yaml. "dvc.yaml" is being returned ".
Also, there was a test case for backward compatability of accepting bool values of dvc_yaml argument.

@skshetry skshetry merged commit f69d198 into iterative:main Sep 19, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants