Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix kedro-telemetry dependency breaking packaged projects without pyproject.toml #84

Merged
merged 6 commits into from
Dec 7, 2022

Conversation

jmholzer
Copy link
Contributor

@jmholzer jmholzer commented Nov 28, 2022

Description

Resolves #83.

Development notes

Currently, when kedro-telemetry is installed in a packaged project that is deployed without a pyproject.toml file, a RuntimeError is thrown which prevents Kedro from being used at all in this way. It is currently necessary to read pyproject.toml to get three pieces of information about the project, which are then sent to heap:

  • Package name
  • Project name
  • Project version

In the case that no pyproject.toml is present, I believe there is no other guaranteed 'source of truth' for these values. Therefore, my approach is to set these values to the string undefined in this case.

Instead of using pyproject.toml to find these values, this PR makes the following changes:

  • Package name is found using the import from kedro.framework.project import PACKAGE_NAME.
  • Project name is a leftover that is not important to store if we already know the package name and since there is currently no easy way of accessing this value, it is discarded.
  • Project version is found using the import from kedro import __version__. This is the version of Kedro being used to run the project, which I feel is more important than the version of Kedro used to create the project. Feedback would be appreciated on this point.

pyproject.toml is therefore no longer read at all.

Note: in the case that kedro.framework.project.PACKAGE_NAME is None, hashed_package_name will be set to undefined. I am not sure when this would ever happen, but I think it would be better to have some value here that will at least stop the plugin from breaking the project if this case ever occurs.

Checklist

  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the relevant RELEASE.md file
  • Added tests to cover my changes

…m dependencies

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
@jmholzer jmholzer changed the title Add RuntimeError handling to _get_project_metadata call and downstrea… Fix kedro-telemetry dependency breaking packaged projects without pyproject.toml Nov 28, 2022
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
@jmholzer jmholzer self-assigned this Nov 28, 2022
@noklam noklam self-requested a review November 29, 2022 10:23
Comment on lines 143 to 152
def _get_project_properties(
hashed_username: str, project_metadata: ProjectMetadata
hashed_username: str, project_metadata: Optional[ProjectMetadata] = None
) -> Dict:

hashed_package_name = _hash(project_metadata.package_name)
hashed_project_name = _hash(project_metadata.project_name)
project_version = project_metadata.project_version
if project_metadata:
hashed_package_name = _hash(project_metadata.package_name)
hashed_project_name = _hash(project_metadata.project_name)
project_version = project_metadata.project_version
else:
hashed_package_name = hashed_project_name = project_version = "undefined"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As Ivan mentioned yesterday, it may be possible to get these metadata via kedro.framework.project. package_name and project_name are almost always the same.

Another out-of-scope question is, does the KedroCLITelemetry work with the packaged project then?

…roject_name

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
…e imported

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
@jmholzer jmholzer marked this pull request as ready for review November 29, 2022 16:18
@noklam
Copy link
Contributor

noklam commented Nov 29, 2022

Project version is found using the import from kedro import version. This is the version of Kedro being used to run the project, which I feel is more important than the version of Kedro used to create the project. Feedback would be appreciated on this point.

To be more precise, this is supposed to be the version of the "starter" used. This is a good point though maybe the version of Kedro being used is more useful. In general, I would be ok with using either definition, but it should be consistent.

@jmholzer
Copy link
Contributor Author

In general, I would be ok with using either definition, but it should be consistent.

You're right about this, @yetudada it would be great to hear your thoughts on whether we should upload either:

  1. the version of the starter used to create a Kedro project or
  2. the version of Kedro being used to run the Kedro project

to Heap. The context for this question is that currently option 1 is implemented, though option 2. is easier and safer and is currently what this PR will do.

@merelcht
Copy link
Member

merelcht commented Nov 29, 2022

The version used to create the project and the version of Kedro being used to run the project are the same. If you used an older version of Kedro to create the project e.g. 0.17.7 and then try running it with 0.18.3 you will get an error. You will then either need to downgrade your version of Kedro or upgrade the project to make it compatible with 0.18.3.

@jmholzer
Copy link
Contributor Author

The version used to create the project and the version of Kedro being used to run the project are the same.

Hmm, what about the case where the starter is e.g. 0.18.2 and the version of Kedro that is installed is 0.18.3? This is the case locally in one of my projects and I think it is valid.

@merelcht
Copy link
Member

Hmm, what about the case where the starter is e.g. 0.18.2 and the version of Kedro that is installed is 0.18.3? This is the case locally in one of my projects and I think it is valid.

Ah my bad! I thought the check was more strict and checking for a full version match.

In that case I agree with you that we should just send the version of Kedro that's being used to run the project.

Copy link
Contributor

@noklam noklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this. I think using the Kedro version is fine!

Copy link
Contributor

@SajidAlamQB SajidAlamQB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes a lot of sense thanks for looking into this. 🌟

@merelcht
Copy link
Member

merelcht commented Dec 5, 2022

Don't forget to update the release notes with this fix! ✍️

@jmholzer jmholzer merged commit 1c17421 into main Dec 7, 2022
@jmholzer jmholzer deleted the fix/telemetry-breaks-packaged-projects branch December 7, 2022 12:19
jmholzer added a commit that referenced this pull request Dec 7, 2022
jmholzer added a commit that referenced this pull request Dec 7, 2022
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
jmholzer added a commit that referenced this pull request Dec 7, 2022
* Update RELEASE.md for changes in #84

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Fix tense

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Update version to 0.2.3

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Kedro Telemetry breaks packaged projects due to wrongly assuming pyproject.toml exists
4 participants