Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modularized client for state management #114

Merged
merged 11 commits into from
Jun 21, 2024

Conversation

RulerOfCakes
Copy link
Contributor

This PR adds a Manager class to act as a state encapsulation object for managing the state of micropip in a local Manager instance.

The initial goal of this PR is to have the exposed commands API be usable through the Manager, enabling having multiple instances of it with different local state. The current global API will remain supported, but the 'global state' should be independent of other alive Manager instances.

Copy link
Member

@ryanking13 ryanking13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @RulerOfCakes!

def uninstall(self):
pass

def set_index_urls(self, urls: List[str] | str):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def set_index_urls(self, urls: List[str] | str):
def set_index_urls(self, urls: list[str] | str):

from Python 3.9, standard types can be used as type helpers.

Copy link
Contributor Author

@RulerOfCakes RulerOfCakes Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was because of the name conflict with the list() method, unlike the _list export from _commands.list.py I couldn't alias the method name to something else. Both my LSP and mypy won't let me make this change, but i'm not sure how I can avoid the warning from mypy for using List. Any suggestions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, that's funny. It was probably a bad decision to call it list at the first time. Anyway, it's not an issue. So let's keep this as-is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment next to the from typing import List please!

self.index_urls = package_index.DEFAULT_INDEX_URLS

# TODO: initialize the compatibility layer
self.repodata_packages: dict[str, dict[str, Any]] = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repodata is a legacy name. We used a lock file called repodata.json, but we've changed its name to pyodide-lock.json, so it would probably be better to rename this variable too.

(no action required in this PR)

@RulerOfCakes
Copy link
Contributor Author

I presume the next logical step would be to encapsulate the compatibility layer states(repodata_packages, repodata_info) within the Manager class's initialization and inject them in the dependent remaining commands? I initially had left this PR as draft because I didn't want to make too many changes at once before replacing the global state with a default Manager instance.

Copy link
Member

@ryanking13 ryanking13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encapsulate the compatibility layer states(repodata_packages, repodata_info) within the Manager class's initialization and inject them in the dependent remaining commands?

Yes, that would be great for the next step. I am okay with merging this PR with this dummy class so you can continue next steps (please check the pre-commit issue though).

I think micropip.freeze can be a good candidate for the first command to implement. It only relies on two compatibility objects: repodata_packages and repodata_info.

micropip/manager.py Outdated Show resolved Hide resolved
@RulerOfCakes RulerOfCakes marked this pull request as ready for review June 18, 2024 15:58
@RulerOfCakes RulerOfCakes changed the title Draft: Modularized client Modularized client for state management Jun 18, 2024
Copy link
Member

@hoodmane hoodmane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @RulerOfCakes seems like a reasonable starting point.

from micropip import package_index


class Manager:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we like the name Manager? It's a bit vague but seems alright to me. We should consider whether we can come up with something better. @jaraco maybe you have some thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love "Manager", but it's adequate.

I notice below the repodata_ prefix on a couple of member variables, suggesting that maybe this object is representing RepoData or State or RepositoryState or Repository or Repo (and maybe the repodata_ prefix can be removed). I also see that "repodata" is a legacy artifact and "lock" is relevant, so maybe LockState or some variant.

Since I see actions about .install, and .freeze, I'm thinking this is relevant to an Environment or Site.

I'm liking Repository or Environment best, but I'd be okay with any option. Feel free to pick/promote one you like best.

Am I right in thinking this class is an internal implementation detail (not part of a public API)? If so, the naming isn't critical and can be changed later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thoughts were that eventually this class will become representative of the current public API, with additional extensibility such as dependency injection as discussed in #112, although I would guess having a separate exposed layer and keeping the Manager internal is also a feasible choice. (Assuming that the most common case of a user wanting to instantiate their own Manager without accessing the current public APIs would be to utilize such extensibility)

I was thinking that Client or Micropip could suit it better but Manager seemed closest to what it was going to do: state management. I do agree its a vague name, and Repository also sounds alright too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I plan to make this class public. People would do mgr = micropip.Manager(); mgr.install(...).

I prefer Manager or Environment. The former emphasizes that this is a package manager of sorts, while the latter recognizes that each instance is like a virtual environment of sorts.

Let's go with Manager for now. We still have time before implementing all these APIs and making this class public.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call it PackageManager then? or ???Environment? I feel like we could use an extra word in the name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PackageManager sounds good to me.

Comment on lines 10 to 11
Each Manager instance holds its own local state that is
independent of other instances, including the global state.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

including the global state.

Well it's not really global then. It's a bit unclear how this works since the result of installing stuff with micropip is generally files placed in the file system. I guess different Manager instances can have different install directories?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant to phrase it in a way that the global manager instance(representing the 'global state' for the exposed commands) is also included in the statement, but maybe it isn't necessary to specify it as such.

I would guess that the 'right way' to keep things independent would be to also have different install directories, but I'm not sure how nicely that would play out with sitepackages, or perhaps a mechanism to prevent managers having duplicate install directories. Could you provide your opinion on this? cc: @ryanking13

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess different Manager instances can have different install directories?

Ideally, we may try to have each instance behave completely independently as if each instance becomes a virtual environment. For example, by combining micropip and subinterpreter, we may be able to support different interpreters to install and run different sets of packages.

However, I think most of the time, people will end up using one instance, and we would recommend doing so, too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super useful for testing purposes but probably not as useful for real world stuff. How does importlib manage this? It seems to me that it depends on sys.path and sys.meta_path for everything so it can't handle introspecting multiple environments without monkeypatching these. Presumably there are pypi packages that can inspect virtual environments that aren't active.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know importlib.metadata.distributions() takes an optional path argument, which defaults to sys.path, allowing a caller to inspect metadata for a set of paths. That could be pointed at a site-packages for a virtualenv to get the metadata for those packages, but it won't honor other paths that might get added by the site config (such as a venv --system-site-packages or a .pth file in the site-packages that might extend the path). In my experience, the only technique I've seen for inspecting the true packaging environment for an interpreter is to launch that interpreter (in a subprocess) and then inspect that interpreter's environment.

That said, I know recent versions of tox have been doing some innovative things around detecting installed packages (and diffing them against an install spec), so there may be some good ideas in that codebase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super useful for testing purposes but probably not as useful for real world stuff.

My main goal of making micropip modular is to allow us (or others) to customize some of its behavior. It's nice to have each instance behave independently, but it's not really something I'm very interested in right now (of course we can benefit from it in testing purpose).

def uninstall(self):
pass

def set_index_urls(self, urls: List[str] | str):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment next to the from typing import List please!

@ryanking13
Copy link
Member

Could you please check the linter issues? I think we can just add #noqa : ... for now.

micropip/_commands/install.py:133: error: Argument "ctx" to "Transaction" has incompatible type "Environment"; expected "dict[str, str]"  [arg-type]

This one is not related to your PR but to packaging update I guess. I think we can ignore it for now too.

@ryanking13
Copy link
Member

Thanks @RulerOfCakes!

@ryanking13 ryanking13 merged commit e107d89 into pyodide:main Jun 21, 2024
5 checks passed
@hoodmane
Copy link
Member

Thanks @RulerOfCakes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants