Skip to content
This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

SEP - Deprecate, and remove, the usage of the __utils__ loader #66

Closed
wants to merge 4 commits into from
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
50 changes: 47 additions & 3 deletions 0065-drop-dunder-utils.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,14 @@ Deprecate and remove all usage of the ``__utils__`` Salt loader.
# Motivation
[motivation]: #motivation

## Clarification
[clarification]: #clarification

Even though I'm part of the core team of Salt, I'm proposing this SEP as an individual
and not as part of Salt.
The reason being that I'm a strong opponent to ``__utils__`` and that should not be
confused as the view of the whole Salt core team.

## Why are we doing this? What use cases does it support? What is the expected outcome?

The Salt loader is a **controlled** hack around Python's import system which allows
Expand Down Expand Up @@ -42,6 +50,10 @@ See the following for background:

Just to name a few.

Some time ago, [Max Arnold](https://github.com/max-arnold) also compiled a
[list of issue](https://salt.tips/patching-salt-modules/#the-dreaded-utils-namespace)
related to this matter.

For additional context, a python module which should be loaded by the Salt loader
should implement functions, and that's it. In rare cases, some of those modules
might be defining some helper classes, but it's advisable to move that helper
Expand All @@ -61,7 +73,11 @@ were not meant to load.

The implementation of this SEP should be done by:

* Create a way to deprecate the ``__utils__`` loader usage without breaking
existing code. Being addressed in [#62222](https://github.com/saltstack/salt/issues/62222).
* Deprecate the ``__utils__`` loader and issue warnings about that usage.
Being a huge task, a [project](https://github.com/orgs/saltstack/projects/40)
was created to aggregate all of the work required.
* Create an automated way to remove the ``__utils__`` loader usage and prevent
from said usage to creep in. Done in
[salt-rewrite](https://github.com/s0undt3ch/salt-rewrite/blob/2.0.0/src/saltrewrite/salt/fix_dunder_utils.py).
Expand All @@ -85,20 +101,48 @@ There are some things which we just can't control, see:
## Unresolved questions
[unresolved]: #unresolved-questions

None so far.
### Many others may also be using ``salt://_utils/``. That can't be replaced with a simple import. That code would need to be packaged and installed separately into the minion's environment. — By [James Howe](https://github.com/OrangeDog)

What exactly would be a use case for the ``salt://_utils/`` usage?


## Resolved questions
[resolved]: #resolved-questions

### Is this loader accessible anywhere else? e.g. via utils in templates? That significantly increases the surface of the breaking change. But that's not necessarily a bad thing.
### Is this loader accessible anywhere else? e.g. via utils in templates? That significantly increases the surface of the breaking change. But that's not necessarily a bad thing. — By [James Howe](https://github.com/OrangeDog)

Thank god no.

### Breaking change on public API for custom modules (and extension modules?)
### Breaking change on public API for custom modules (and extension modules?) — By [James Howe](https://github.com/OrangeDog)

During the deprecation period, ``__utils__`` shall be available, although a warning about it's
deprecation will be emmitted.

### But still, is there a cleaner way to solve the original use-case (distribute common code, or override built-in utils)? — By [Max Arnold](https://github.com/max-arnold)

Distributing common code has been solved in the python ecosystem, ie, packaged libraries.
As for overriding built-in utils, this is one of the major reasons why the utils loader should have never been implemented.
One thing is being able to override an execution module, or state module, the exposure surface is small, as for
utility code, the surface is unmeasureable because that code is meant to be reused accross the whole code base. One small change
in the overridden code has the potential of bringing a Salt installation to the ground.

### I guess shipping custom utils through salt extensions will be also broken? — By [Max Arnold](https://github.com/max-arnold)

A salt extension is a packaged python library that Salt can use, so, why not just import the utilitary code and just use it.

### I do not mind to remove ``__utils__``, but what mechanism should I use now? — By [Alberto Planas](https://github.com/aplanas)

Packaged libraries distributed to the minions. Salt can even distribute and install these on the minions.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see some inconveniences with this approach:

  1. __utils__ is a kind of box for side functions, that do not have too much consistency to be packed via RPM or a wheel. Is supporting code, usually very ad-hoc, that we do no want to export with a public API (as in a salt module)
  2. This makes the high state a two stage process if __utils__ is used in the Jinja template (first a installation of the dependencies, and later apply the highstate)

It is also asymmetric: the utils in the core can be used as support and will be synchronized for use, but the user cannot extend them.

IMHO packaging them and expecting to be there before the highstate is not equivalent.

If the modules and states code can deliver submodules, then we will have an equivalent replacement.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I see your point, but I believe it's not a reason with enough weight to remove a problematic implementation.
  2. These can be done on the same high state call. Install dependencies, issue a module reload, carry on.

It is also asymmetric: the utils in the core can be used as support and will be synchronized for use, but the user cannot extend them.

As any library code, sure the user can extend the utilities by piggy-backing on other code/implementation, it's not __utils__ that enables this.

If the modules and states code can deliver submodules, then we will have an equivalent replacement.

Agree, but as I mentioned on one of the replies, it's not the current target of this SEP. It should be proposed and implemented separately, and so far, I don't, personally, see it as a requirement for this SEP.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This makes the high state a two stage process if utils is used in the Jinja template (first a installation of the dependencies, and later apply the highstate)
  1. These can be done on the same high state call. Install dependencies, issue a module reload, carry on.

Uhmm I am not sure that I understand. If the code is used in the Jijna template it will fail very early, in the rendering stage.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhmm I am not sure that I understand. If the code is used in the Jijna template it will fail very early, in the rendering stage.

Wait, Salt allows access to __utils__ in a jinja template?!

🤦‍♂️

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Salt allows access to __utils__ in a jinja template?!

Templates have access to execution modules, and they have access to __utils__.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone asked on Slack how to call an utility function from Jinja, so I went ahead and implemented that as a custom module: https://gist.github.com/max-arnold/48b8cae9b4bb4d6d5f5479922184bc68

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Templates have access to execution modules, and they have access to __utils__.

Thank god there's no direct access to __utils__ from templates.

Someone asked on Slack how to call an utility function from Jinja, so I went ahead and implemented that as a custom module: https://gist.github.com/max-arnold/48b8cae9b4bb4d6d5f5479922184bc68

For this they could register jinja filters, or when using your custom module, have it import the right utils and call it, there's really no need for __utils__.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this they could register jinja filters

Could you please try this? The last time I tried to register a Jinja filter in a custom module, it didn't work. Later I found this thread that confirms it: https://saltstackcommunity.slack.com/archives/C8832QN4U/p1598124878044000


### For example, there are shared code that I do not want to be represented as a module. — By [Alberto Planas](https://github.com/aplanas)

I don't know enough about this particular usage or it's reasonings, so, why not?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is a supporting function, with a non stable API. One example is when we have one code reused multiple times in different modules and we do not want to replicate code (and is not big enough to have entity to be an RPM package)

Another example is when we have some code that makes very complicated simulation, and to abstract it we need to use classes and stuff.

Basically is a bunch of internal code, that is used by the external API when needed as a normal Python code later.

There is no natural way to express this code as a module, unless I create some weird interface with some documentation with a warning of "do not use this". I think that I remember this exact case in some of the salt modules: some function with this kind of warning, that was designed to be used internally by another salt module.

Removing the __utils__ without a functional alternative will create more of those public functions that should not be used by any user.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is a supporting function, with a non stable API. One example is when we have one code reused multiple times in different modules and we do not want to replicate code (and is not big enough to have entity to be an RPM package)

Again, my personal opinion is that this is not enough to invalidate this SEP.

Another example is when we have some code that makes very complicated simulation, and to abstract it we need to use classes and stuff.

Also, my personal opinion is that this is not enough to invalidate this SEP.

Basically is a bunch of internal code, that is used by the external API when needed as a normal Python code later.

So, package it, and ship it, no matter how small it is.

There is no natural way to express this code as a module, unless I create some weird interface with some documentation with a warning of "do not use this". I think that I remember this exact case in some of the salt modules: some function with this kind of warning, that was designed to be used internally by another salt module.

Yes, and those that were identified got renamed. Just putting a _ prefix of the function name marks it as a non loadable function by the Salt loader.
Anyone can still import and use it.
But if it's widely used, why do it in a module, it's a utility function, move it to a utils module, import, and use it.

Removing the __utils__ without a functional alternative will create more of those public functions that should not be used by any user.

Not necessarily.
Our team is small, so, inevitably some thing go in that not all of us agree on.
However, these days, we have CI code which fails the build if a publicly available function does not include a CLI Example.
Of course, someone could add a bogus CLI Example just to pass CI, but that should not pass the review process before getting merged.


### Will any Python submodule inside a salt module be recollected and send to the minion? — By [Alberto Planas](https://github.com/aplanas)

I consider this a feature request.
Although it could relate to this SEP, its work/implementation should be separate.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. But at least this RFC should order both to guarantee that there is an alternative before the removal.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. But at least this RFC should order both to guarantee that there is an alternative before the removal.

To me, there is, package your utils code.

I keep mentioning that this is my personal view on the issue, a harsh one because I think __utils__ should have never been implemented.

But do note that, I'm not enough to approve this SEP, and a few of my teammates might not even share this harsh perspective on this subject.

I do value your feedback, and even though it hasn't moved my position, it could move someone else's position on the core team.
This SEP can still be rejected, or forced to change to avoid being rejected.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhmm so many bold text makes me wonder if there is a miscommunication here.

I am not opposed to the idea. IMHO all __XXXX__ is an undesirable hack, I do not see __utils__ special here.

I think utils should have never been implemented.

Even tho I do not understand the specific issue about __utils__, I trust your opinion here.

I make an effort to go to some of the issues linked in the document, to try to figure out what is so bad specifically for __utils__, but I only saw issues in the modules and in how the generic loader works (that is also what I see problematic). Maybe clarifying the specifics about __utils__ in the RFC can help to gain fast agreement.

Copy link
Author

@s0undt3ch s0undt3ch Jun 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhmm so many bold text makes me wonder if there is a miscommunication here.

Sorry for the bold 😀
I'm just trying to state that this is not, at least yet, the opinion of the majority of core team, just me.

... IMHO all __XXXX__ is an undesirable hack ...

Yes, it is. I also have ideas about all of that, but I might not even propose a SEP for that, it touches TOO much, and any deprecation path or compat layer might just be too complicated to implement.

I make an effort to go to some of the issues linked in the document, to try to figure out what is so bad specifically for utils, but I only saw issues in the modules and in how the generic loader works (that is also what I see problematic). Maybe clarifying the specifics about utils in the RFC can help to gain fast agreement.

The loader is mean to "store"/"hold" functions(as simple as they can be), which provide user facing functionality.
Utilities are not user facing functionality, that why I say __utils__ should have never been implemented. But it did.
On and off we issues related to __utils__, the most recent ones are related by memory consumption. In sum, because the salt loaders load the same module over and over again, some utility modules relying on the warnings or atexit modules, keep adding to some internal lists of those stdlib modules.

This should never happen, and it doesn't in user facing loader modules, which, __utils__ is not.

Copy link

@max-arnold max-arnold Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When comparing alternative ways to achieve the same need, we need to consider the complexity/usability as well.

Use-case 1: Ship reusable code that can be called from execution modules to minions
Constraint 1: The code should not be callable directly from Salt CLI
Constraint 2: Private functions are not exposed by the loader, so it is not possible to place these custom functions into an execution module itself.

Solution 1: _utils folder and sync_all
Solution 2: Package as a Python module. Downsides: lower development speed, need to mess with Python packaging, pollute PyPI with internal packages or maintain your own PyPI/Devpi server, additional states to install it (as opposed to a sync_all)
Solution 3 (idea): untangle the utils module namespace and __utils__. Allow the latter to invoke only user supplied modules shipped in _utils and not Salt core utils.
Solution 4 (idea): implement some convention to cross-call private execution functions without exposing them to users via CLI and docs

Use-case 2: Override built-in utils.
Why 1: patch an issue that is not yet fixed and released
Why 2: can't upgrade Salt yet, but need a feature/fix ASAP

Solution 1: Override an utility module through _utils. Downsides: sometimes doesn't work, because Salt often imports utils instead of using __utils__ everywhere. Solution: fix Salt to always use __utils__
Solution 2: Make a custom Salt package. Involves a lot of work to maintain and some infrastructure to host
Solution 3: Dramatically speed up bug fixes and release process. This was the original goal of the development process overhaul several years ago (I think a Salt core dev even hinted that it would be possible to achieve monthly/weekly releases with some automation). Apparently, it is no longer a goal (see the LTS SEP)
Solution 4: Significantly simplify the Salt upgrade process. Maybe via Tiamat packages? Ship some built-in modules that can reliably automate that. Significantly decrease number of regressions in Salt releases and improve compatibility, so upgrading is safe and painless

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against the delivery of _utils if we still drop __utils__

If this is technically OK, I think that this can be a nice compromise. We drop some use cases for __utils__, but there is still a clear way to deliver user Python code that is required for modules and states.

I'm not against _utils because it's not adding burden to Salt and it's maintenance, __utils__ is.
We'd need to clearly document how to use modules delivered from _utils though and perhaps provide use cases.

I'd need to check how this would play out, but, I could even consider leaving __utils__ for modules shipped through _utils, but preferably, we would just get rid of all __utils__ usage to avoid confusion.

Copy link

@OrangeDog OrangeDog Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pollute PyPI with internal packages or maintain your own PyPI/Devpi server

Or just use a different installation source. pip can install from git, an archive file (including .whl), a source folder, etc,.

Solution 3 (idea): untangle the utils module namespace and __utils__. Allow the latter to invoke only user supplied modules shipped in _utils and not Salt core utils.

That sounds like a good idea.

how to use modules delivered from _utils

I'm pretty sure the only way it will work is via a loader, which for consistency should be named __utils___. It shouldn't have behaviour that's different from the other synced loaders. Otherwise the reload wouldn't work, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pollute PyPI with internal packages or maintain your own PyPI/Devpi server

Or just use a different installation source. pip can install from git, an archive file (including .whl), a source folder, etc,.

Right.

Solution 3 (idea): untangle the utils module namespace and __utils__. Allow the latter to invoke only user supplied modules shipped in _utils and not Salt core utils.

That sounds like a good idea.

No, still does not sound like a good idea, the memory issues from the stdlib won't just go away and we shouldn't constraint how we write utility functions, we already do that for all of the other loaders...
What would be the alternative, utility functions for the utils loader?!

how to use modules delivered from _utils

I'm pretty sure the only way it will work is via a loader, which for consistency should be named __utils___. It shouldn't have behaviour that's different from the other synced loaders. Otherwise the reload wouldn't work, right?

Hmm, true. I could probably compromise on keeping __utils__ which only has access to whatever is provided from _utils, though, again, the Salt project will likely still have issues filed because of memory consumption, although, no longer directly related to Salt itself.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just use a different installation source. pip can install from git, an archive file (including .whl), a source folder, etc,.

These options are not without downsides. Git installs are slow. Keeping a *.whl binary in a git repo is not a good practice (IMO of course). It is an artifact that is derived from the repo and should be stored elsewhere (like Github releases or Artifactory or S3). And binaries blow up Git history.

Nothing beats the simplicity of keeping custom modules in a Salt state tree as plain *.py files

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping a *.whl binary in a git repo is not a good practice

Which is why I didn't suggest it. It can be anywhere you like - local file, salt fileserver, a web server.

Perhaps that work could, potentially, be a requirement for this SEP, or not.

# Drawbacks
[drawbacks]: #drawbacks
Expand Down