Formula dependency management #12179

Closed
chrismoos opened this issue Apr 22, 2014 · 54 comments
Labels
Feature new functionality including changes to functionality and code refactors, etc.

@chrismoos
Contributor

I think there should be a standard way to manage formula dependencies. It is very natural for a state to be composed of other states, but when using third-party formulas there is no easy way to manage the dependencies, versions, etc.

A popular tool for Chef is Berkshelf. You put a file in your cookbook root (or state tree root in Salt's case) like this:

source "https://api.berkshelf.com"

metadata

cookbook "mysql"
cookbook "nginx", "~> 2.6"

There is a CLI command so you can fetch and install all of the dependencies, for example.

I propose we do the same thing for Salt's formulas:

Saltfile - This will contain the dependencies, sources, etc.
Saltfile.lock - This will contain all of the active/installed dependencies and their versions.

There will be a tool that will fetch and install the dependencies, just like the berkshelf command.

I think that having this feature is really important, especially as your state tree gets more advanced and you start pulling in third-party formulas.

Example Saltfile:

- sources:
    - https://formula.mycompany.com
    - https://formula.saltstack.org
- dependencies:
    nginx:
    redis:
        version: '>= 1.0.5'
    ntp:
        git: 'https://github.com/saltstack-formulas/ntp-formula.git'
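
A minimal sketch of how such a tool might turn the Saltfile above into lock entries. Everything here is an assumption made for illustration: the dependencies are shown as the Python structure a YAML parser would produce, `parse_version`, `satisfies`, and `resolve` are hypothetical names (not part of Salt), the `available` index contents are made up, and only `>=` and `==` specifiers are handled.

```python
# Hypothetical sketch: resolve Saltfile-style dependencies against a known index.

def parse_version(text):
    """Turn '1.0.5' into (1, 0, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, spec):
    """Check a version string against a spec such as '>= 1.0.5' or '== 2.0.0'."""
    if spec is None:
        return True  # no constraint given
    op, _, wanted = spec.partition(" ")
    if op == ">=":
        return parse_version(version) >= parse_version(wanted)
    if op == "==":
        return parse_version(version) == parse_version(wanted)
    raise ValueError(f"unsupported spec: {spec!r}")


def resolve(dependencies, available):
    """Pick the newest available version of each dependency that meets its spec."""
    lock = {}
    for name, options in dependencies.items():
        options = options or {}
        if "git" in options:
            # Git sources are recorded as-is; pinning a commit is left out of this sketch.
            lock[name] = {"git": options["git"]}
            continue
        spec = options.get("version")
        candidates = [v for v in available.get(name, []) if satisfies(v, spec)]
        if not candidates:
            raise LookupError(f"no version of {name} satisfies {spec!r}")
        lock[name] = {"version": max(candidates, key=parse_version)}
    return lock


# The "dependencies" mapping from the example Saltfile.
saltfile_dependencies = {
    "nginx": None,
    "redis": {"version": ">= 1.0.5"},
    "ntp": {"git": "https://github.com/saltstack-formulas/ntp-formula.git"},
}
# Made-up index contents, for illustration only.
available = {"nginx": ["2.5.0", "2.6.1"], "redis": ["1.0.4", "1.0.5", "1.2.0"]}

lock = resolve(saltfile_dependencies, available)
```

The resulting `lock` dictionary is what a Saltfile.lock could record: one concrete version (or git source) per dependency.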
@pidah

pidah commented Apr 22, 2014

+1

@westurner
Contributor

Questions:

  • Should the metadata be stored in a separate JSON/YAML file which does not require code execution?
  • Should the metadata be stored within templated sls files?
  • How to specify/handle GitFS/HgFS branch <-> environment mappings?
  • How easy should it be to diff between forks and communicate changes? [EDIT]
  • See also: ENH: GPG signatures, branch-environment map (GitFS/HgFS workflow) #12183 "ENH: GPG signatures, branch-environment map (GitFS/HgFS workflow)"

Python packaging tools handle dependency graphs:

Conda packages solve for this with many languages:

@ahambrick

👍 Would be a great help in implementing a Continuous Delivery Process.

@avimar
Contributor

avimar commented Jun 24, 2014

Just to mention: nodejs has a rather awesome npm system for uploading, managing, and using dependencies (from any source!). It even allows them to have sub-dependencies at their own specified versions, so they won't conflict with other things using a different version of the same dependency.

@elmariofredo

+1 for a simple sub-dependency chain. Something like the npmjs.org registry for formulas would also be nice.

@westurner
Contributor

Formula Dependencies

In lieu of a standard way to manage this (e.g. setup.py + pip with $VIRTUAL_ENV/src[/salt-formulas] on sys.path and/or GitFS and/or salt file_roots),
an informal README.rst heading for "Formula Dependencies" may be helpful.

e.g. https://github.com/bechtoldt/iscdhcp-formula/blob/master/README.rst#formula-dependencies :

Formula Dependencies
====================

None

Namespacing

It may be easier to prefix/postfix things with <github-username>. e.g.:

https://github.com/salt-formula/salt-formula
salt-formula-salt-formula

https://github.com/westurner/salt-formula
westurner-salt-formula

@westurner
Contributor

Python Packages

Packaging salt formulas as Python packages with setup_requires/requirements.txt dependencies:

Tools

Caveats

  • Specifying dependencies with near-complete URIs / URLs (e.g. Golang) would be great.
  • It's possible to index JSON[-LD] metadata without indexing code
  • Conda is also platform-portable/specifiable: http://conda.pydata.org/docs/spec.html

[EDIT]

@westurner
Contributor

@edword

edword commented Oct 14, 2014

+1 for some sort of berkshelf or npm like dep management

@arnisoph
Contributor

+1

@skylerberg
Contributor

I like @westurner's plan of using Python packages. However, we also need to be able to include the installed formulas in salt easily.

I think these could be solved with a pypifs, which would be like gitfs, but for Python packages. So instead of specifying git repos, you would have a list with entries like

  - westurner-salt-formula
  - salt-formula-apache-formula

This would handle finding the packages on your system, and would also find and include all of the dependencies based on requirements.txt.

Thus by editing your salt master's config and restarting the salt master, you could have all of your formulas and not have to worry about dependencies at all.

Finally, you should be able to specify a version just like you would when using pip manually

  - salt-formula-apache-formula==1.0.4
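
A small sketch of how a pypifs-style backend might read these entries. `parse_entry` is a hypothetical helper, no pypifs exists in Salt, and only the `==` pin shown above is handled; other pip specifiers are out of scope here.

```python
# Hypothetical sketch: split pypifs-style entries into a package name and an optional pin.

def parse_entry(entry):
    """'salt-formula-apache-formula==1.0.4' -> ('salt-formula-apache-formula', '1.0.4')."""
    name, sep, version = entry.partition("==")
    return name.strip(), (version.strip() if sep else None)


entries = [
    "westurner-salt-formula",
    "salt-formula-apache-formula==1.0.4",
]
parsed = [parse_entry(entry) for entry in entries]
```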

@westurner
Contributor

@westurner
Contributor

@iggy
Contributor

iggy commented Nov 10, 2014

What about something simple like using git submodules? Maybe Salt could even have some magic that adds the top-level subdirs of the submodule to the top-level path structure.

i.e.

graphite-formula---+----graphite----init.sls
                   |
                   +--- nginx-formula (submodule) --- nginx --- init.sls
                   |

and the graphite and nginx dirs get added to the top level salt dir (somehow, haven't really thought too much about that yet).
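
One way to picture the "somehow": a rough sketch, assuming nested submodules follow the `*-formula` naming convention, that lists the directories which would have to appear in Salt's `file_roots` so that both `graphite` and `nginx` resolve at the top level. `formula_roots` is a hypothetical helper, not an existing Salt function.

```python
from pathlib import Path


def formula_roots(formula_dir):
    """Return the formula directory plus every '*-formula' subdirectory inside it."""
    root = Path(formula_dir)
    nested = sorted(
        path for path in root.iterdir()
        if path.is_dir() and path.name.endswith("-formula")
    )
    # Listing each of these as a file root exposes its child dirs (graphite, nginx) at the top level.
    return [root] + nested
```

For the tree above, `formula_roots("graphite-formula")` would return the formula directory followed by its `nginx-formula` submodule directory.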

@skylerberg
Contributor

I think git submodules have several drawbacks compared to packaging.

Git commits do not hold the same semantic meaning that package releases do. For example, if you update a package with a bugfix, you would have to go into every package that depends on it and update its submodule reference. With versions you do not need to pin such a specific revision; only the major version must match (unless you need features introduced in a minor version).

Shared dependencies would be duplicated.

I think packages could be handled more elegantly: no .gitmodules to include, no initializing every time you clone, etc.
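
The versioning point above can be sketched as a compatibility check. This assumes plain dotted numeric versions with semantic-versioning meaning; `is_compatible` is a hypothetical name used only for illustration.

```python
def is_compatible(installed, required):
    """True if installed shares the required major version and is not older than required."""
    inst = tuple(int(part) for part in installed.split("."))
    req = tuple(int(part) for part in required.split("."))
    return inst[0] == req[0] and inst >= req


# A bugfix or minor release still satisfies dependents; a new major release does not.
bugfix_ok = is_compatible("1.0.5", "1.0.4")
minor_ok = is_compatible("1.2.0", "1.0.4")
major_ok = is_compatible("2.0.0", "1.0.4")
```

With a check like this, a bugfix release does not require touching every dependent formula, which is the contrast with pinned submodule commits.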

@iggy
Contributor

iggy commented Nov 10, 2014

.gitmodules is worse than a dependencies.txt/SaltFile/whatever.yaml/etc somehow?

And there's nothing that says you can't have a script/shell alias/whatever that does the checkout -> submodule init (in place of pip/npm/etc).

And as far as having to change .gitmodules when you commit fixes, git supports branches for submodules. So maybe each formula has a branch for each upstream release (or just master if it's a fairly generic formula).


I honestly think this is a problem that doesn't need to be solved right now.

The -formulas have enough other problems that the landscape could be completely different by the time we get around to needing real dependency management.

I'm not saying having this discussion is pointless, but I don't think implementing something right now is prudent. And I think too much discussion on the topic takes something away from the real problems that the formulas have.

There is a real problem of developer bandwidth right now. Trying to shoehorn formula dependencies in right now when nobody really knows what formulas will eventually look like is a Bad Idea™

@skylerberg
Contributor

I agree that inside the formula, .gitmodules is equivalent to requirements.txt. However, I would like to see a solution where formula users do not need to have a .gitmodules file and configure gitfs to point to the submodules. Just change the salt config, rather than change the salt config and have other files hanging around.

Of course, having the packages and a gitfs like way to include them is a rather large change and as you said, there are more important problems in formulas at the moment.

When we do get to solving this problem, I just want to make sure that we do it right (whatever right ends up being).

@westurner
Contributor

Is it possible to pull a specific version with .gitmodules, or just a branch?

How do I avoid push -f'ing over a whole tree?

@westurner
Contributor

@jeffrey4l
Contributor

I'd like to use Python packages to manage the formulas, just like python-xstatic[1] does.

E.g. there would be a package named nginx-salt-formula that can be installed through pip or easy_install.

There are several benefits to this.

  1. Version and dependency management are easy. Just change the setup.py/requirements.txt file in the formula, and pip can resolve the dependencies.
  2. A formula may depend on some Python packages in some cases (for example, nginx-salt-formula may have its own _state or _module that requires some Python packages). pip can handle this.
  3. Installation is easy. Adding the package's name to the salt master config should be enough.

[1] https://pypi.python.org/pypi/XStatic

@UtahDave
Contributor

+1 for using python packages.

@iggy
Contributor

iggy commented Nov 12, 2014

Currently we have a hard enough time getting people to contribute their changes back. It's also difficult getting things merged for formulas that the couple people that can commit don't understand.

I'm worried that something along the lines of full pypi packages would make that even worse.

Not to mention the fact that formulas aren't even python code...

If you require strict formula ownership, I see the number of formulas plummeting.

Again, this is as things stand now. I think things will likely be different at some future time.

@whiteinge
Contributor

Very interesting discussion so far. Quick note about one remark:

the couple people that can commit

Everyone on the Contributors team should have full commit access on all formulas repos. I know a few of those have slipped through the cracks. If you notice one let me know and I'll add it under the team. On a related note, I have plans to toss a web interface up (soon as work-load permits) that will allow people on the Contributors team to create repos and fork repos into the org.

@iggy
Contributor

iggy commented Nov 13, 2014

I more meant that people are reticent to commit changes to formulas they don't use (unless they seem like obvious changes). Making it more difficult for people to contribute at this point in time doesn't seem prudent.

FWIW, I've personally had very good response with getting my PRs committed.

@chrismoos
Contributor Author

I really believe that having the formulas reside in a Git repository somewhere is going to be the best. I agree with @iggy that doing pypi packages just raises the barrier to contributing. There is plenty of evidence that the model of forking a git repo to contribute has been very successful. It encourages people to make changes and to push them back upstream.

What's really needed is just a way to manage locating and fetching the formulas that you depend on. I don't think we have to say that We must use Git!, but instead be flexible with where formula dependencies can reside. Look at projects like CocoaPods, Bundler, and Berkshelf. They have some things in common like:

  • Dependencies are not limited to a specific source (i.e. you can use git, local file system, etc.)
  • Dependencies are set forth in a file in the top level directory
  • The tool resolves dependency versioning between components

In addition, all of the aforementioned tools have been wildly successful at what they do and have really provided an easy way for people to collaborate and contribute.

CocoaPods has a central repo, kind of like Homebrew, which holds the canonical list of all packages and the metadata for each version. This gives you the ability to specify a dependency with just a simple name (and an optional version specifier). The central repository is a good approach but obviously requires maintenance and people to manage pull requests from those wanting to add their packages to the official list.

Bundler and Berkshelf also have central listings of packages, albeit a bit different than CocoaPods.

I propose the following high level idea:

  • Formulas have a file in the top level directory containing package metadata, the package metadata file
    • Dependencies
    • Package information
    • etc.
  • A tool will be developed to facilitate the following (at a minimum)
    • Resolve and install dependencies
    • Generate a sample package metadata file
  • There will be a Git repository created to house all of the available formulas
    • Pull requests will be sent to bring in new formulas or update existing ones
    • Each formula will be a directory in the repository that contains a version folder, and inside of that folder is the package metadata file.
  • Dependencies can be listed as:
    • A name and version, this will resort to using the official Git repository to locate the package
    • A local file path where the formula is located
    • A Git repository URL and branch/commit specifiers (also, consider supporting hg as well)
    • At a later time, maybe support https + the SHA256 of the formula's tarball

Obviously there is a lot to spec out, but my 2 cents is that the above is the way to go, not pypi.

@TheCatPlusPlus

setuptools along with pip/peep already allows for everything listed above, Salt really doesn't have to reinvent the wheel (heh). And you don't necessarily have to make people upload anything to PyPI: just make a custom index that generates package entries from the currently existing GitHub organisation.

@westurner
Contributor

For merging each salt-formula into one major (git) repository (as I think @chrismoos is describing):

I don't know how to do this with hg; though I'm sure there's a way. The immutability of hg has always been a selling point for me.

@jeffrey4l
Contributor

There is another big issue in the salt-formula repositories: there is little version management in the current formulas. That makes them useless and dangerous for production environments, because a formula may change and cause issues if there is only a master branch.

I think this is a big issue which will block the re-use of formulas.

@samos123

samos123 commented Dec 9, 2014

+1 for using Python packages

@arnisoph
Contributor

I'm thinking about extending https://github.com/bechtoldt/vcs-gather with dependency resolution support for SaltStack formulas and Puppet modules. The metadata.json file from Puppet (https://docs.puppetlabs.com/puppet/latest/reference/modules_publishing.html#write-a-metadatajson-file) could be acceptable for it.

@westurner
Contributor

@bechtoldt https://github.com/westurner/pyrpo (pyrpo -s . -r sh) and/or pypi:vcs and/or https://github.com/conda/conda/tree/master/conda (http://conda.pydata.org/docs/#requirements (pycosat) may be useful).

Conda packages have a meta.yaml file. https://github.com/conda/conda-recipes/blob/master/requests/meta.yaml

Python packages have a pydist.json (PEP 426)

@DanyC97

DanyC97 commented Jun 29, 2015

Very useful info. Is any traction being put on this for the next Salt release?
Asking as I'm at the point where I want to move from states (where I have a parent-child/inheritance relationship) to formula-based, but seeing this topic I'm worried I'll bump into a bigger problem.

@westurner
Contributor

Salt Formulas work great without automated dependency resolution (formula
dependency management).

Here's one way to do Salt Formulas in separate repos + GitFS:

https://github.com/saltstack-formulas/salt-formula/blob/master/salt/formulas.sls

@arnisoph
Contributor

I'm going to implement arnisoph/GatherGit#3 in a few weeks which will address the ideas of this issue. If you have any further comments, let me know.

@westurner
Contributor

That would be cool.

For test cases, you might have a look at some of the:

And a start at a test framework for salt formulas:

@arnisoph
Contributor

@westurner salt formula testing is a completely different topic, I'll cover that in arnisoph/formula-docs#4 :)

@westurner
Contributor

@bechtoldt Some tests are probably appropriate? (e.g. 'compiles' w/o syntax error, [...])

Should this/these metadata/test skeletons be standard functionality of e.g. salt.formulas or copied into every formula?

An example metadata file in https://github.com/westurner/cookiecutter-saltformula could be helpful.

@arnisoph
Contributor

ouh, you mean testing the metadata itself? of course, this will be important.

@westurner
Contributor

where/how do I call e.g. check_formula_metadata('./path'),
check_formula_'importable'('name')?


@arnisoph
Contributor

👍

@arnisoph
Contributor

The Salt Package Manager might be a solution for this issue in the future. I think it's still in a very early state. I'll file some feature requests.. :)

#24896 (PR)
#25210
#25211

https://docs.saltstack.com/en/develop/topics/spm/

@rallytime
Contributor

Good call @bechtoldt. Any additional thoughts about this ^^ @techhat?

@j1n6

j1n6 commented May 23, 2016

Here's my thought. The reason we need dependency management is to be able to:

  1. Easily reproduce the formulas used in a code base
  2. Explicitly understand each formula's original reference location
  3. Compare or track down changes to formulas
  4. Collaborate on both formula module development and large, complex formula deployments

There are many ways to implement this. The simplest is to use git to begin with, like the Golang community's Godep. This has great advantages:

  • Everything is versioned and hosted (privately or publicly), so there is no need to worry about storage location
  • Dependency management aims to produce a working package (a combination of formulas); using Godep's approach would help us get there first (quick win)
  • Introducing a "Saltfile"-like specification would be a second challenge (medium and long term)

@j1n6

j1n6 commented May 23, 2016

btw, spm looks great.

@themalkolm
Contributor

Are there any plans to have dependencies in spm?

@aphor
Contributor

aphor commented Apr 21, 2017

This ticket has gone quiet, but it's still open so I'm unclear whether there's lack of consensus, or whether there's consensus that SPM has made the problem go away.

Can anyone comment on successes/failures using SPM to manage formula interdependencies?

@techhat
Contributor

techhat commented Apr 21, 2017

SPM does have dependency management. An example formula might look like:

name: apache
os: RedHat, Debian, Ubuntu, Suse, FreeBSD
os_family: RedHat, Debian, Suse, FreeBSD
version: 201704
release: 1
summary: Formula for installing Apache
description: Formula for installing Apache web server
dependencies: zlib,pcre
optional: mod_perl
recommended: mod_ssl

If used, the dependencies field contains a list of SPM packages that must be installed before this one is, and SPM will attempt to install them at the same time. optional and recommended are currently informational only; no enforcement currently exists.
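
To illustrate the ordering this implies (dependencies installed before the package that needs them), here is a rough sketch that reads a trimmed copy of the example above and derives an install order. This is not SPM's actual implementation: `parse_formula` and `dependency_names` are hypothetical helpers, the line parser only handles the flat `key: value` form shown, and the empty dependency lists for zlib and pcre are assumed.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parse_formula(text):
    """Parse flat 'key: value' lines into a dict (a stand-in for a real YAML parser)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def dependency_names(fields):
    """Split the comma-separated dependencies field into a list of names."""
    raw = fields.get("dependencies", "")
    return [name.strip() for name in raw.split(",") if name.strip()]


apache = parse_formula("""
name: apache
version: 201704
release: 1
dependencies: zlib,pcre
""")

# Each key maps to the packages that must be installed before it.
graph = {
    apache["name"]: dependency_names(apache),
    "zlib": [],  # assumed to have no dependencies of their own
    "pcre": [],
}
install_order = list(TopologicalSorter(graph).static_order())
```

In `install_order`, zlib and pcre come before apache; a cycle in the graph would raise `graphlib.CycleError` instead.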

SPM has had this since last year, so I'm going to go ahead and close this. Of course, comments are still welcome, but if there are issues with dependencies, I would rather they go in a fresh ticket.

@techhat techhat closed this as completed Apr 21, 2017
@OrangeDog
Contributor

Surely the formula should simply include states to manage its own dependencies?

@westurner
Contributor

Conflicting ad-hoc approaches to dependency management are suboptimal: if X and Y depend upon Z being installed and configured and manage that dependency themselves, the order of operations determines which e.g. configuration file "wins" (remains on disk after the additive combinations of transformations are applied)

@westurner
Contributor

Generally, it's not a safe assumption that - to reference the aforementioned example - both X and Y have utilized idempotent strategies for configuration management, so running both X and Y in any order an arbitrary number of times does not result in the desired outcome. For example, if X replaces httpd.conf and Y simply tries to append a configuration section, the end result is not consistent unless: (1) the execution order is consistent; and (2) Y is only run one time.
https://en.wikipedia.org/wiki/Idempotence

More abstract dependency management could avoid applying multiple conflicting formulas (if it were possible to express that X and Y depend upon an abstract requirement Z, which e.g. Z-org-formula provides)
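
The httpd.conf example can be made concrete with a toy model in which each formula is a function from the current file contents to the new contents. The function names and config lines are invented for illustration; real Salt states are not written this way.

```python
def formula_x(config):
    """Replaces the whole file, so repeating it changes nothing."""
    return "Listen 80\n"


def formula_y(config):
    """Appends a section unconditionally, so every run adds another copy."""
    return config + "Include extra.conf\n"


def apply_all(formulas, config=""):
    """Apply formulas in order and return the resulting file contents."""
    for formula in formulas:
        config = formula(config)
    return config


x_then_y = apply_all([formula_x, formula_y])
y_then_x = apply_all([formula_y, formula_x])             # Y's section is lost
y_twice = apply_all([formula_x, formula_y, formula_y])   # Y's section is duplicated
```

Only the first ordering, with Y run exactly once, produces the intended file, which matches conditions (1) and (2) above.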

@aphor
Contributor

aphor commented Jul 18, 2020

Abstract dependencies are kicking the can down the road, not problem solving.

People will do it wrong, and the problems that result from well intentioned misunderstandings will be worse than the problems of fixing broken concrete dependencies.

https://twitter.com/Spearhafoc_/status/1028239907628740608?s=20

@westurner
Contributor

westurner commented Jul 18, 2020 via email
