Add CEP for MatchSpec minilanguage #82
I keep referring to the "MatchSpec" interface in other CEPs, yet it is not standardized, so here we go. Let's open that can of worms.
This will probably need another CEP on
- `license`
- `license_family`
`license` and `license_family` could be used to search for packages with a specific license, I guess, say with `conda search '*[license="Apache-2.0"]'`?
Yes, true, I hadn't considered `search` here, only `install`-oriented operations. I should rephrase this part a bit to cover this aspect.
The `MatchSpec` mini language has gone through several iterations.

The simplest form merely consists of up to three positional arguments: `name [version [build]]`. Only `name` is required. `version` can be any version specifier. `build` can be any string matcher. See "Match conventions" below.
Suggested change:
The simplest form merely consists of up to three positional arguments: `name [version [build]]`. Only `name` is required. `version` can be any [version specifier](#version-specifier). `build` can be any [string matcher](#string-matching). See [Match conventions](#match-conventions) below.
Also, should we define what characters are accepted in a package name?
I feel this is going to be part of a different CEP, `PackageRecord`.
### Exact matches

To fully specify a package record with an exact spec, these fields must be given as exact values: `channel` (preferably by URL), `subdir`, `name`, `version`, `build`. Alternatively, an exact spec can also be given by `*[md5=12345678901234567890123456789012]` or `*[sha256=f453db4ffe2271ec492a2913af4e61d4a6c118201f07de757df0eff769b65d2e]`.
When matching by checksum, should you also add the subdir? If I'm not mistaken, it's possible for two subdirs to contain a package with the same checksum right? Or is this a corner case?
These checksums come from the compressed artifacts, so in principle they should be unique (even with identical contents, the `index.json` file should have `"subdir": <subdir>`, I think?).
The hash that conda-build uses for the `build_string` doesn't consider the subdir, indeed (and maybe it should).
Just FYI, rattler does not currently support this. There we require that at least the package name is still specified.
Thanks for the great write up @jaimergp !
The positional syntax also allows the `=` character as a separator, instead of a space. When this is the case, versions are interpreted differently. `pkg=1.8` will be taken as `1.8.*` (fuzzy), but `pkg 1.8` will give `1.8` (exact). To have fuzzy matches with the space syntax, you need to use `pkg =1.8`. This nuance does not apply if a `build` string is present; both `foo==1.0=*` and `foo=1.0=*` are equivalent (they both understand the version as `1.0`, exact).
I know this is just reporting the current state of affairs, but yuck.
In rattler, this form is no longer allowed when parsing in strict mode (it is still accepted in lenient parsing mode).
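To make the `=` versus space nuance discussed above concrete, here is a minimal Python sketch. It is not conda's or rattler's parser: `parse_positional` is a hypothetical helper that only handles plain `name [version [build]]` specs (no channels, brackets, or comparison operators other than `=` and `==`), and it returns the version in the normalized form the text describes.

```python
import re


def parse_positional(spec: str):
    """Return (name, version, build) for a simple positional spec.

    Hypothetical helper for illustration only; real parsers handle
    many more forms (channels, brackets, operators, ...).
    """
    match = re.match(r"^([^\s=]+)\s*(==|=)?\s*(.*)$", spec.strip())
    if match is None:
        raise ValueError(f"not a positional spec: {spec!r}")
    name, operator, rest = match.groups()
    if not rest:
        return name, None, None

    # The remainder is `version` optionally followed by `build`,
    # separated by whitespace or `=`.
    parts = re.split(r"[\s=]+", rest, maxsplit=1)
    version = parts[0]
    build = parts[1] if len(parts) > 1 else None

    # A single leading `=` means "fuzzy", but only when no build is given.
    if build is None and operator == "=" and not version.endswith("*"):
        version += ".*"
    return name, version, build


print(parse_positional("pkg=1.8"))     # fuzzy
print(parse_positional("pkg 1.8"))     # exact
print(parse_positional("foo=1.0=*"))   # exact, because a build is present
```

Under these assumptions, `pkg=1.8` and `pkg =1.8` both normalize to `1.8.*`, while `pkg 1.8`, `foo=1.0=*` and `foo==1.0=*` keep the version as given.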
following conventions:

- If the string begins with `^` and ends with `$`, it is converted to a regex.
- If the string contains an asterisk (`*`), it is transformed from a glob to a regex.
Maybe I misunderstood what it means to transform a glob to a regex, but `*cuda` is a valid build string glob, right?
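For reference, the two conventions quoted above can be sketched in Python as follows. This is an illustration, not conda's implementation: `compile_matcher` is a hypothetical name, the exact-equality fallback for strings with neither `^...$` nor `*` is an assumption, and escaping the literal parts of the glob is a choice made for this sketch.

```python
import re


def compile_matcher(pattern: str):
    """Return a predicate implementing the string-matching conventions.

    Hypothetical helper; the exact-match fallback is an assumption.
    """
    if pattern.startswith("^") and pattern.endswith("$"):
        # Already a regex: use it as given.
        regex = re.compile(pattern)
        return lambda value: regex.match(value) is not None
    if "*" in pattern:
        # Glob: each `*` becomes `.*`, everything else is matched literally.
        body = ".*".join(re.escape(part) for part in pattern.split("*"))
        regex = re.compile(body)
        return lambda value: regex.fullmatch(value) is not None
    # Assumed fallback: plain string equality.
    return lambda value: value == pattern


matches_cuda = compile_matcher("*cuda")
print(matches_cuda("py310_cuda"), matches_cuda("cuda11"))
```

In this sketch `*cuda` is a perfectly valid glob: it becomes a regex equivalent to `.*cuda` anchored at both ends, so it matches any build string ending in `cuda`.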
> < 0.960923
> < 1.0
> < 1.1dev1 # special case 'dev'
> < 1.1_ # appended underscore is special case for openssl-like versions
I understand this is not part of this CEP, but the suffix `_` notion is not present in the description above. It is also another can of worms. `1.0-` is also valid. So is `1.0__` and `1.0--`...
- Exact equality and negated equality: `==`, `!=`.
- Fuzzy equality: `=`, `*`. `=1.0` and `1.0.*` are equivalent, and both would match `1.0.0` and `1.0.1`, but not `1.1` or `0.9`.
- Logical operators: `|` means OR, `,` means AND. `1.0|1.2` would match both `1.0` and `1.2`. `>=1.0,<2.0a0` would match everything between `1.0` and the last version before `2.0a0`. `,` (AND) has higher precedence than `|` (OR). `>=1,<2|>3` means `(>=1,<2)|(>3)`; i.e. greater than or equal to `1` AND less than `2` or greater than `3`, which matches `1`, `1.3` and `3.0`, but not `2.2`.
- Semver-like operator: `~=`. `~=0.5.3` is equivalent to `>=0.5.3, <0.6.0a` and this syntax is preferred for backwards compatibility.
This is not entirely correct; it should say that `~=` is equivalent to `>=0.5.3, 0.5.*`. This is an important distinction because both `0.6.0_` and `0.6.0dev` are considered smaller than `0.6.0a`, so they both would still match `>=0.5.3, <0.6.0a`!
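A small Python sketch may help pin down both the operator precedence and the `~=` expansion proposed in this comment. It only manipulates strings and does not compare versions; `split_version_spec` and `expand_compatible` are hypothetical names, parentheses are not handled, and rejecting a single-component `~=` version is a choice made for this sketch.

```python
def expand_compatible(clause: str) -> list:
    """Rewrite `~=X.Y.Z` as `>=X.Y.Z` AND `X.Y.*`; pass other clauses through."""
    if not clause.startswith("~="):
        return [clause]
    version = clause[2:].strip()
    parts = version.split(".")
    if len(parts) < 2:
        # Sketch-level choice: `~=1` has no prefix to keep fixed.
        raise ValueError(f"~= needs at least two components: {clause!r}")
    prefix = ".".join(parts[:-1])
    return [f">={version}", f"{prefix}.*"]


def split_version_spec(spec: str) -> list:
    """Split a spec into OR-groups of AND-clauses (`,` binds tighter than `|`)."""
    groups = []
    for alternative in spec.split("|"):
        clauses = []
        for clause in alternative.split(","):
            clauses.extend(expand_compatible(clause.strip()))
        groups.append(clauses)
    return groups


print(split_version_spec(">=1,<2|>3"))
print(split_version_spec("~=0.5.3"))
```

Splitting on `|` first and on `,` second is what gives `,` the higher precedence: `>=1,<2|>3` becomes two alternatives, `(>=1 AND <2)` and `(>3)`. Expanding `~=0.5.3` to `>=0.5.3` plus `0.5.*` avoids an upper bound like `<0.6.0a`, which is the distinction raised above.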
## Reference

- [`conda.models.match_spec.MatchSpec`](https://github.com/conda/conda/blob/24.5.0/conda/models/match_spec.py)
Just FYI this is the rattler implementation: https://github.com/mamba-org/rattler/blob/main/crates/rattler_conda_types/src/match_spec/mod.rs
6. If `channel` is an exact value and `subdir` is an exact value, `subdir` is appended to
   `channel` with a `/` separator. Otherwise, `subdir` is included in the key-value brackets.
How does this relate to label channels, e.g. `pytorch/label/nightly::libfaiss`?
With the separator logic this will be assumed to be a subdir.
The logic in conda is to take the last component and compare it against known subdirs. As a result, channels cannot be named like subdirs; e.g. I can't register a channel named `linux-64`.
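As a rough illustration of that rule, the following Python sketch splits a channel string by checking its last `/`-separated component against a set of known subdirs. It is not conda's code: `split_channel_subdir` is a hypothetical helper, and `KNOWN_SUBDIRS` is a small illustrative subset rather than the full list.

```python
# Illustrative subset only; the real list of known subdirs is longer.
KNOWN_SUBDIRS = frozenset(
    {"noarch", "linux-64", "linux-aarch64", "osx-64", "osx-arm64", "win-64"}
)


def split_channel_subdir(channel: str):
    """Return (channel, subdir), where subdir is None if absent.

    The last path component is treated as a subdir only when it is
    one of the known subdir names.
    """
    channel = channel.rstrip("/")
    head, sep, tail = channel.rpartition("/")
    if sep and tail in KNOWN_SUBDIRS:
        return head, tail
    return channel, None


print(split_channel_subdir("pytorch/label/nightly"))
print(split_channel_subdir("conda-forge/linux-64"))
```

Under this sketch, `pytorch/label/nightly` stays a channel with no subdir because `nightly` is not a known subdir, while `conda-forge/linux-64` splits into channel and subdir.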
Hi Jaime, thank you for this proposal! Do you think we could come up with an ANTLR4 grammar for If we have an exhaustive set of valid instances of
Closes #80
📝 👓 Markdown preview