-
Notifications
You must be signed in to change notification settings - Fork 239
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: identifiers in a file #511
Conversation
To nitpick, |
76cc801
to
f3e1680
Compare
See #496 (comment) And TOML is a real format, while |
f3e1680
to
ba1d73c
Compare
Right! I also says
but fair enough ;-) Would it then make sense to move
"ugly" and "Python 2.7" in the same sentence? You must be joking! ;-) But yes, the overview is completely gone in these files :-( |
c8e697d
to
828aa8a
Compare
I think we should match them up, yes. If we don't want to add yaml as a dependency, what about using json? |
828aa8a
to
bf2a023
Compare
TOML (basically the same as cfg): [cp39-manylinux_aarch64]
version="3.9"
path_str="/opt/python/cp39-cp39" YAML: - identifier: cp39-manylinux_aarch64
version: 3.9
path_str: /opt/python/cp39-cp39 JSON (technically valid YAML too): {"identifier": "cp39-manylinux_aarch64", "version": "3.9", "path_str": "/opt/python/cp39-cp39"}, Currently CPython 2.7 is the only one with two builds from one build identifier ( We could also combine the files into one file, as OS's don't overlap. |
The JSON looks best to me, indeed, the nice thing being that if you have a list of those, they will immediately show you the versions without scrolling between blocks. Funnily enough, none of them look as good as our Python (This might of course also just be me, being used to the current format in Python code, and not yet to the TOML version.) |
This seems more like data to me, always seems odd to have this sitting inside the .py files, especially having to edit them for updates. And if we add the ability to update these files programmatically, that’s much better as a data file! YAML would allow: - {identifier: cp39-manylinux_aarch64, version: 3.9, path_str: /opt/python/cp39-cp39} not a fan of mixing yaml and json styles but would probably be cleanest. But adds a dependency. And likely hard to auto generate for an update. we could shorten identifier or path_str. |
Me neither, but this looks nice.
Well, with something like this we're pretty close, no? {
"linux": [
...
{"identifier": "cp39-manylinux_x86_64", "version": "3.9", "path_str": "/opt/python/cp39-cp39"},
{"identifier": "cp39-manylinux_i686", "version": "3.9", "path_str": "/opt/python/cp39-cp39"},
...
{"identifier": "cp39-manylinux_aarch64", "version": "3.9", "path_str": "/opt/python/cp39-cp39"},
],
...
} I'm indeed not sure whether adding |
For the record, I did ask about |
Questions for @joerick:
If you want it to be a vote, I ordered the format list above by my preferences, json first. |
The PythonConfiguration objects for each platform have different schemas, so I think they should be stored in different files.
I don't think we should be adding another runtime dependency to parse an internal config file. TOML is currently only needed for the cp35 windows workaround, so it might be dropped again after that. I'd be more in favour of configparser or JSON. Generally speaking I don't much enjoy JSON as a user-facing format, it's very brittle to edit directly. I think we should be using a list of configs, not the dict of 'identifier': {...attributes}. Ordering is important, and as you've found, on linux there are two builds with the same identifier. That rules out configparser, because it doesn't support lists. TOML does with So, it's probably back to JSON. But, for a bit of fun, I tried out https://github.com/Instagram/LibCST to see if there was a pure-python way to do it. And honestly, it's not too crazy. Here's a script that patches the urls in macos.py: import libcst as cst
class UpgradeTransformer(cst.CSTTransformer):
def leave_Call(self, original_node: cst.Call, updated_node: cst.Call) -> cst.BaseExpression:
node = updated_node
if not (isinstance(node.func, cst.Name) and node.func.value == 'PythonConfiguration'):
return node
for arg in node.args:
assert arg.keyword is not None
if arg.keyword.value == 'url':
assert isinstance(arg.value, cst.SimpleString)
url = arg.value.evaluated_value
print(url)
# patch the url here
url += '#patched'
node = node.with_deep_changes(arg.value, value=f"'{url}'")
return node
macos_cst = cst.parse_module(open('cibuildwheel/macos.py').read())
macos_cst = macos_cst.visit(UpgradeTransformer())
print(macos_cst.code) This could be used to update the NamedTuples in-place, or another option could be to use something like the above with a standalone configs file, like I'm probably too close to this to tell if it's a good idea, what do you guys think? |
I'd rather a config format (JSON) over editing Python files. I'm not a fan of Python files for config; Python generally should be code, and data should be in a data format. I would expect to be able to take a Python file and add arbitrary Python code, which such a parser as you've shown would choke on.
That's what I mean with "the current workaround" - I was referring to the fact that it was a dict rather than a list. TOML allows an array inside a section, yes, but the values of that array are not allowed to be dicts, I believe - only simple values. Then you'd have to parse again with some other format on the strings. |
I definitely like the "fun" part, but I'm not entirely sure it's the right match here? Also, if you didn't want to introduce another dependency, Not sure I'm convinced of that myself, but we don't need to per se follow a fancy "everything's possible" kind of standard, ofc. Something like
might perfectly server our purpose for now to populate the existing |
This is called |
I know, but I didn't want to pin us to |
Actually, I stand corrected. It's just a oneliner:
Nevermind, this works:
|
Actually, maybe toml does support inline dicts, I just noticed this: https://python-poetry.org/docs/dependency-specification/#expanded-dependency-specification-syntax |
Seems this also works with pytoml yes:
Just a pity of those |
Yes it does. Just not at top level. |
Try: [linux]
vals = [
{identifier = "cp38-manylinux_x86_64", version = "3.8", path_str = "/opt/python/cp38-cp38"},
{identifier = "cp39-manylinux_x86_64", version = "3.9", path_str = "/opt/python/cp39-cp39"},
] I can check later. |
Once again, you're right. I hadn't tired that! Well, I'd be up for TOML, then, actually, if @joerick doesn't mind the extra dependency. If not, I do kind of like the simplicity of the CSV (or similar) version. |
I like JSON, CSV, and this form of TOML pretty equally, with a minor preference on the JSON still. Currently working on a cookie cutter and the specs for it all all JSON. It's not unusable as long as you don't want to add comments to whatever you are doing. :) I'd probably do something like: [tool.cibw.build-platforms]
linux = [
{...}, That way, you could imagine them being specifiable in a single file if someone wanted to. |
3152675
to
f2d4de2
Compare
CSV wouldn't work, because they are variable length. That leaves JSON and TOML. The TOML I have right now is not bad at all, IMO; I dropped the workaround. Though JSON would also work. |
It will if you have multiple files per OS, as @joerick said he preferred anyway? The But fair enough, all my concerns about readability and getting an overview are gone, indeed, in the current form! So that just leaves the dependency concern. |
This TOML is looking very nice, now :) I think I might be swayed over. I also really like the way it slots into the namedtuple using the If those headers are required, (I know this is me changing my mind), perhaps we should put them all in one file? Then the headers actually mean something. |
Excellent question. With some customization and a few extra spaces, it can be done. Will commit an example tomorrow. |
One benefit of json + one file here, by the way, it would make it easy to write a parser for it in javascript that would allow a user to type into a box and see the results live in the docs (never tried that, but should be possible, I think?). TOML or any other format would be harder to parse in JS (though you could always just read it with toml and stick a simple list JSON into the docs during generation, so really, not a big issue). Going to push my toml version soon. |
I've put an example roundtrip (the linux one and a test for now). What do you think? Should I go with this, in a single file? |
Now that we have this nice formatting, I'm up for a single file. I think it could enhance updates, when now, you need to pass 3 files. Though one file is not a must-have for me, either. |
a59ac64
to
c69640d
Compare
e6fa915
to
42621aa
Compare
Touched up types a bit in my last comment, adding exhaustiveness checking on the PlatStr literal, and worked around the fact that |
13e5dfa
to
b1f1f23
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think we're nearly there with this one! Just a few minor comments on naming etc.
from importlib.resources import files | ||
|
||
|
||
resources_dir = files('cibuildwheel') / 'resources' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, what does importlib.resources.files
do that Path(__file__).parent
doesn't? It can't be .egg compatibility, can it? :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It supports directories that are zip files (see https://github.com/scikit-hep/particle/releases/tag/v0.14.0 - that .pyz
file runs directly without installing, python3 particle.pyz
just works), and namespace packages that are split across directories.
Though, however, I'm using it here because it's more elegant and the "right' thing to do. We are accessing a resource.
PS: There's a few tricks when running from a zip file, you can open the path or access the filename, but cannot use the file name to open the file without a little workaround which I didn't attempt to implement.
Fixes #431, should make #496 easier. I did not provide a way to override the files. I am now using importlib.resources here.
instead of messing with the backport, we can wait until #508 and then just use the stdlib.Edit: the 3.9 version is much nicer and less intrusive, using that.