-
Notifications
You must be signed in to change notification settings - Fork 876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
please consider switching back to semantic versioning #2083
Comments
Should note that this solves problems provided that minor and patch versions do not break backwards compatibility. Might be obvious given the definition of semantic versioning but is in practice not always easy nor guaranteed. |
The reason why calendar versioning is used instead of semantic versioning is because the maintenance costs of semantic versioning far outweigh the benefits for us. It is rare that pymatgen introduces breaking changes (other than the recent v2021.3.4, I can't remember really when was the last time this happened). There are also breaking changes that can be introduced by codes that pymatgen depend on, e.g., the recent numpy v1.20 that dropped support for Python < 3.7 and introduced support for the ArrayLike typing object. I am also unclear what the benefit of semantic versioning is. Other than the options you mentioned, you can also specify a range of pymatgen versions that would work (a strategy that a lot of people use for something like numpy) even with calendar versioning. Also, any code that is compatible with a higher version of pymatgen, e.g., 2021.3.4 would usually work with pymatgen<2021.3.4. @mkhorton and I try hard to make sure this is the case. So even if your project decides to make sure that you are compatible with pymatgen>=2021.3.4, it does not have to make it a requirement. In fact, if you set your requirements to say pymatgen>=2020.12.31 or even 2019.12.31, that usually poses no issues at all though you may lose some access to later analysis. It is much rarer that other codes depend on the latest analysis published within pymatgen. Most codes depend on pymatgen for its core functionality, which has rarely changed beyond some minor speed improvements over the past few years. Now, I would like to take this opportunity to explain why the breaking changes of removing root-level imports was implemented. The reason is in fact to better help other developers. We want to eventually move to making pymatgen and certain subpackages (e.g., analysis and io would be main examples) into namespace packages. This will allow us to distribute pymatgen in smaller modular pieces. Developers can also write add-on packages for pymatgen that can be maintained in separate repositories with their own requirements. I will use the example of the package from my group called pymatgen-diffusion, which provides for diffusion analysis. Arguably, this could have been implemented under The above example is a more trivial one and merely one of optical cleanliness. But there are situations where this would be extremely useful. For instance, if someone wanted to introduce an add-on machine learning package for pymatgen that depends on tensorflow. Tensorflow is a horrific dependency that we would not want to impose on everyone using pymatgen for core functionality and does not care about ML. Namespace packages would allow these functionality to be implemented as add-ons. Another example are of course io packages, e.g., an add-on to support a new quantum mechanics software used by its own small community. Finally, I just want to say that while we take pains to minimize such changes, we also try to make sure that the transition is made as painless as possible. I did provide the two commands that if you run them on your repos, it should resolve most of the breaking imports. # for mac
find . -name '*.py' | xargs sed -i "" 's/from pymatgen import MPRester/from pymatgen.ext.matproj import MPRester/g'
find . -name '*.py' | xargs sed -i "" 's/from pymatgen import/from pymatgen.core import/g'
# for linux
find . -name '*.py' | xargs sed -i 's/from pymatgen import MPRester/from pymatgen.ext.matproj import MPRester/g'
find . -name '*.py' | xargs sed -i 's/from pymatgen import/from pymatgen.core import/g' I tried them on the several codes that I maintain and they work for the most part. There may be a few instances, e.g., if you have something like |
Thanks for the detailed reply @shyuep ! Let me just follow up on some of the points you made:
One key benefit of semantic versioning is the ability to say (as a developer of an upstream tool): I am ok with any pymatgen version X or later, as long as it does not introduce breaking changes. Everybody understands that breaking changes are unavoidable from time to time. If we imagine for the moment that you had been using semantic versioning, then you would have increased the major version number together with this change (say, to 6.0). I certainly agree that keeping an eye on breaking changes and reflecting this in the version number is more work than just taking the day of the release. |
I guess the benefit is that with sematic versioning one specify an sensible upper limit of the version, which is not possible with calendar versioning. In both cases, there is no problem to specify a range for already released versions. Taking |
@ltalirz @zhubonan You can also do the same with calendar versioning? E.g., pymatgen>=2019.12.31,<2021.3.4? Of course, I recognize it is not as clean as saying <6.0.0 in the case of semantic versioning. I wouldn't think that the 493 projects that depend on pymatgen would stop working today. In fact, if you pip-installed say matminer or custodian today from Pypi, they will still install an old version of pymatgen and everything would work as per normal. The main people who would in fact be affected are those that install from the Github source and attempted to install the latest pymatgen. I welcome suggestions on how else I can try to make this more painless. We had a lot of discussions internally on how best to move ahead and trust me, we generally try not to implement breaking changes (except those that upstream codes such as numpy cause). |
That approach does indeed work for "past" versions, but it does not work for future versions, since there is no way of knowing whether they will be compatible.
Right - this is because matminer chose option 1. I mentioned above - with the corresponding downside of clashing with any other tool that would take such an approach (it can work for tools that go into the direction of a "distribution", which are never meant to be installed alongside something else). https://github.com/hackingmaterials/matminer/blob/master/requirements.txt
Trying to avoid breaking changes is certainly extremely valuable (and thanks for putting so much effort into this!). However, when the time comes eventually where a breaking change becomes necessary, in my view the most painless way to communicate this is through semantic versioning. |
@shyuep The problem is that on the day of 2021.1.1 I have no idea that pymatgen version 2021.3.4 will introduce breaking changes - so I cannot put On the other hand, if semantic versioning is used, and on 2021.1.1 pymatgen is at 5.3.2 version. I just need to put |
@zhubonan Thanks for clarifying. Good point. Then may I propose a compromise and feel free to tell me this works.
This will help the developers who actually pin to a semantic version. I doubt that a lot of people do that. But I am willing to do this to help everyone along if this is useful. |
Thanks a lot @shyuep ! I think this would be greatly appreciated by many developers relying on pymatgen.
If this turns out to be too difficult (I think it can take time to take a release off pypi), skipping this step is probably also ok, since most people will pick up the latest version v2021.3.5 anyways. |
Ok, thanks all for the input. I have just released v2021.3.5 and v2022.0.0. These will be live once the release Github Actions workflow are done (~30mins). v2021.x development will be on the "legacy" branch while we will continue the main line of development on v2022.0.x for the remainder of 2021. |
I saw @shyuep just closed this. I don't have any concrete suggestions, but I did want to offer some support and encouragement. There are a significant number of contributions to pymatgen from people outside of the core development team and it would take significant resources on their part to maintain proper SemVer. The definition of "breaking change" is really hard to determine - I recently came across Hyrum's Law from this relevant blog post advocating against SemVer:
If pymatgen strictly followed SemVer under a definition of "breaking" similar to Hyrum's Law for the past few years, the major version number increments would be about the same burden on packages downstream of pymatgen as CalVer. While breaking changes like v2021.3.4 can be surprising to downstream packages, I think calendar versioning on the whole has worked well for the pymatgen community. Thank you, pymatgen team, for your hard work and dedication to this package and community. |
I just wanted to add that we've added a page on compatibility and versioning issues to our documentation: https://pymatgen.org/compatibility.html This page is just a first attempt and will be improved, and we do take the issue of backwards compatibility very seriously. I can't remember the last time we renamed or removed a method in pymatgen.core, for example. I do agree with @bocklund that the main issue with semantic versioning is drawing the line where the main version should be incremented. The example of root-level imports is a clear case where it would have been useful to have semver, but these cases are rare. We're also not alone in using date versioning: many large packages do so also (e.g. In any case, I'm also grateful for the discussion and feedback, and sincerely apologize for causing the extra work in this case (as it will for myself as well). Hopefully it will bring sufficient benefits in the long run to be worthwhile. |
Also just catching up with the above messages too, looks like I should update that doc page 🙃 Hopefully this is a good resolution for everyone though! |
Just wanted to say that I appreciate it a lot that you guys responded so quickly and willing to discuss these things. I agree with what has been said before that deciding what exactly constitutes as breaking is not always easy. There are very clear cut and dry cases, but there is often a grey area. I personally think that calendar based versioning definitely has a place, but is probably best used for end-applications. I think that for libraries semantic versioning still is better suited as it allows consumers upstream to better prepare for eventually inevitable breaking changes. Anyway, thanks a lot for the compromise solution 👍 and all the work you put in |
Respectfully 😬, I would just point out that after you said this, multiple breaking changes were made in patch versions; ironically to how you can actually retrieve the version of pymatgen:
So now people have to use a number of try/excepts just to get the version of pymatgen
It sounds like this decision has already been made, but anyhow I would still advise against using this namespace package structure, in favour of just moving/leaving the packages to different repositories. This reminds of ansible's recent restructuring, that now is split into ansible-core and community repositories (and also adopted semantic versioning): |
@shyuep maybe worth removing the |
@chrisjsewell I'd only add that the commitment of "taking backwards compatibility seriously" is not in conflict with also sometimes making mistakes or mis-steps, so I'd ask for patience while we figure this out. Ultimately, pymatgen is a community-driven non-profit/academic code and we're all trying to do our best here. |
@chrisjsewell Yes, there were a few moves from 2022.0.0-2022.0.3. I accept culpability for it since this is the first time I am doing a namespace package and mistakes were made for the first few releases. That was fixed in 2022.0.3. Since SETTINGS and version are primarily used internally within pymatgen, I don't think a lot of people were affected. As a compromise, I have "yanked" 2022.0.0-2022.0.2, i.e., a form of soft-delete that prevents new users from installing them but allows users who have pinned them to continue to use them with warning messages. Hard deletes are frowned upon. I am unclear why you think avoiding namespace packages and simply moving packages to different repos is a better outcome. That would certainly cause more issues for end users. And you are mistaken in thinking this was done merely to break pymatgen into separate components (though we may indeed move certain less frequently used subpackages that are primarily maintained by external collaborators to separate repos). This is a forward move for more decentralized pymatgen functionality development by other developers. It is up to those developers to set their own cadence of release and manage their Issues and PRs. The main pymatgen distribution will focus on core functionality that caters to the majority of use cases by the broader community. This is really no different from the relationship between something like Jupyter or Flask and their various extensions and plugins. No one would think that the flask core developer has any role in setting the cadence of a plugin written by another developer. Finally, the 2022 versions are in fact a compromise towards temporary semantic version for the remainder of this year. I do not want to get into philosophical arguments about the pros and cons of semantic vs calendar versioning. |
Both your examples use multiple repositories over namespace packages, jupyter even specifically has
they would if the plugins were in the same repository as the core package
well I'm going to be interested to see how this works in the same repository; e.g. having release tags for multiple packages in the same git history 😬 But anyhow, I'm just giving my personal opinion, its entirely up to you ✌️ |
The plugins are not going to be in the same repository as the core package. That is the whole point. E.g., pymatgen-diffusion is not in the same repo as pymatgen. If all plugins are in the same repo, there is hardly any point in making namespace packages isn't it? Pymatgen main repo is now 681 Mb. I don't think anyone fancies it getting bigger. |
oh if you are saying that you are planning to move folders out of pymatgen into their own repositories then thats fine 👍 Personally, it feels like namespaces is an over-complication then and the namespace documentation even says:
(Note also, I say this having made a very similar change in one of the packages I maintain: executablebooks/markdown-it-py@bddc882, so I do put my money where my mouth is 😆) |
This is a final update on this channel. I just released a v2022.0.4. There are no breaking changes in this one - just new functionality for Element (ionization energies and electron affinities). I also did a backport to a v2021.3.9. Feel free to let me and @mkhorton know if any issues arise. Otherwise, I think v2022.0.3 onwards is reasonably stable to migrate to as and when you are ready to do so. We will refrain from making any other breaking changes. :-) |
Is your feature request related to a problem? Please describe.
As the github landing page nicely illustrates,
pymatgen
is used as a dependency in many other projects, some of which are again used as components of other projects and so on.pymatgen's calendar versioning approach essentially leaves developers of such projects with two options:
pymatgen==2021.3.4
.This has the disadvantage that your tool becomes incompatible with any other tool following the same strategy because their pymatgen dependency versions would clash.
pymatgen>=2020.3.4
.This means your project can (and likely will) break when pymatgen makes breaking changes (such as the recent 8f097cf )
These problems would be solved by switching back to semantic versioning, which would allow developers of upstream tools to depend on pymatgen without risking to break their projects (e.g.
pymatgen~=5.0
) while still making it possible to work together with other tools that share the pymatgen dependency.I know that pymatgen originally used semantic versioning and then switched to calendar versioning in 51411b1#diff-60f61ab7a8d1910d86d9fda2261620314edcae5894d5aaa236b821c7256badd7 in 2017.
I would kindly ask the developers to elaborate on the reasons for this decision and, potentially, to reconsider it.
The text was updated successfully, but these errors were encountered: