Configuration hash in BuildInfo may be unstable #11777
Comments
I think such things should be referred to by their fully-qualified name; the extension is then responsible for importing them. On our side, we would need to enforce that configuration values are only JSON-like objects and disallow arbitrary references. Alternatively, we could change how we serialize functions by hashing their corresponding disassembly operations (which should technically be unique if it doesn't change, unless they have assertions and one build is with I prefer the second approach, where we special-case the hashing of objects. Note: the function name is not necessarily unique (even its fully-qualified name may be distinct). However, I think its disassembly should be unique.
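A rough sketch of what hashing a function by its disassembly could look like. The helper `bytecode_digest` and the `option_*` functions are hypothetical names used only for illustration; none of this is Sphinx code. It assumes CPython, where `__code__.co_code` holds the raw bytecode.

```python
import hashlib


def bytecode_digest(func):
    """Hash a function's raw bytecode plus the names it references."""
    code = func.__code__
    hasher = hashlib.sha256()
    hasher.update(code.co_code)
    hasher.update("\0".join(code.co_names).encode("utf-8"))
    return hasher.hexdigest()


def helper(value):
    return value * 2


def option_a(value):
    # Only the *name* "helper" ends up in this function's bytecode.
    return helper(value)


def option_b(value):
    return helper(value)


def option_c(value):
    return value + 1


same_body = bytecode_digest(option_a) == bytecode_digest(option_b)
different_body = bytecode_digest(option_a) != bytecode_digest(option_c)
```

The digest has no memory address in it, so it stays the same across runs of the same interpreter. It reflects only the function's own body: editing `helper` would leave the digest of `option_a` unchanged, and bytecode can differ between Python versions.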
I think that using a Python file for the configuration but then imposing limitations on the allowed types for configuration values might lead to confusion. Apart from that, changing this now would be a breaking change and require a deprecation first, or not?
Do you mean using the
When the code in this function changes, this should be reliably detectable by looking at the output of the disassembly. But when the implementation of
Yes, I think that would be the best way as well. Requiring JSON-like values and using qualified names as option values is equivalent to hashing the qualified name of a function now. Neither option can detect whether the behavior of a function changes; both can only detect whether the qualified function name is the same. My suggestion is to add a special case that hashes functions using their qualified name, as a bug fix now. Optionally, a deprecation warning could be added to inform users and maintainers of extensions that this behavior will not be supported in the future. But I do not see a real benefit in doing that. The unstable hash will already be fixed, and I don't know of any other problems caused by using arbitrary references here. If we agree on the best way to fix this, I can create a pull request to implement the necessary changes.
Ah yes, concerning the disassembly, I missed that point. It could also have been a huge dump, so I don't think it's worth it. Technically it's possible to get a deterministic hash by unrolling everything, but that's the worst idea in practice. I think we could live with just storing the name + module name. But we should probably update the docs for extension developers and for people interested in reproducible builds. Before implementing anything, I would like some comments from other members (probably after New Year's Eve). Maybe they (or people worried about reproducibility) have other ideas.
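A minimal sketch of the "name + module name" idea, assuming a hypothetical `stable_repr` helper that special-cases functions and classes before hashing. This is not the code from the eventual fix, only an illustration of the approach.

```python
import hashlib
import types


def stable_repr(value):
    """Build a string for `value` that contains no memory addresses."""
    if isinstance(value, dict):
        # Sort keys so that insertion order does not affect the result.
        items = sorted((str(k), stable_repr(v)) for k, v in value.items())
        return "{" + ", ".join(f"{k}: {v}" for k, v in items) + "}"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(stable_repr(v) for v in value) + "]"
    if isinstance(value, (types.FunctionType, type)):
        # Special case: identify callables by module and qualified name.
        return f"{value.__module__}.{value.__qualname__}"
    return repr(value)


def stable_digest(value):
    return hashlib.sha256(stable_repr(value).encode("utf-8")).hexdigest()


def hook(app):
    return None


digest_one = stable_digest({"callback": hook, "levels": [1, 2]})
digest_two = stable_digest({"levels": [1, 2], "callback": hook})
```

As noted in the discussion, this only detects whether the qualified name changed, not whether the function's behavior did. Lambdas all share the qualified name `<lambda>`, so they would not be told apart.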
Resolved (hopefully) in 959f73f. A |
Describe the bug
The Sphinx HTML builder verifies that the configuration is unchanged by hashing all configuration values and storing this hash in a .buildinfo file. On the next run, the current and the previous hash are compared to determine whether the configuration has changed. If a change is detected, all source files are marked as out of date and rebuilt.

Given that the configuration is a Python file, a reference to a function can be used as a configuration value. Some Sphinx extensions make use of this and let the user define custom functions for specific tasks. One example of this is the Sphinx-Gallery extension.
Quoting one example from the Sphinx-Gallery documentation:
The Sphinx HTML builder hashes the string representation of the configuration value object:
sphinx/sphinx/builders/html/__init__.py, line 88 in 3596590
The string representation of a function is of course something like <function test at 0x0000020C4644D6C0>, where the memory address changes on every run.

This means that when using any extension that takes a reference to a function as a configuration value, the documentation will be completely rebuilt on every run.
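The effect can be illustrated with a short, simplified sketch. The function `sample_hook`, the `config_values` dictionary, and the use of `hashlib.md5` are stand-ins chosen for illustration, not Sphinx's actual code.

```python
import hashlib
import re


def sample_hook(app):
    """Stand-in for a user-defined callable placed in conf.py."""
    return None


config_values = {"project": "demo", "hook": sample_hook}

# str() of a plain function embeds its memory address.
text = str(config_values["hook"])
has_address = re.search(r" at 0x[0-9A-Fa-f]+>$", text) is not None

# Hashing that string therefore depends on where the object happens to live.
digest = hashlib.md5(text.encode("utf-8")).hexdigest()
print(text)
print(digest)
```

Running this script twice will usually print different addresses and therefore different digests, which is what makes the stored and the current hash disagree.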
Issues caused by this:
It is, of course, debatable whether this is a bug. But given that the configuration file is a Python file, it is hardly surprising that people want to pass a function object as a configuration value.
It is not reliably possible to actually hash the code within such a function, therefore it is not possible for Sphinx to determine if the function itself has changed.
I see three potential options here:
In my opinion, a user would not expect Sphinx to automatically recognize a change in a user-provided function, so option 1 or 2 would be acceptable.
How to Reproduce
Add the following content to the end of the default conf.py in the source directory:

Run make html and notice that all HTML files will always be rebuilt without making any changes between runs. If you run with the -vv option (or higher) you will find [build target] did not match: build_info in the output.

Environment Information
Sphinx extensions
Additional context
No response