Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: Environment.file_to_rebuild: pickle to sorted list instead of set #12870

Conversation

jayaddison
Copy link
Contributor

@jayaddison jayaddison commented Sep 8, 2024

Feature or Bugfix

  • Bugfix

Purpose

  • Attempt to resolve nondeterministic toctree rendering, by removing randomisable runtime set content ordering (that may be pickled to .doctree files).

Detail

  • Instead of storing Environment.files_to_rebuild -- a cached/pickled doctree attribute that is used as an ordered iterator by some of the table-of-contents generation code -- using Python's built-in set datatype, use the built-in list type instead.
  • Maintain some performance scalability opportunities by ensuring that the list value is stored in sorted order.
  • May introduce some runtime performance regression. Building ephemeral SortedList instances after deserialization of pickled .doctree contents, and then subsequently re-casting to list before pickling could improve this.
  • Requires an increment to the Sphinx env pickle version number (63 to 64).

Relates

cc @bmwiedemann

@bmwiedemann
Copy link
Contributor

I tried this patch on top of 8.0.2 and still got similar diffs: when comparing builds on 1-core-VM and 4-core-VM: https://rb.zq1.de/other/kernel-docs/kernel-source-compare.out

@bmwiedemann
Copy link
Contributor

also note, that in the diff, not just the order changes, but whole blocks go missing.

@jayaddison
Copy link
Contributor Author

Ok, thank you. I'd suggest we focus on one of the ToC entries that is entirely absent during one of the builds, and try to determine where that inclusion/exclusion originates. One opportunity for that is the Core API Documentation entry, I think.

@jayaddison
Copy link
Contributor Author

My working theory in this branch was that the .doctree could be a causal link in the nondeterminism of the ToC trees; for the record there were three sources of nondeterminism in .doctree writing:

  • The build path is embedded in some filenames in the file. I resolved these by using a fixed build path for each build.
  • Dynamic timestamps are associated with each docname in the file by this line of code -- I resolved this temporarily by using a fixed/constant value, but it would be better handled by using SOURCE_DATE_EPOCH instead.
  • The ordering of some of the data within the file was not deterministic -- this branch resolved that factor, at least for some small-scale test builds (of test-toctree in particular).

@jayaddison
Copy link
Contributor Author

Ok, thank you. I'd suggest we focus on one of the ToC entries that is entirely absent during one of the builds, and try to determine where that inclusion/exclusion originates. One opportunity for that is the Core API Documentation entry, I think.

The findings at #6714 (comment) uncover more about what was happening there: basically, some elements in the HTML navigational menu are included/omitted based on build-time non-determinism.

On further inspection I note that env.numbered_toctrees also appears to be a factor in the traversal order of the toctree nodes. Perhaps it too may require item-order stabilization? (currently it appears to be stored in a Python set). Note that this is not necessarily the same as sorted / lexicographic order; sometimes there are reasons to retain a different ordering.

@jayaddison
Copy link
Contributor Author

Ok, thank you. I'd suggest we focus on one of the ToC entries that is entirely absent during one of the builds, and try to determine where that inclusion/exclusion originates. One opportunity for that is the Core API Documentation entry, I think.

The findings at #6714 (comment) uncover more about what was happening there: basically, some elements in the HTML navigational menu are included/omitted based on build-time non-determinism.

On further inspection I note that env.numbered_toctrees also appears to be a factor in the traversal order of the toctree nodes. Perhaps it too may require item-order stabilization? (currently it appears to be stored in a Python set). Note that this is not necessarily the same as sorted / lexicographic order; sometimes there are reasons to retain a different ordering.

Nope; seems more likely to be env.glob_toctrees that's relevant here, now that I think about it; at least in the case of these navigation menu elements.

@jayaddison
Copy link
Contributor Author

Closing; this changeset doesn't resolve the linked issue as-is, and I'm not currently aware of a pressing need to make the (intermediate output) .doctree files deterministic.

@jayaddison jayaddison closed this Sep 29, 2024
@jayaddison jayaddison deleted the issue-6714/toctree-build-determinism branch September 29, 2024 17:42
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 30, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
builder:html dependencies Pull requests that update a dependency file internals:environment internals:toctree python Pull requests that update Python code type:performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sphinx-build -jauto has nondeterministic html results
2 participants