objects.inv is not generated deterministically when there are duplicate references #12001

Open
raboof opened this issue Feb 23, 2024 · 4 comments

raboof commented Feb 23, 2024

Describe the bug

When the same sections are found in multiple files (as in https://gitlab.com/qemu-project/qemu/-/issues/2190), it is not deterministic which reference gets included in objects.inv.

How to Reproduce

Unfortunately I've not been able to trigger this problem in a minimal example yet. In theory it should be nondeterministic with an empty conf.py, and:

index.rst:

.. include:: toinclude.rst.inc

index.rst:

.. include:: toinclude.rst.inc

toinclude.rst.inc

foo
---

bar

... but I haven't seen it produce different objects.inv indexes with this minimal example, so there might be more going on.
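A minimal sketch of how the layout above could be generated and checked automatically. It assumes Sphinx is installed in the current interpreter; the name `other.rst` for the second document is hypothetical (the report does not name it), and the helper names are illustrative only.

```python
import hashlib
import subprocess
import sys
from pathlib import Path

# OTHER_DOC is a hypothetical name for the second document; the report
# only says that two documents include the same file.
OTHER_DOC = "other.rst"
INCLUDE = ".. include:: toinclude.rst.inc\n"
SHARED = "foo\n---\n\nbar\n"


def write_project(root: Path) -> None:
    """Create the minimal project from the report under root."""
    root.mkdir(parents=True, exist_ok=True)
    (root / "conf.py").write_text("", encoding="utf-8")  # empty conf.py
    (root / "index.rst").write_text(INCLUDE, encoding="utf-8")
    (root / OTHER_DOC).write_text(INCLUDE, encoding="utf-8")
    (root / "toinclude.rst.inc").write_text(SHARED, encoding="utf-8")


def build_and_hash(src: Path, out: Path) -> str:
    """Run a fresh HTML build and return the SHA-256 of its objects.inv."""
    subprocess.run(
        [sys.executable, "-m", "sphinx", "-q", "-E", "-b", "html",
         str(src), str(out)],
        check=True,
    )
    return hashlib.sha256((out / "objects.inv").read_bytes()).hexdigest()


def distinct_inventories(src: Path, out_root: Path, runs: int = 5) -> int:
    """Build several times and count how many different inventories appear."""
    digests = {build_and_hash(src, out_root / f"out{i}") for i in range(runs)}
    return len(digests)
```

Calling `write_project(src)` and then `distinct_inventories(src, out_root)` returns 1 when every build produced an identical objects.inv. As noted above, this small project may well never show a difference on a single machine.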

Environment Information

Platform:              linux; (Linux-6.7.4-x86_64-with-glibc2.38)
Python version:        3.11.7 (main, Dec  4 2023, 18:10:11) [GCC 13.2.0])
Python implementation: CPython
Sphinx version:        7.2.6
Docutils version:      0.20.1
Jinja2 version:        3.1.3
Pygments version:      2.17.2

Sphinx extensions

No response

Additional context

https://github.com/sphinx-doc/sphinx/blob/bc74a6223caa72c39b8ccad3f17202dcb098c918/sphinx/domains/python.py#L1546C23-L1553 might be relevant

Member

picnixz commented Feb 23, 2024

Quick comment: it's probably because of parallel read/merge and the fact that the files could possibly be discovered in a nondeterministic way. I can investigate this tomorrow.

picnixz self-assigned this on Feb 24, 2024
Member

picnixz commented Feb 24, 2024

I can't reproduce this one, even with a somewhat smaller version of the QEMU docs. I think the issue might come from the fact that the QEMU docs have a lot of internal extensions, and maybe some of them mess things up. Also, it appears that only one label is being created in the intersphinx inventory for this driver title (you can inspect the inventory using python -m sphinx.ext.intersphinx FILE_OR_URL).
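Since only one entry is emitted per label, the interesting question is which URI that entry points to in each build. A rough sketch for comparing two inventories with the standard library only, assuming the version-2 objects.inv layout (four plain-text header lines followed by a zlib-compressed body); the function names are illustrative and are not part of Sphinx:

```python
import re
import zlib

# One entry per line in the decompressed body:
#   name domain:role priority uri display-name
ENTRY = re.compile(r"(.+?)\s+(\S+)\s+(-?\d+)\s+(\S*)\s+(.*)")


def read_inventory(data: bytes) -> list[tuple[str, str, str]]:
    """Return (name, domain:role, uri) for each entry of a v2 objects.inv."""
    parts = data.split(b"\n", 4)  # four header lines, then the zlib payload
    if len(parts) != 5 or not parts[0].startswith(b"# Sphinx inventory version 2"):
        raise ValueError("not a version-2 Sphinx inventory")
    body = zlib.decompress(parts[4]).decode("utf-8")
    entries = []
    for line in body.splitlines():
        match = ENTRY.fullmatch(line.rstrip())
        if match:
            name, kind, _priority, uri, _display = match.groups()
            entries.append((name, kind, uri))
    return entries


def uri_differences(a: bytes, b: bytes) -> dict[tuple[str, str], tuple[str, str]]:
    """Map (name, domain:role) to both URIs wherever two inventories disagree."""
    left = {(name, kind): uri for name, kind, uri in read_inventory(a)}
    right = {(name, kind): uri for name, kind, uri in read_inventory(b)}
    return {
        key: (left[key], right[key])
        for key in left.keys() & right.keys()
        if left[key] != right[key]
    }
```

Feeding it the bytes of objects.inv from two builds of the same sources would list any label whose target changed between builds. URIs are compared as stored, so the trailing `$` shorthand is not expanded.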

project.zip

For now, I'll close the issue until you find an MWE (otherwise this issue will likely stay open for years).

picnixz closed this as not planned on Feb 24, 2024
@jayaddison
Contributor

A theory and a suggestion:

I think that a likely cause of this is variance in the order in which source documentation files are read from the filesystem during a Sphinx project build. That could explain why it's tricky to replicate on a single machine/filesystem -- because in isolation, that filesystem may return results in a fairly or entirely deterministic order -- and it could also mean that it's tricky to write a traditional unit test case for this, because uncovering the problem would rely on behaviour outside of the Sphinx codebase.
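To illustrate the theory, here is a toy model, not Sphinx's actual code. It assumes that labels are merged document by document and that a later document silently overwrites an earlier one; the document names and URIs are made up.

```python
from itertools import permutations

# Hypothetical data: two documents that both define the label "foo",
# as happens when one file is included from several pages.
LABELS_BY_DOC = {
    "index": {"foo": "index.html#foo"},
    "other": {"foo": "other.html#foo"},
}


def merge_labels(docnames):
    """Merge per-document labels in the given order; later documents win."""
    merged = {}
    for docname in docnames:
        merged.update(LABELS_BY_DOC[docname])
    return merged


def possible_targets(sort_first):
    """Collect every target 'foo' can end up with across all read orders."""
    targets = set()
    for order in permutations(LABELS_BY_DOC):
        docnames = sorted(order) if sort_first else list(order)
        targets.add(merge_labels(docnames)["foo"])
    return targets


print(possible_targets(sort_first=False))  # two possible targets
print(possible_targets(sort_first=True))   # a single target
```

Under these assumptions the winning target depends only on the read order, and sorting the document names before merging removes the ambiguity. On one filesystem the order may never change, which would match the difficulty of reproducing this locally.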

I'm wondering whether to start work on a continuous integration test to attempt to smoke this out. If I did -- this is the suggestion part -- I'd probably begin by adding disorderfs to the GitHub Actions unit test workflows. disorderfs is a userspace filesystem that can return filesystem results in randomized order, and is available as a Debian package.

github-actions bot locked as resolved and limited conversation to collaborators on Mar 27, 2024
sphinx-doc unlocked this conversation on Sep 29, 2024
jayaddison reopened this on Sep 29, 2024
@jayaddison
Contributor

Based on recent build reproducibility test results, I believe this bug remains valid. I'll try to confirm that with a minimal example soon; from the linked QEMU bug report it seems that inclusion/re-use of definitions within multiple pages may be a contributing factor. I think this is separate from the table-of-contents ordering ambiguity recently investigated in #6714.
