ftplib.cpython-312.pyc does not appear to be reproducible #124924
Comments
There was a reproducibility issue in 3.13 because of |
Indeed on 3.13.0rc2 I saw what was likely the |
If you know which specific bytecode differs, you can trace it back to the source code and bisect by removing lines until you find an offending line. Presumably you can then program a debugger to stop when that code is being generated, at which point the non-determinism should be observable. Such a tool would probably be of general interest, if nobody has built one already.
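The bisection suggested above needs a yes/no test for "does this source compile to the same bytes every time?". The following is a minimal sketch of such a test, not an existing tool: the names `compile_digest` and `is_reproducible` are hypothetical, and varying `PYTHONHASHSEED` is only one assumed source of run-to-run variation. It may well not trigger the difference reported in this issue.

```python
import os
import subprocess
import sys

# Child program: compile the source read from stdin and print a digest of
# the marshalled code object (the payload that ends up in a .pyc file).
_CHILD = (
    "import sys, marshal, hashlib\n"
    "src = sys.stdin.read()\n"
    "code = compile(src, 'candidate.py', 'exec')\n"
    "print(hashlib.sha256(marshal.dumps(code)).hexdigest())\n"
)


def compile_digest(source: str, hash_seed: int) -> str:
    """Compile `source` in a fresh interpreter and return a SHA-256 hex digest."""
    env = {**os.environ, "PYTHONHASHSEED": str(hash_seed)}
    result = subprocess.run(
        [sys.executable, "-c", _CHILD],
        input=source,
        capture_output=True,
        text=True,
        check=True,
        env=env,
    )
    return result.stdout.strip()


def is_reproducible(source: str, seeds=(0, 1, 2)) -> bool:
    """True if every fresh interpreter produced the same marshalled bytes."""
    return len({compile_digest(source, seed) for seed in seeds}) == 1
```

A bisection would then repeatedly cut the source down to a smaller, still syntactically valid candidate and keep whichever half still makes `is_reproducible` return `False`.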
Like I wrote in the summary:
|
I didn't make a distinction between the actual bytecode and its addresses. Either way, my approach is one way of finding the issue. It's of course possible you have a better idea. |
Gotcha. Those addresses differ pretty much 'across the board' in this file, though, so there aren't any particular offending lines in the source code. There's also not much to bisect: this problem appears to have existed since the early 3.12 release candidates. For example:
It's clear that these correspond to the classes the full diff
I'm not sure where to go from here, what would be your next step? |
What I meant is: identify one of the outputs as the desired output (that is, a mapping from byte address to byte code, in other words a "file"). Open a terminal and, in a debugger such as gdb, find the exact dynamic program point at which the differing value is generated. That gives you two pairs of the form (address, value), with potentially completely different values.
You should then be able to construct a mathematical expression that computes just that one wrong value (certainly in a time-travelling debugger, though I'm not sure whether gdb has a good implementation of that). By your observations, that expression cannot be deterministic. I suspect that data structures without deterministic properties are being used, or possibly (but less likely) even parallelism. The expression can be obtained by going up some frames and writing down what happens, in a separate file or on a piece of paper.
If you run this process in two separate terminals at the same time, for two executions that produce different outputs, you can compare frame by frame what is happening and find where the first dynamic difference occurs, because that is what you are looking for. Then it's just a matter of repeating the whole process for every new "first" dynamic difference until the problem is resolved.
So the brute-force method is writing down the whole mathematical expression, and a slightly more informed one is looking into the general structure of the code generator (which is not exactly an example of engineering). I don't think debuggers exist that can extract the expression I described automatically, which kind of shows how bad tools still are, but I think gdb could be programmed to output such an expression, if you are feeling adventurous.
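Before starting the debugger sessions described above, it can help to know the byte offset at which the two outputs first diverge, since that is the "address" to watch for. A minimal sketch, assuming both outputs have already been read into memory as bytes; the helper name `first_difference` is made up for illustration:

```python
def first_difference(a: bytes, b: bytes):
    """Return the first offset at which `a` and `b` differ, or None if equal."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input is a prefix of the other: they diverge where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None
```

This only locates the static difference in the output files. Finding the point in the running program that wrote that byte is still the job of the debugger.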
mainly for one-off tests, such as seeing if I can further reproduce python/cpython#124924
Bug report
Bug description:
Previously, building CPython 3.11 twice on independent infrastructure (for example, when building the binary package for the NixOS Linux distribution) produced a bit-for-bit identical result, given the right parameters.
For CPython 3.12 (tested with various versions up to 3.12.6), the result is almost deterministic:
It seems specifically ftplib.cpython-312.pyc, ftplib.cpython-312.opt-1.pyc and ftplib.cpython-312.opt-2.pyc may differ between rebuilds. This is strange, as I don't see anything special about the contents of ftplib.py or about the way it is built. Looking at the bytecode with pydisasm from xdis only shows differences in 'code object' addresses, not in 'actual' bytecode.
Does anyone have any idea where this might be coming from?
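One way to double-check the observation that only the encoding differs, and not the 'actual' bytecode, is to unmarshal both files and compare the resulting code objects structurally. The sketch below uses only the standard library. It assumes the 16-byte .pyc header used since Python 3.7 (PEP 552) and must be run with the same Python version that produced the files. The helper names are illustrative and are not part of xdis or CPython.

```python
import marshal
import types

# .pyc header: 4-byte magic, 4-byte flags, then 8 bytes holding either
# mtime + source size or a source hash (PEP 552).
PYC_HEADER_SIZE = 16


def load_code(pyc_bytes: bytes) -> types.CodeType:
    """Unmarshal the module code object stored after the .pyc header."""
    return marshal.loads(pyc_bytes[PYC_HEADER_SIZE:])


def _const_signature(const):
    if isinstance(const, types.CodeType):
        return code_signature(const)
    if isinstance(const, frozenset):
        # Sort so the comparison does not depend on set iteration order.
        return ("frozenset", repr(sorted(const, key=repr)))
    return (type(const).__name__, repr(const))


def code_signature(code: types.CodeType):
    """Reduce a code object to plain data, recursing into nested code objects."""
    consts = tuple(_const_signature(c) for c in code.co_consts)
    return (code.co_name, code.co_code, code.co_names, code.co_varnames, consts)


def same_bytecode(pyc_a: bytes, pyc_b: bytes) -> bool:
    """True if both .pyc payloads describe structurally identical code."""
    return code_signature(load_code(pyc_a)) == code_signature(load_code(pyc_b))
```

If `same_bytecode` returns `True` for two files whose raw bytes differ, the difference lies in how the objects were serialized rather than in the compiled code itself.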
CPython versions tested on:
3.12
Operating systems tested on:
Linux