Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ossi --stripped tests are producing files with a different checksum #109

Closed
chenxiaolong opened this issue Jun 28, 2023 · 6 comments · Fixed by #110 or #114
Closed

ossi --stripped tests are producing files with a different checksum #109

chenxiaolong opened this issue Jun 28, 2023 · 6 comments · Fixed by #110 or #114
Assignees

Comments

@chenxiaolong
Copy link
Owner

chenxiaolong commented Jun 28, 2023

Exception: /home/runner/work/avbroot/avbroot/tests/files/ossi/4cacbe5e6a3ab6a6fade68cc40f44d0fa6a2928a.zip.stripped.patched: expected sha256 c6abc228354ee6f4e338fce6860884c0aa6d4d2a72920e411c491ff52278702d, but have 499b28a7facf35636288fa038de0b32ef54b1638504e975a22dc376b01a85ed9

(from https://github.com/chenxiaolong/avbroot/actions/runs/5351406765/jobs/9820732563)

@chenxiaolong chenxiaolong self-assigned this Jun 28, 2023
@chenxiaolong
Copy link
Owner Author

This doesn't seem to be a cache key confusion issue. These both restored the stripped image from the same cache (img-a72311b29e042cea2882e7f8d10ee53e472b127ea4ec7bc974f0d3a47de178cf-ossi).

Succeeded: https://github.com/chenxiaolong/avbroot/actions/runs/5272536671/jobs/9534971215

Failed: https://github.com/chenxiaolong/avbroot/actions/runs/5351406765/jobs/9705197207

I'm not having any luck reproducing this locally yet (Fedora 38 + python3-3.11.3-2.fc38.x86_64).

@chenxiaolong
Copy link
Owner Author

OK, I can reproduce this with:

./tests/tests_containerized.py -d alpine --rebuild -- -d ossi --stripped

but not with any of the other container images. Still investigating.

@chenxiaolong
Copy link
Owner Author

The file sizes are identical and the differing bytes are in the payload.bin local header and the central directory. I'm going to dig through the cpython zipfile implementation to see if something changed recently.

❯ diff -u <(hexdump -C 4cacbe5e6a3ab6a6fade68cc40f44d0fa6a2928a.zip.stripped.patched) <(hexdump -C 4cacbe5e6a3ab6a6fade68cc40f44d0fa6a2928a.zip.stripped.alpine)
--- /proc/self/fd/11    2023-06-28 16:16:01.825188990 -0400
+++ /proc/self/fd/12    2023-06-28 16:16:01.826188988 -0400
@@ -1,5 +1,5 @@
 00000000  50 4b 03 04 2d 00 08 00  00 00 00 00 21 3a 00 00  |PK..-.......!:..|
-00000010  00 00 00 00 00 00 00 00  00 00 0b 00 28 00 70 61  |............(.pa|
+00000010  00 00 ff ff ff ff ff ff  ff ff 0b 00 28 00 70 61  |............(.pa|
 00000020  79 6c 6f 61 64 2e 62 69  6e 01 00 10 00 c4 b2 5f  |yload.bin......_|
 00000030  32 01 00 00 00 c4 b2 5f  32 01 00 00 00 01 00 10  |2......_2.......|
 00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
@@ -3092728,37 +3092728,37 @@
 1323d4890  14 13 9f 56 2e d1 7f ae  a3 03 74 dc cd 76 3e 99  |...V......t..v>.|
 1323d48a0  24 af 49 db e2 30 0b 06  09 60 86 48 01 65 03 04  |$.I..0...`.H.e..|
 1323d48b0  02 01 30 0d 06 09 2a 86  48 86 f7 0d 01 01 01 05  |..0...*.H.......|
-1323d48c0  00 04 82 02 00 8d da 93  55 c3 41 d6 7a cb 8c 9d  |........U.A.z...|
-1323d48d0  c2 4d 4a 20 7b 99 53 0b  1c 92 8a 36 4f 8a 09 9c  |.MJ {.S....6O...|
-1323d48e0  36 ad 2f 35 17 5d 93 f4  1c 88 f8 1c 30 85 16 49  |6./5.]......0..I|
-1323d48f0  5a 2b af 88 0b 0d c4 30  e0 a0 f8 83 27 bc 66 0c  |Z+.....0....'.f.|
-1323d4900  7e 38 6a 1c c9 48 55 94  a0 64 61 34 28 88 ab 63  |~8j..HU..da4(..c|
-1323d4910  d2 8a 18 bc 8d d5 3e cb  f2 20 d4 4e 86 4b ed 92  |......>.. .N.K..|
-1323d4920  c5 80 49 a2 46 8e fb 38  f1 bb 03 5f 63 13 ea 15  |..I.F..8..._c...|
-1323d4930  24 aa 1a 14 55 0b 39 aa  81 98 ee 4e f8 65 4a 7f  |$...U.9....N.eJ.|
-1323d4940  42 f7 85 57 07 63 0c 13  10 27 08 fe d0 b5 71 88  |B..W.c...'....q.|
-1323d4950  6a b4 26 99 bd 83 fb 5b  31 46 7a c3 b3 0d 64 b7  |j.&....[1Fz...d.|
-1323d4960  a2 e1 32 c8 40 55 cc 67  43 88 24 20 a6 02 4b 37  |..2.@U.gC.$ ..K7|
-1323d4970  ca be 61 7d 21 40 47 90  20 c5 8b 54 05 1b 4a 88  |..a}!@G. ..T..J.|
-1323d4980  cf 2d 14 91 13 04 f5 11  45 dd da d8 fa db 49 cb  |.-......E.....I.|
-1323d4990  7a 76 f8 f9 f1 8f 96 f6  7e bd c8 36 e3 95 18 74  |zv......~..6...t|
-1323d49a0  97 25 86 8d 22 07 c4 76  46 f2 49 3c 6c 98 69 51  |.%.."..vF.I<l.iQ|
-1323d49b0  c6 5b 59 6a 0a 3f 9e 04  10 4a 7d 7a 1f a0 47 19  |.[Yj.?...J}z..G.|
-1323d49c0  bd 8e a3 a2 6a 7f 45 53  a5 a6 3e c6 2a 1b 09 48  |....j.ES..>.*..H|
-1323d49d0  0c 01 ad 2a ef 4f dc 4e  2f 07 d4 94 98 d3 78 dc  |...*.O.N/.....x.|
-1323d49e0  0f 27 5c 0d ba 2f b3 10  06 2e c6 50 b7 35 d9 37  |.'\../.....P.5.7|
-1323d49f0  43 59 59 75 09 d1 e0 6b  93 27 f7 5b 3c 01 3d e0  |CYYu...k.'.[<.=.|
-1323d4a00  04 7c ab a0 c6 f8 50 ec  35 cc dc 04 33 17 1a f1  |.|....P.5...3...|
-1323d4a10  89 74 55 97 50 1b 98 58  16 dd bc 6e 8f 5a ff ca  |.tU.P..X...n.Z..|
-1323d4a20  db 37 cd 05 c8 ce 20 66  5c 33 48 ed 62 8e c8 a7  |.7.... f\3H.b...|
-1323d4a30  ae 85 91 8e ab 0a 1f d7  68 82 02 b7 a5 76 b0 e2  |........h....v..|
-1323d4a40  3c d2 0b 2b 52 4a 03 6d  b2 1b 76 ac f2 9f 68 ae  |<..+RJ.m..v...h.|
-1323d4a50  bb 77 c8 c2 ef a8 99 cc  df b7 41 e0 3f c7 9a 78  |.w........A.?..x|
-1323d4a60  c6 bb 2a ac a5 ab 08 9e  7f a6 86 3a 87 8b a6 05  |..*........:....|
-1323d4a70  0d 3c 42 d6 ef f7 45 0a  49 67 f0 28 e4 38 ee 34  |.<B...E.Ig.(.8.4|
-1323d4a80  99 59 22 9a 06 e0 c1 55  f8 1f e5 c5 bb 6a f2 b5  |.Y"....U.....j..|
-1323d4a90  d3 ec fc 0d ef 0a 83 1a  f2 d2 99 92 35 d5 ab 0f  |............5...|
-1323d4aa0  05 d8 cf ac 65 d9 eb 33  c1 f0 7d e8 01 5c 76 6b  |....e..3..}..\vk|
-1323d4ab0  1e d4 25 4d 11 6d c7 12  ce ba 40 66 84 a3 b4 1b  |..%M.m....@f....|
-1323d4ac0  3c 81 16 9f fa c6 07 ff  ff d8 07                 |<..........|
+1323d48c0  00 04 82 02 00 49 2e 68  b3 49 64 fe f5 58 a3 ce  |.....I.h.Id..X..|
+1323d48d0  88 3e 19 3d b0 0f 43 cd  58 fb cf f0 52 85 56 c0  |.>.=..C.X...R.V.|
+1323d48e0  09 33 80 54 8c e6 08 2b  26 31 2d 6d b4 f0 30 43  |.3.T...+&1-m..0C|
+1323d48f0  b5 5a e0 a7 8d b2 0d 56  1e 3f 68 30 72 82 0d c7  |.Z.....V.?h0r...|
+1323d4900  cb e5 b1 a5 b6 5f 6e bf  88 e2 31 33 15 4a 96 e5  |....._n...13.J..|
+1323d4910  b7 21 3a 69 7a 2c 00 9d  04 54 27 ec af 91 fe 38  |.!:iz,...T'....8|
+1323d4920  28 b6 4a 9f f5 3e a4 a9  65 23 0f f7 0d 6d 71 55  |(.J..>..e#...mqU|
+1323d4930  cc 33 8c 62 49 a9 7d 2c  9e 2a 89 40 54 29 70 19  |.3.bI.},.*.@T)p.|
+1323d4940  aa 16 de 80 04 97 81 c2  68 4e 0b f5 65 73 87 76  |........hN..es.v|
+1323d4950  4e 86 fd 75 50 bd 3e ca  ec 92 7f 53 41 66 5c 2e  |N..uP.>....SAf\.|
+1323d4960  2f 5a 0e 6b 49 2b d8 98  9f 47 e1 5e 65 15 ad 3b  |/Z.kI+...G.^e..;|
+1323d4970  9e f1 c3 40 19 4a b9 d3  ae 94 b4 42 85 ad d1 8f  |...@.J.....B....|
+1323d4980  7e 0d 9c 0f fc 73 6f ef  84 de 70 8f b7 0c b5 81  |~....so...p.....|
+1323d4990  67 8c eb 35 6d 44 48 5f  40 0c fb 6c b8 f3 de 27  |g..5mDH_@..l...'|
+1323d49a0  73 df 25 04 25 1d fb 00  a3 9d 7f b6 11 fb be aa  |s.%.%...........|
+1323d49b0  e9 b6 d8 87 d6 ab 28 f6  61 d8 c8 28 78 8b c8 5c  |......(.a..(x..\|
+1323d49c0  3b af 3c 3e e1 51 40 99  ed 5a e1 2b c8 39 d0 d6  |;.<>.Q@..Z.+.9..|
+1323d49d0  06 0e de 03 8d ba 2f 10  79 4c 0b 3a fe 50 30 53  |....../.yL.:.P0S|
+1323d49e0  e8 4c 96 d1 9f bb 6a 18  a7 aa 96 4a 4c be 7f 77  |.L....j....JL..w|
+1323d49f0  93 9f d3 76 57 e7 fc 53  c5 d2 93 58 ea 65 3b 3b  |...vW..S...X.e;;|
+1323d4a00  9b b8 df d1 53 a2 35 a8  9d db a7 06 6f 14 e1 7c  |....S.5.....o..||
+1323d4a10  34 5f b7 6a 98 14 e1 c6  e1 4e 21 9f d3 49 d6 ff  |4_.j.....N!..I..|
+1323d4a20  73 60 6f 71 c6 63 68 cf  a7 cd 90 0a 49 ea 01 39  |s`oq.ch.....I..9|
+1323d4a30  32 be db 31 6b 3a 66 e3  92 a9 df 52 1a fe ae 60  |2..1k:f....R...`|
+1323d4a40  8b ef e4 63 c2 6b e3 54  8e 57 eb 59 ba d3 90 e3  |...c.k.T.W.Y....|
+1323d4a50  5f c0 71 59 b1 1b 51 d9  fc a7 48 e3 a9 2c 64 83  |_.qY..Q...H..,d.|
+1323d4a60  40 76 19 62 3c 84 96 b2  a8 b5 91 b2 b9 6c c8 db  |@v.b<........l..|
+1323d4a70  86 b8 c6 9f 67 d4 5d 61  87 2e 57 40 1b a7 42 9c  |....g.]a..W@..B.|
+1323d4a80  33 4e df dc 2f 31 1d a5  10 61 f0 99 b4 c6 0a 2a  |3N../1...a.....*|
+1323d4a90  11 40 d2 df 3d f2 9a 22  88 3b 01 b4 0f fb 16 27  |.@..=..".;.....'|
+1323d4aa0  09 12 aa 91 db 37 d3 81  13 87 72 a9 dc ac cd dc  |.....7....r.....|
+1323d4ab0  4d 37 ac 0c 40 46 e7 8a  4b 06 7f e5 a3 fb 4c a3  |M7..@F..K.....L.|
+1323d4ac0  0f d6 01 ef cc c6 07 ff  ff d8 07                 |...........|
 1323d4acb

@chenxiaolong
Copy link
Owner Author

chenxiaolong commented Jun 28, 2023

(EDIT: Removed incorrect statement due to forgetting that we are using streaming writes.)

This is caused by: python/cpython@133bf09


I found an additional issue where multiple zip64 extra records are being written. The first one is carried over from the original file, which is incorrect. This second issue is an avbroot bug, not zipfile.

000000000 LOCAL HEADER #1       04034B50
000000004 Extract Zip Spec      2D '4.5'
000000005 Extract OS            00 'MS-DOS'
000000006 General Purpose Flag  0008
          [Bit  3]              1 'Streamed'
000000008 Compression Method    0000 'Stored'
00000000A Last Mod Time         3A210000 'Wed Dec 31 19:00:00 2008'
00000000E CRC                   00000000
000000012 Compressed Length     00000000
000000016 Uncompressed Length   00000000
00000001A Filename Length       000B
00000001C Extra Length          0028
00000001E Filename              'payload.bin'
000000029 Extra ID #0001        0001 'ZIP64'
00000002B   Length              0010
00000002D   Uncompressed Size   00000001325FB2C4
000000035   Compressed Size     00000001325FB2C4
00000003D Extra ID #0002        0001 'ZIP64'
00000003F   Length              0010
000000041   Uncompressed Size   0000000000000000
000000049   Compressed Size     0000000000000000
000000051 PAYLOAD

@chenxiaolong
Copy link
Owner Author

Python 3.11.4's zipfile writing 0xffffffff for the local file header compressed/uncompressed size fields is a regression. As far as I understand, this is a violation of the spec (though most zip readers can handle it). I've reported an issue upstream: python/cpython#106218

In the meantime, to get things byte-for-byte reproducible again, we need to:

  • make sure the local file header's compressed and uncompressed sizes are set to 0x00000000 for zip64 entries
  • remove the duplicate zip64 extra record

chenxiaolong added a commit that referenced this issue Jun 28, 2023
This commit fixes two issues:

* Python 3.11.4 has a regression where it sets the local file header's
  compressed and uncompressed size fields to 0xffffffff when data
  descriptors are used. These should be set to 0 (along with the CRC)
  according to §4.4.4 of the spec. We work around this by monkey
  patching zipfile's local file header serialization function to correct
  the fields.
* avbroot tries to preserve as much metadata from the original zip as
  possible, including the `extra` records. This caused zip64 records
  (0x0001) to be duplicated. The first instance was from the original
  zip (containing invalid size fields) and the second instance was newly
  created by zipfile. We fix this by stripping out all 0x0001 records,
  similar to what we already do for 0xd935.

Fixes: #109

Signed-off-by: Andrew Gunnerson <accounts+github@chiller3.com>
@chenxiaolong
Copy link
Owner Author

I misunderstood the zip spec: python/cpython#106218 (comment)

The Python 3.11.4+ behavior is correct. I will open a PR that undoes #110 and "backports" the new Python behavior to older Python versions.

@chenxiaolong chenxiaolong reopened this Jul 3, 2023
chenxiaolong added a commit that referenced this issue Jul 3, 2023
header" from §4.4.4 in the spec as referring to the two 32-bit size
fields in the local header. However, the spec considers the zip64 extra
record as part of the local header. The correct behavior for writing
zip64 entries to unseekable files is 0xffffffff in the 32-bit local
header size fields (to enable zip64), 0 in the zip64 extra record (due
to streaming writes), and the actual sizes in the data descriptor.

Fixes: #109

Signed-off-by: Andrew Gunnerson <accounts+github@chiller3.com>
chenxiaolong added a commit that referenced this issue Jul 3, 2023
PR #110 fixed this incorrectly. I misinterpreted "set to zero in the
local header" from §4.4.4 in the spec as referring to the two 32-bit
size fields in the local header. However, the spec considers the zip64
extra record as part of the local header. The correct behavior for
writing zip64 entries to unseekable files is 0xffffffff in the 32-bit
local header size fields (to enable zip64), 0 in the zip64 extra record
(due to streaming writes), and the actual sizes in the data descriptor.

Fixes: #109

Signed-off-by: Andrew Gunnerson <accounts+github@chiller3.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant