forked from git/git
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bundle URIs Part 4: Advertise URIs from Git server #21
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This was referenced Jun 21, 2022
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
6 times, most recently
from
June 25, 2022 00:39
931e03b
to
c1b2b78
Compare
derrickstolee
changed the base branch from
bundle-redo/list
to
bundle-redo/advertise-base
June 27, 2022 13:10
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
3 times, most recently
from
June 28, 2022 20:01
bc86c7f
to
a1a3dff
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise-base
branch
from
July 25, 2022 13:37
341f079
to
ae3b7c3
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
from
July 25, 2022 13:37
a1a3dff
to
5d7b2c7
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise-base
branch
from
July 25, 2022 15:09
ae3b7c3
to
696bb92
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
from
July 25, 2022 15:09
5d7b2c7
to
5f6474b
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise-base
branch
from
July 25, 2022 15:44
696bb92
to
8ce6fd0
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
2 times, most recently
from
July 25, 2022 20:24
3751d07
to
881bdf3
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise-base
branch
from
August 1, 2022 14:02
089c3ce
to
f650e71
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
from
August 1, 2022 14:02
881bdf3
to
92068ef
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise-base
branch
from
August 4, 2022 15:41
f650e71
to
f034d21
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
from
August 4, 2022 15:41
92068ef
to
6598b7a
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
2 times, most recently
from
August 22, 2022 13:45
7b7f61e
to
7ed4c6a
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise-base
branch
from
October 31, 2022 11:46
5729ff2
to
c03801e
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
5 times, most recently
from
October 31, 2022 19:57
835c4c9
to
1c034bb
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
2 times, most recently
from
November 16, 2022 15:59
d78989d
to
b62b4b1
Compare
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
4 times, most recently
from
December 5, 2022 15:57
01c00af
to
7e18191
Compare
Add a skeleton server-side implementation of a new "bundle-uri" command to protocol v2. This will allow conforming clients to optionally seed their initial clones or incremental fetches from URLs containing "*.bundle" files created with "git bundle create". This change only performs the basic boilerplate of advertising a new protocol v2 capability. The new 'bundle-uri' capability allows a client to request a list of bundles. Right now, the server only returns a flush packet, which corresponds to an empty advertisement. The bundle.* config namespace describes which key-value pairs will be communicated across this interface in future updates. The critical bit right now is that the new boolean uploadPack.adverstiseBundleURIs config value signals whether or not this capability should be advertised at all. An earlier version of this patch [1] used a different transfer format than the "key=value" pairs in the current implementation. The change was made to unify the protocol v2 command with the bundle lists provided by independent bundle servers. Further, the standard allows for the server to advertise a URI that contains a bundle list. This allows users automatically discovering bundle providers that are loosely associated with the origin server, but without the origin server knowing exactly which bundles are currently available. [1] https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/ The very-deep headings needed to be modified to stop at level 4 due to documentation build issues. These were not recognized in earlier builds since the file was previously in the Documentation/technical/ directory and was built in a different way. With its current location, the heavily-nested details were causing build issues and they are now replaced with a bulletted list of details. Co-authored-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
The previous change allowed for a Git server to advertise the 'bundle-uri' command as a capability based on the uploadPack.advertiseBundleURIs config option. Create a set of tests that check that this capability is advertised using 'git ls-remote'. In order to test this functionality across three protocols (file, git, and http), create lib-bundle-uri-protocol.sh to generalize the tests, allowing the other test scripts to set an environment variable and otherwise inherit the setup and tests from this script. The tests currently only test that the 'bundle-uri' command is advertised or not. Other actions will be tested as the Git client learns to request the 'bundle-uri' command and parse its response. To help with URI escaping, specifically for file paths with a space in them, extract a 'sed' invocation from t9199-git-svn-info.sh into a helper function for use here, too. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Set up all the needed client parts of the 'bundle-uri' protocol v2 command, without actually doing anything with the bundle URIs. If the server says it supports 'bundle-uri' teach Git to issue the 'bundle-uri' command after the 'ls-refs' during 'git clone'. The returned key=value pairs are passed to the bundle list code which is tested using a different ingest mechanism in t5750-bundle-uri-parse.sh. At this point, Git does nothing with that bundle list. It will not download any of the bundles. That will come in a later change after these protocol bits are finalized. The no-op client is initially used only by 'git clone' to test the basic functionality, and eventually will bootstrap the initial download of Git objects during a fresh clone. The bundle URI client will not be integrated into other fetches until a mechanism is created to select a subset of bundles for download. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
The yet-to-be introduced client support for bundle-uri will always fall back on a full clone, but we'd still like to be able to ignore a server's bundle-uri advertisement entirely. The new transfer.bundleURI config option defaults to 'false', but a user can set it to 'true' to enable checking for bundle URIs from the origin Git server using protocol v2. Co-authored-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
The 'got_remote_heads' member of 'struct git_transport_data' was used historically to indicate that the initial server connection was made and the ref advertisement was returned. With protocol v2, that initial handshake does not necessarily include the ref advertisement, so this member is not an accurate name. Thankfully, all uses of the member are only checking to see if the handshake should take place, not whether or not some local data has the ref advertisement. Rename the member to 'finished_handshake' to represent the proper state. Note that the variable is only set to 1 during the handshake() method. Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper for issuing protocol v2 "bundle-uri" commands to a server, and to the parsing routines in bundle-uri.c. In the "git clone" case we'll have already done the handshake(), but not here. Add an extra case to check for this handshake in get_bundle_uri() for ease of use for future callers. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Implement the "bundle-uri" protocol v2 capability by populating the key=value packet lines from the local Git config. The list of bundles is provided from the keys beginning with "bundle.". In the future, we may want to filter this list to be more specific to the exact known keys that the server intends to share, but for flexibility at the moment we will assume that the config values are well-formed. Signed-off-by: Derrick Stolee <derrickstolee@github.com>
The strbuf_parent_directory() method was added as a static method in contrib/scalar by d0feac4 (scalar: 'register' sets recommended config and starts maintenance, 2021-12-03) and then removed in 65f6a9e (scalar: constrain enlistment search, 2022-08-18), but now there is a need for a similar method in the bundle URI feature. Re-add the method, this time in strbuf.c, but with a new name: strbuf_strip_file_from_path(). The method requirements are slightly modified to allow a trailing slash, in which case nothing is done, which makes the name change valuable. Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Bundle providers may want to distribute that data across multiple CDNs. This might require a change in the base URI, all the way to the domain name. If all bundles require an absolute URI in their 'uri' value, then every push to a CDN would require altering the table of contents to match the expected domain and exact location within it. Allow a bundle list to specify a relative URI for the bundles. This URI is based on where the client received the bundle list. For a list provided in the 'bundle-uri' protocol v2 command, the Git remote URI is the base URI. Otherwise, the bundle list was provided from an HTTP URI not using the Git protocol, and that URI is the base URI. This allows easier distribution of bundle data. Signed-off-by: Derrick Stolee <derrickstolee@github.com>
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
from
December 7, 2022 15:53
7e18191
to
982e239
Compare
The logic in fetch_bundle_uri() is useful for the --bundle-uri option of 'git clone', but is not helpful when the clone operation discovers a list of URIs from the bundle-uri protocol v2 command. To actually download and unbundle the advertised bundles, we need a different mechanism. Create the new fetch_bundle_list() method which is very similar to fetch_bundle_uri() except that it relies on download_bundle_list() instead of fetch_bundle_uri_internal(). The download_bundle_list() method will recursively call fetch_bundle_uri_internal() if any of the advertised URIs serve a bundle list instead of a bundle. This will also follow the bundle.list.mode setting from the input list: "any" will download only one such URI while "all" will download data from all of the URIs. In an identical way to fetch_bundle_uri(), the bundles are unbundled after all of the bundle lists have been expanded and all necessary URIs. Signed-off-by: Derrick Stolee <derrickstolee@github.com>
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
from
December 20, 2022 14:19
982e239
to
9ed06f5
Compare
A previous change introduced the transport methods to acquire a bundle list from the 'bundle-uri' protocol v2 command, when advertised _and_ when the client has chosen to enable the feature. Teach Git to download and unbundle the data advertised by those bundles during 'git clone'. This takes place between the ref advertisement and the object data download, and stateful connections will linger while the client downloads bundles. In the future, we should consider closing the remote connection during this process. Also, since the --bundle-uri option exists, we do not want to mix the advertised bundles with the user-specified bundles. Signed-off-by: Derrick Stolee <derrickstolee@github.com>
derrickstolee
force-pushed
the
bundle-redo/advertise
branch
from
December 21, 2022 16:16
9ed06f5
to
46ab2b0
Compare
derrickstolee
pushed a commit
that referenced
this pull request
Jan 17, 2023
It is possible to trigger an integer overflow when parsing attribute names when there are more than 2^31 of them for a single pattern. This can either lead to us dying due to trying to request too many bytes: blob=$(perl -e 'print "f" . " a=" x 2147483649' | git hash-object -w --stdin) git update-index --add --cacheinfo 100644,$blob,.gitattributes git attr-check --all file ================================================================= ==1022==ERROR: AddressSanitizer: requested allocation size 0xfffffff800000032 (0xfffffff800001038 after adjustments for alignment, red zones etc.) exceeds maximum supported size of 0x10000000000 (thread T0) #0 0x7fd3efabf411 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77 #1 0x5563a0a1e3d3 in xcalloc wrapper.c:150 #2 0x5563a058d005 in parse_attr_line attr.c:384 #3 0x5563a058e661 in handle_attr_line attr.c:660 #4 0x5563a058eddb in read_attr_from_index attr.c:769 #5 0x5563a058ef12 in read_attr attr.c:797 #6 0x5563a058f24c in bootstrap_attr_stack attr.c:867 #7 0x5563a058f4a3 in prepare_attr_stack attr.c:902 #8 0x5563a05905da in collect_some_attrs attr.c:1097 #9 0x5563a059093d in git_all_attrs attr.c:1128 #10 0x5563a02f636e in check_attr builtin/check-attr.c:67 #11 0x5563a02f6c12 in cmd_check_attr builtin/check-attr.c:183 #12 0x5563a02aa993 in run_builtin git.c:466 #13 0x5563a02ab397 in handle_builtin git.c:721 #14 0x5563a02abb2b in run_argv git.c:788 #15 0x5563a02ac991 in cmd_main git.c:926 #16 0x5563a05432bd in main common-main.c:57 #17 0x7fd3ef82228f (/usr/lib/libc.so.6+0x2328f) ==1022==HINT: if you don't care about these errors you may set allocator_may_return_null=1 SUMMARY: AddressSanitizer: allocation-size-too-big /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77 in __interceptor_calloc ==1022==ABORTING Or, much worse, it can lead to an out-of-bounds write because we underallocate and then memcpy(3P) into an array: perl -e ' print "A " . "\rh="x2000000000; print "\rh="x2000000000; print "\rh="x294967294 . "\n" ' >.gitattributes git add .gitattributes git commit -am "evil attributes" $ git clone --quiet /path/to/repo ================================================================= ==15062==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000002550 at pc 0x5555559884d5 bp 0x7fffffffbc60 sp 0x7fffffffbc58 WRITE of size 8 at 0x602000002550 thread T0 #0 0x5555559884d4 in parse_attr_line attr.c:393 #1 0x5555559884d4 in handle_attr_line attr.c:660 #2 0x555555988902 in read_attr_from_index attr.c:784 #3 0x555555988902 in read_attr_from_index attr.c:747 #4 0x555555988a1d in read_attr attr.c:800 #5 0x555555989b0c in bootstrap_attr_stack attr.c:882 #6 0x555555989b0c in prepare_attr_stack attr.c:917 #7 0x555555989b0c in collect_some_attrs attr.c:1112 #8 0x55555598b141 in git_check_attr attr.c:1126 #9 0x555555a13004 in convert_attrs convert.c:1311 #10 0x555555a95e04 in checkout_entry_ca entry.c:553 #11 0x555555d58bf6 in checkout_entry entry.h:42 #12 0x555555d58bf6 in check_updates unpack-trees.c:480 #13 0x555555d5eb55 in unpack_trees unpack-trees.c:2040 #14 0x555555785ab7 in checkout builtin/clone.c:724 #15 0x555555785ab7 in cmd_clone builtin/clone.c:1384 #16 0x55555572443c in run_builtin git.c:466 #17 0x55555572443c in handle_builtin git.c:721 #18 0x555555727872 in run_argv git.c:788 #19 0x555555727872 in cmd_main git.c:926 #20 0x555555721fa0 in main common-main.c:57 #21 0x7ffff73f1d09 in __libc_start_main ../csu/libc-start.c:308 #22 0x555555723f39 in _start (git+0x1cff39) 0x602000002552 is located 0 bytes to the right of 2-byte region [0x602000002550,0x602000002552) allocated by thread T0 here: #0 0x7ffff768c037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x555555d7fff7 in xcalloc wrapper.c:150 #2 0x55555598815f in parse_attr_line attr.c:384 #3 0x55555598815f in handle_attr_line attr.c:660 #4 0x555555988902 in read_attr_from_index attr.c:784 #5 0x555555988902 in read_attr_from_index attr.c:747 #6 0x555555988a1d in read_attr attr.c:800 #7 0x555555989b0c in bootstrap_attr_stack attr.c:882 #8 0x555555989b0c in prepare_attr_stack attr.c:917 #9 0x555555989b0c in collect_some_attrs attr.c:1112 #10 0x55555598b141 in git_check_attr attr.c:1126 #11 0x555555a13004 in convert_attrs convert.c:1311 #12 0x555555a95e04 in checkout_entry_ca entry.c:553 #13 0x555555d58bf6 in checkout_entry entry.h:42 #14 0x555555d58bf6 in check_updates unpack-trees.c:480 #15 0x555555d5eb55 in unpack_trees unpack-trees.c:2040 #16 0x555555785ab7 in checkout builtin/clone.c:724 #17 0x555555785ab7 in cmd_clone builtin/clone.c:1384 #18 0x55555572443c in run_builtin git.c:466 #19 0x55555572443c in handle_builtin git.c:721 #20 0x555555727872 in run_argv git.c:788 #21 0x555555727872 in cmd_main git.c:926 #22 0x555555721fa0 in main common-main.c:57 #23 0x7ffff73f1d09 in __libc_start_main ../csu/libc-start.c:308 SUMMARY: AddressSanitizer: heap-buffer-overflow attr.c:393 in parse_attr_line Shadow bytes around the buggy address: 0x0c047fff8450: fa fa 00 02 fa fa 00 07 fa fa fd fd fa fa 00 00 0x0c047fff8460: fa fa 02 fa fa fa fd fd fa fa 00 06 fa fa 05 fa 0x0c047fff8470: fa fa fd fd fa fa 00 02 fa fa 06 fa fa fa 05 fa 0x0c047fff8480: fa fa 07 fa fa fa fd fd fa fa 00 01 fa fa 00 02 0x0c047fff8490: fa fa 00 03 fa fa 00 fa fa fa 00 01 fa fa 00 03 =>0x0c047fff84a0: fa fa 00 01 fa fa 00 02 fa fa[02]fa fa fa fa fa 0x0c047fff84b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c047fff84c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c047fff84d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c047fff84e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c047fff84f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==15062==ABORTING Fix this bug by using `size_t` instead to count the number of attributes so that this value cannot reasonably overflow without running out of memory before already. Reported-by: Markus Vervier <markus.vervier@x41-dsec.de> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
Jan 17, 2023
When using a padding specifier in the pretty format passed to git-log(1) we need to calculate the string length in several places. These string lengths are stored in `int`s though, which means that these can easily overflow when the input lengths exceeds 2GB. This can ultimately lead to an out-of-bounds write when these are used in a call to memcpy(3P): ==8340==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1ec62f97fe at pc 0x7f2127e5f427 bp 0x7ffd3bd63de0 sp 0x7ffd3bd63588 WRITE of size 1 at 0x7f1ec62f97fe thread T0 #0 0x7f2127e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 #1 0x5628e96aa605 in format_and_pad_commit pretty.c:1762 #2 0x5628e96aa7f4 in format_commit_item pretty.c:1801 #3 0x5628e97cdb24 in strbuf_expand strbuf.c:429 #4 0x5628e96ab060 in repo_format_commit_message pretty.c:1869 #5 0x5628e96acd0f in pretty_print_commit pretty.c:2161 #6 0x5628e95a44c8 in show_log log-tree.c:781 #7 0x5628e95a76ba in log_tree_commit log-tree.c:1117 #8 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508 #9 0x5628e922c35b in cmd_log_walk builtin/log.c:549 #10 0x5628e922f1a2 in cmd_log builtin/log.c:883 #11 0x5628e9106993 in run_builtin git.c:466 #12 0x5628e9107397 in handle_builtin git.c:721 #13 0x5628e9107b07 in run_argv git.c:788 #14 0x5628e91088a7 in cmd_main git.c:923 #15 0x5628e939d682 in main common-main.c:57 #16 0x7f2127c3c28f (/usr/lib/libc.so.6+0x2328f) #17 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349) #18 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115 0x7f1ec62f97fe is located 2 bytes to the left of 4831838265-byte region [0x7f1ec62f9800,0x7f1fe62f9839) allocated by thread T0 here: #0 0x7f2127ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85 #1 0x5628e98774d4 in xrealloc wrapper.c:136 #2 0x5628e97cb01c in strbuf_grow strbuf.c:99 #3 0x5628e97ccd42 in strbuf_addchars strbuf.c:327 #4 0x5628e96aa55c in format_and_pad_commit pretty.c:1761 #5 0x5628e96aa7f4 in format_commit_item pretty.c:1801 #6 0x5628e97cdb24 in strbuf_expand strbuf.c:429 #7 0x5628e96ab060 in repo_format_commit_message pretty.c:1869 #8 0x5628e96acd0f in pretty_print_commit pretty.c:2161 #9 0x5628e95a44c8 in show_log log-tree.c:781 #10 0x5628e95a76ba in log_tree_commit log-tree.c:1117 #11 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508 #12 0x5628e922c35b in cmd_log_walk builtin/log.c:549 #13 0x5628e922f1a2 in cmd_log builtin/log.c:883 #14 0x5628e9106993 in run_builtin git.c:466 #15 0x5628e9107397 in handle_builtin git.c:721 #16 0x5628e9107b07 in run_argv git.c:788 #17 0x5628e91088a7 in cmd_main git.c:923 #18 0x5628e939d682 in main common-main.c:57 #19 0x7f2127c3c28f (/usr/lib/libc.so.6+0x2328f) #20 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349) #21 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115 SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy Shadow bytes around the buggy address: 0x0fe458c572a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0fe458c572b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0fe458c572c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0fe458c572d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0fe458c572e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa =>0x0fe458c572f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa[fa] 0x0fe458c57300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0fe458c57310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0fe458c57320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0fe458c57330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0fe458c57340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb ==8340==ABORTING The pretty format can also be used in `git archive` operations via the `export-subst` attribute. So this is what in our opinion makes this a critical issue in the context of Git forges which allow to download an archive of user supplied Git repositories. Fix this vulnerability by using `size_t` instead of `int` to track the string lengths. Add tests which detect this vulnerability when Git is compiled with the address sanitizer. Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com> Original-patch-by: Joern Schneeweisz <jschneeweisz@gitlab.com> Modified-by: Taylor Blau <me@ttalorr.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
derrickstolee
pushed a commit
that referenced
this pull request
Sep 5, 2024
It was recently reported that concurrent reads and writes may cause the reftable backend to segfault. The root cause of this is that we do not properly keep track of reftable readers across reloads. Suppose that you have a reftable iterator and then decide to reload the stack while iterating through the iterator. When the stack has been rewritten since we have created the iterator, then we would end up discarding a subset of readers that may still be in use by the iterator. The consequence is that we now try to reference deallocated memory, which of course segfaults. One way to trigger this is in t5616, where some background maintenance jobs have been leaking from one test into another. This leads to stack traces like the following one: + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin AddressSanitizer:DEADLYSIGNAL ================================================================= ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp 0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0) ==657994==The signal is caused by a READ memory access. #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29 #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170 #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194 #3 0x55f23e54e72e in block_iter_next reftable/block.c:398 #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240 #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355 #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339 #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69 #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123 #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172 #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175 #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464 #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13 #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452 #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623 #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659 #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133 #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432 #18 0x55f23dba7764 in run_builtin git.c:484 #19 0x55f23dba7764 in handle_builtin git.c:741 #20 0x55f23dbab61e in run_argv git.c:805 #21 0x55f23dbab61e in cmd_main git.c:1000 #22 0x55f23dba4781 in main common-main.c:64 #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360 #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27) While it is somewhat awkward that the maintenance processes survive tests in the first place, it is totally expected that reftables should work alright with concurrent writers. Seemingly they don't. The only underlying resource that we need to care about in this context is the reftable reader, which is responsible for reading a single table from disk. These readers get discarded immediately (unless reused) when calling `reftable_stack_reload()`, which is wrong. We can only close them once we know that there are no iterators using them anymore. Prepare for a fix by converting the reftable readers to be refcounted. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Preceded by:
git fetch --bundle-uri
#19git clone --bundle-uri
#18Followed by: