Potential bug in join when build has many duplicated keys #8796
Labels
affects-6.6
affects-7.0
affects-7.1
This bug affects the 7.1.x(LTS) versions.
affects-7.2
affects-7.3
affects-7.4
affects-7.5
This bug affects the 7.5.x(LTS) versions.
component/compute
severity/major
type/bug
The issue is confirmed as a bug.
Bug Report
Please answer these questions before submitting your issue. Thanks!
In join probe, if the build side has many duplicated keys, the intermediate result block can be large. To control the overall memory usage, one input block is probed multiple times, and each time only the rows in `[probe_process_info.start_row, probe_process_info.end_row)` are processed.
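The range-by-range probing described above can be sketched as follows. This is an illustration only, not TiFlash's implementation: `ProbeProcessInfo` here is a hypothetical stand-in that keeps just the two fields named in this issue, and `nextRange`, `probeRanges`, `matches_per_row` and `max_output_rows` are assumed names invented for the sketch.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the probe state; only start_row and end_row
// are taken from the issue text, everything else is assumed.
struct ProbeProcessInfo
{
    size_t start_row = 0;
    size_t end_row = 0;
};

// matches_per_row[i] is the number of build rows matching probe row i.
// Advances info to the next range [start_row, end_row) whose output stays
// within max_output_rows. At least one row is always taken so that the
// probe makes progress even when a single row exceeds the limit.
void nextRange(ProbeProcessInfo & info, const std::vector<size_t> & matches_per_row, size_t max_output_rows)
{
    info.start_row = info.end_row;
    size_t produced = 0;
    size_t row = info.start_row;
    while (row < matches_per_row.size())
    {
        if (row > info.start_row && produced + matches_per_row[row] > max_output_rows)
            break;
        produced += matches_per_row[row];
        ++row;
    }
    info.end_row = row;
}

// Returns every [start_row, end_row) range used to probe one input block.
std::vector<std::pair<size_t, size_t>> probeRanges(const std::vector<size_t> & matches_per_row, size_t max_output_rows)
{
    std::vector<std::pair<size_t, size_t>> ranges;
    ProbeProcessInfo info;
    while (info.end_row < matches_per_row.size())
    {
        nextRange(info, matches_per_row, max_output_rows);
        ranges.emplace_back(info.start_row, info.end_row);
    }
    return ranges;
}
```

With four probe rows that each match three build rows and a limit of six output rows, this sketch probes the block twice, over rows `[0, 2)` and then `[2, 4)`. Any per-row structure sized for the whole block would therefore be only partly valid in each pass.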
But some data structures used in join assume that the whole block is processed, for example:
https://github.com/pingcap/tiflash/blob/ef90ce478aff27a7499a1a55debac5957f326476/dbms/src/Interpreters/Join.cpp#L870
`anti_filter` and `offsets_to_replicate` in `handleOtherConditions` both assume that they contain all the data in the block. To accommodate this, before calling `handleOtherConditions`, the caller needs to shrink `anti_filter` and `offsets_to_replicate`. Currently, it uses `assign`. However, `shrink` actually should not use `memcpy`, since there is a chance that dst and src overlap, and calling `memcpy` with overlapping dst and src is undefined behavior.
1. Minimal reproduce step (Required)
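As an illustration only (this is not a TiFlash reproduction), the overlap hazard described above can be sketched with a plain `std::vector`. The helper name `shrinkToRange` and the use of `std::vector` in place of TiFlash's own array types are assumptions made for this sketch. It covers only the in-place copy, which is where source and destination share a buffer, and uses `std::memmove`, which is defined for overlapping regions.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical helper: keep only the elements in [start, end) of data and
// move them to the front of the same buffer.
// Source is [start, end) and destination is [0, end - start); the two
// regions overlap whenever end - start > start, so std::memcpy would be
// undefined behavior here. std::memmove handles the overlap correctly.
template <typename T>
void shrinkToRange(std::vector<T> & data, size_t start, size_t end)
{
    static_assert(std::is_trivially_copyable<T>::value, "byte-wise move requires a trivially copyable type");
    if (start > end || end > data.size())
        throw std::out_of_range("invalid [start, end) range");

    const size_t count = end - start;
    if (count > 0 && start > 0)
        std::memmove(data.data(), data.data() + start, count * sizeof(T));
    data.resize(count);
}
```

For example, shrinking an 8-element filter to the range `[2, 8)` copies six elements from offset 2 to offset 0, so four of the source elements lie inside the destination region. This is exactly the case where `memcpy` gives no guarantees.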
2. What did you expect to see? (Required)
3. What did you see instead (Required)
4. What is your TiFlash version? (Required)