Vectorize text equality and LIKE #6189

Merged Mar 28, 2024 (342 commits)
Changes from 250 commits
b3318b1
reference
akuzm Dec 12, 2023
768daed
fixup
akuzm Dec 12, 2023
7618723
fix
akuzm Dec 12, 2023
71fc792
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 12, 2023
2c43267
remove unused variable
akuzm Dec 12, 2023
2268052
cleanup
akuzm Dec 12, 2023
1f7460a
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 12, 2023
1bbc455
format
akuzm Dec 12, 2023
41fb360
tojson
akuzm Dec 13, 2023
e3e6a1b
cleanups
akuzm Dec 13, 2023
20d8561
fix?
akuzm Dec 13, 2023
44de281
fix
akuzm Dec 13, 2023
e091b35
fix
akuzm Dec 13, 2023
2422e82
this is so tiresome
akuzm Dec 13, 2023
d5df530
directory
akuzm Dec 13, 2023
2d7a60c
path
akuzm Dec 13, 2023
7b0d878
switch
akuzm Dec 13, 2023
f996f80
split out to files
akuzm Dec 13, 2023
df52a4d
fix
akuzm Dec 13, 2023
5d41308
headers
akuzm Dec 13, 2023
af61eaa
headers?
akuzm Dec 13, 2023
5281c9c
headers...
akuzm Dec 13, 2023
2cafdff
ts format
akuzm Dec 13, 2023
b1e5dea
cleanup
akuzm Dec 13, 2023
79db818
cleanup
akuzm Dec 14, 2023
4bdd452
yaml
akuzm Dec 14, 2023
610c6c2
dash
akuzm Dec 14, 2023
67e9fe1
job fixes
akuzm Dec 14, 2023
196dd9f
fix
akuzm Dec 14, 2023
219f5e0
cleanup
akuzm Dec 14, 2023
6fb2e5a
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 14, 2023
99b1dc6
fix the saved cache key
akuzm Dec 14, 2023
5203392
add another path
akuzm Dec 14, 2023
4251e45
Merge akuzm/bulk-type into tmp (using imerge)
akuzm Dec 14, 2023
802a2ed
Merge remote-tracking branch 'origin/main' into tmp
akuzm Dec 14, 2023
3f0103e
tests for nulls
akuzm Dec 14, 2023
0be0a8a
format
akuzm Dec 14, 2023
b8aa1d7
harmonize the varlena check with rowbyrow
akuzm Dec 14, 2023
cb383a0
test case
akuzm Dec 14, 2023
8ba3add
fixes
akuzm Dec 14, 2023
f560295
test the cache (2023-12-14 #2)
akuzm Dec 14, 2023
59a5a6c
try to get the prefix match back
akuzm Dec 14, 2023
53ac9fc
key is required
akuzm Dec 14, 2023
f961cbb
I hate github actions so much
akuzm Dec 14, 2023
cb2f03e
dollar
akuzm Dec 14, 2023
489827c
test the cache (2023-12-14 no. 3)
akuzm Dec 14, 2023
9546a3c
dollar
akuzm Dec 14, 2023
bc450d7
test the cache (2023-12-14 no. 4)
akuzm Dec 14, 2023
971d66c
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 15, 2023
6cfd187
Merge commit '971d66c50' into HEAD
akuzm Dec 15, 2023
3036bae
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 15, 2023
0c6a689
test detoasting in decompression as well
akuzm Dec 15, 2023
b142652
do more things in row_decompressor_close
akuzm Dec 15, 2023
ad57d6f
Merge remote-tracking branch 'akuzm/fuzzing' into HEAD
akuzm Dec 15, 2023
7410ec4
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 15, 2023
11897bb
use the old cache key
akuzm Dec 15, 2023
4dc3c28
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 15, 2023
e8d5940
use "name" as column type w/o bulk decompression
akuzm Dec 15, 2023
93b39a9
tmp
akuzm Dec 15, 2023
93f2aaf
micro-optimizations?
akuzm Dec 15, 2023
cc0c59a
Merge remote-tracking branch 'akuzm/detoaster' into HEAD
akuzm Dec 15, 2023
39ff133
double free
akuzm Dec 15, 2023
51626f7
use after free 2
akuzm Dec 15, 2023
3c94a67
even more straightforward data layout
akuzm Dec 15, 2023
5de7e1e
make arrow_row index a parameter
akuzm Dec 15, 2023
5d2feef
double free
akuzm Dec 15, 2023
f6386ec
use after free 2
akuzm Dec 15, 2023
07f9729
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 15, 2023
7244a99
benchmark bulk text (2023-12-15 #5)
akuzm Dec 15, 2023
f1223d6
fix
akuzm Dec 15, 2023
dace562
benchmark bulk text (2023-12-15 #6)
akuzm Dec 15, 2023
3985663
benchmark detoaster (2023-12-15 no. 8)
akuzm Dec 15, 2023
e6b87d7
cleanup
akuzm Dec 15, 2023
0e21970
fix
akuzm Dec 15, 2023
46ccad2
micro-optimizations
akuzm Dec 18, 2023
0f91ded
fix
akuzm Dec 18, 2023
7aa5c79
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 19, 2023
4ed6123
show the statistics about corpus
akuzm Dec 19, 2023
a15d6f4
Merge remote-tracking branch 'akuzm/detoaster' into HEAD
akuzm Dec 19, 2023
87d70eb
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 19, 2023
0ad7981
typo
akuzm Dec 19, 2023
5e3e157
fix
akuzm Dec 19, 2023
c651eb7
proper error code
akuzm Dec 19, 2023
993698d
fixes
akuzm Dec 19, 2023
a79dd28
Merge remote-tracking branch 'akuzm/fuzzing' into HEAD
akuzm Dec 19, 2023
10311c4
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 19, 2023
fab16ca
finish refactoring of column values
akuzm Dec 19, 2023
70b19e9
swap branches for less diff with main
akuzm Dec 19, 2023
787efb7
test the cache (2023-12-19 no. 9)
akuzm Dec 19, 2023
1d7714e
maybe it works w/o restore-keys?
akuzm Dec 19, 2023
5569d85
no clobber
akuzm Dec 19, 2023
dbfd14f
proper size and checks
akuzm Dec 19, 2023
06be0ed
Merge remote-tracking branch 'akuzm/fuzzing' into HEAD
akuzm Dec 19, 2023
72e2cd5
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 19, 2023
4755975
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 19, 2023
a69b25d
more fixes from the bulk text PR
akuzm Dec 19, 2023
d886913
directory woes
akuzm Dec 19, 2023
39e07b0
rm
akuzm Dec 19, 2023
b9169a2
Merge remote-tracking branch 'akuzm/fuzzing' into HEAD
akuzm Dec 19, 2023
00c9fdc
try release build
akuzm Dec 19, 2023
e42618e
cleanup
akuzm Dec 19, 2023
c9b50eb
Display the welcome message as NOTICE
akuzm Dec 19, 2023
3f11578
save interesting cases
akuzm Dec 19, 2023
b1e18c4
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 19, 2023
0270cfa
forgotten switch
akuzm Dec 19, 2023
a399ee8
fixes
akuzm Dec 19, 2023
9a37838
add more interesting files
akuzm Dec 19, 2023
4e29ddf
more iterations for dictionary
akuzm Dec 19, 2023
a13be8a
Merge remote-tracking branch 'akuzm/fuzzing' into HEAD
akuzm Dec 19, 2023
9a07952
more iterations
akuzm Dec 19, 2023
5b06b36
benchmark decompression cleanup (2023-12-19 no. 11)
akuzm Dec 19, 2023
7ce1a55
fix
akuzm Dec 20, 2023
275a466
Merge remote-tracking branch 'akuzm/fuzzing' into HEAD
akuzm Dec 20, 2023
63df106
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 20, 2023
5f71183
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Dec 21, 2023
b3ed50b
Bulk decompression of text columns
akuzm Dec 21, 2023
deee56c
Vectorize NullTest
akuzm Dec 21, 2023
443fa1f
ref
akuzm Dec 21, 2023
d9a8106
tests
akuzm Dec 21, 2023
a002fb2
reference ordered_append-* ordered_append_join-*
akuzm Jan 3, 2024
23385e3
reference ordered_append-* ordered_append_join-*
akuzm Jan 3, 2024
eea90ca
reference ordered_append-* ordered_append_join-*
akuzm Jan 3, 2024
b5709d8
avoid warnings
akuzm Jan 3, 2024
e1f6e54
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 3, 2024
ab5af98
cleanup
akuzm Jan 3, 2024
8e4a6ce
Vectorized boolean operators
akuzm Jan 3, 2024
2bab5fb
Merge remote-tracking branch 'akuzm/vector-nulltest' into HEAD
akuzm Jan 3, 2024
fdffe18
coverage for not
akuzm Jan 3, 2024
a6e7bf2
not
akuzm Jan 4, 2024
195a865
benchmark boolexpr (2024-01-04 no. 2)
akuzm Jan 4, 2024
50c32f0
early exit
akuzm Jan 4, 2024
c8cb25f
benchmark boolexpr (2024-01-04 no. 1)
akuzm Jan 4, 2024
41270fe
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 5, 2024
3dc7513
Merge commit 'c8cb25ffe' into HEAD
akuzm Jan 5, 2024
f24cd20
cleanup and more tests
akuzm Jan 5, 2024
911a883
fix for default values
akuzm Jan 5, 2024
dfe929e
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 5, 2024
95c3240
benchmark boolexpr (2024-01-05 no. 2)
akuzm Jan 5, 2024
9c38c1d
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 5, 2024
11fb3b4
cleanup
akuzm Jan 5, 2024
5a8103a
Merge remote-tracking branch 'akuzm/vector-nulltest' into HEAD
akuzm Jan 5, 2024
f466867
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 9, 2024
1ddc365
cleanup
akuzm Jan 9, 2024
64ad94c
Merge remote-tracking branch 'akuzm/vector-nulltest' into HEAD
akuzm Jan 9, 2024
8098485
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 9, 2024
46cee6b
cleanup
akuzm Jan 9, 2024
0c311b4
cleanup
akuzm Jan 9, 2024
50ff803
Merge akuzm/decompression-only into tmp (using imerge)
akuzm Jan 9, 2024
1cef484
cleanup
akuzm Jan 9, 2024
08f0228
fixes after merge
akuzm Jan 9, 2024
1889f9d
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 9, 2024
0609b63
Merge remote-tracking branch 'akuzm/decompression-only' into HEAD
akuzm Jan 9, 2024
9b64157
Merge remote-tracking branch 'akuzm/boolexpr' into HEAD
akuzm Jan 9, 2024
929eb99
benchmark text + boolexpr (2024-01-09 no. 3)
akuzm Jan 9, 2024
8af04d5
default value in text column
akuzm Jan 9, 2024
cdf484e
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 16, 2024
1ae2791
review fixes
akuzm Jan 16, 2024
2dd9269
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 19, 2024
8bccc65
like
akuzm Jan 19, 2024
48db1a5
fixup
akuzm Jan 19, 2024
922a92b
fixes
akuzm Jan 19, 2024
f692cf0
benchmark like (2024-01-19 no. 6)
akuzm Jan 19, 2024
a905689
fix
akuzm Jan 19, 2024
11acfe7
Merge remote-tracking branch 'akuzm/decompression-only' into HEAD
akuzm Jan 19, 2024
72b6bce
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 19, 2024
571bb8c
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 19, 2024
be1d96a
simpler code for by-reference fixed-width datums
akuzm Jan 19, 2024
b170e41
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 19, 2024
cd9ca92
clang-tidy
akuzm Jan 19, 2024
629b08f
tests for utf8
akuzm Jan 19, 2024
4125e38
Merge remote-tracking branch 'akuzm/decompression-only' into HEAD
akuzm Jan 25, 2024
7b7822f
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 25, 2024
b400b78
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 29, 2024
3ea19e2
more aggressive early exit
akuzm Jan 29, 2024
1e491c7
fixup
akuzm Jan 29, 2024
49df521
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 29, 2024
979d962
Merge remote-tracking branch 'akuzm/boolexpr' into HEAD
akuzm Jan 29, 2024
a6705c9
Merge remote-tracking branch 'akuzm/bulk-text' into HEAD
akuzm Jan 29, 2024
2545efa
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 30, 2024
f9bda72
cmake woes -- trying to put compression tests into libdir
akuzm Jan 30, 2024
cb312c6
Move files to test lib
akuzm Jan 30, 2024
f770fd3
cleanup
akuzm Jan 30, 2024
1150f50
benchmark bulk text (2024-01-30 no. 1)
akuzm Jan 30, 2024
a8ab077
Merge commit '1150f50fbf9ea6bfdf083901bdfcae3c2a3de88c' into HEAD
akuzm Jan 31, 2024
8cd61e4
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 31, 2024
a71838e
fixes after merge
akuzm Jan 31, 2024
ab20509
some cleanup
akuzm Jan 31, 2024
14994f1
Fix UBSan failure in bulk text decompression
akuzm Jan 31, 2024
1eb9d41
some tweaks
akuzm Jan 31, 2024
017a34e
actions
akuzm Jan 31, 2024
06be871
fix
akuzm Jan 31, 2024
bec9f60
tidy
akuzm Jan 31, 2024
7f23890
better coverage for like
akuzm Jan 31, 2024
498c0de
clang 16
akuzm Jan 31, 2024
645a19b
recompress chunks
akuzm Jan 31, 2024
ff9dedb
test for hashed saop
akuzm Jan 31, 2024
6272635
clang 15
akuzm Jan 31, 2024
26d1d18
fixes + more coverage
akuzm Jan 31, 2024
3db423f
license
akuzm Jan 31, 2024
db2a898
support text inequality
akuzm Jan 31, 2024
8ba5192
tests
akuzm Jan 31, 2024
e052975
does it reproduce?
akuzm Jan 31, 2024
70afac3
more coverage
akuzm Jan 31, 2024
125e184
old config
akuzm Jan 31, 2024
2a49cb9
log path
akuzm Jan 31, 2024
e7ec508
old os
akuzm Jan 31, 2024
6c4fd1c
upload more sanitizer logs to db
akuzm Jan 31, 2024
10b8d29
simplify a function
akuzm Jan 31, 2024
a410955
show sanitizer logs
akuzm Jan 31, 2024
2cfae4e
Revert "does it reproduce?"
akuzm Jan 31, 2024
5366635
check alignment
akuzm Jan 31, 2024
870568f
add the accordion
akuzm Jan 31, 2024
5e392dc
benchmark text predicates (2024-01-31 no. 2)
akuzm Jan 31, 2024
a95f07f
everything is wrong
akuzm Jan 31, 2024
cfbdec9
test fixup
akuzm Jan 31, 2024
55a968a
add another test
akuzm Jan 31, 2024
38d7a7d
forgotten file
akuzm Jan 31, 2024
f10167d
rename the table
akuzm Jan 31, 2024
fa5e5ee
Merge remote-tracking branch 'akuzm/sanitizer' into HEAD
akuzm Jan 31, 2024
3ee047a
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Jan 31, 2024
74eccd2
update actions to node 20
akuzm Jan 31, 2024
e067b49
forgotten reference
akuzm Jan 31, 2024
0cf566e
Merge remote-tracking branch 'akuzm/sanitizer' into HEAD
akuzm Jan 31, 2024
b25f3a5
check liveness after checking the interesting cases
akuzm Jan 31, 2024
86ed6de
Add more interesting cases
akuzm Jan 31, 2024
ae22c1a
add a case
akuzm Feb 1, 2024
4c67734
more test cases
akuzm Feb 1, 2024
3692ee1
cleanup
akuzm Feb 1, 2024
8dfaf4b
more predictable choice of interesting cases
akuzm Feb 1, 2024
a2d9142
Apply suggestions from code review
akuzm Feb 1, 2024
ffdb3df
no need to zero
akuzm Feb 1, 2024
b2da0d5
Merge commit 'a2d9142c5' into HEAD
akuzm Feb 1, 2024
fcb9480
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Feb 1, 2024
b267824
cleanup
akuzm Feb 1, 2024
7626d63
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Feb 5, 2024
74dddef
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Mar 1, 2024
9cd735e
review fixes
akuzm Mar 1, 2024
311fadd
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Mar 8, 2024
e4d2e5d
move the recursion check later
akuzm Mar 8, 2024
0e50ba5
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Mar 18, 2024
8f58261
benchmark vectorized text (2024-03-18 no. 1)
akuzm Mar 18, 2024
15cfdaf
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Mar 25, 2024
ebde574
benchmark vectorized text (2024-03-25 no. 1)
akuzm Mar 25, 2024
43a2bc4
review fixes
akuzm Mar 27, 2024
82f9ab1
comment
akuzm Mar 27, 2024
7ea3ca4
Merge remote-tracking branch 'origin/main' into HEAD
akuzm Mar 27, 2024
3a0763f
comment
akuzm Mar 27, 2024
7ab250f
remove restrict from const objects
akuzm Mar 27, 2024
02918bf
replace restrict with const on read-only objects
akuzm Mar 27, 2024
c355c25
comment
akuzm Mar 28, 2024
1 change: 1 addition & 0 deletions tsl/src/CMakeLists.txt
@@ -51,4 +51,5 @@ install(TARGETS ${TSL_LIBRARY_NAME} DESTINATION ${PG_PKGLIBDIR})
 add_subdirectory(bgw_policy)
 add_subdirectory(compression)
 add_subdirectory(continuous_aggs)
+add_subdirectory(import)
 add_subdirectory(nodes)
6 changes: 3 additions & 3 deletions tsl/src/compression/array.c
@@ -507,10 +507,10 @@ text_array_decompress_all_serialized_no_header(StringInfo si, bool has_nulls,
 	CheckCompressedData(n_total >= n_notnull);

 	uint32 *offsets =
-		(uint32 *) MemoryContextAllocZero(dest_mctx,
-			pad_to_multiple(64, sizeof(*offsets) * (n_total + 1)));
+		(uint32 *) MemoryContextAlloc(dest_mctx,
+			pad_to_multiple(64, sizeof(*offsets) * (n_total + 1)));
 	uint8 *arrow_bodies =
-		(uint8 *) MemoryContextAllocZero(dest_mctx, pad_to_multiple(64, si->len - si->cursor));
+		(uint8 *) MemoryContextAlloc(dest_mctx, pad_to_multiple(64, si->len - si->cursor));

 	uint32 offset = 0;
 	for (int i = 0; i < n_notnull; i++)
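The hunk above swaps the zeroing allocator for the plain one while keeping the 64-byte padding of the allocation size (the commit list has a matching "no need to zero" entry). A minimal sketch of the idea, assuming the loop writes every offset before anything reads it; `pad_to_multiple_sketch` and `build_offsets_sketch` are hypothetical stand-ins written for illustration, and `malloc` stands in for `MemoryContextAlloc`:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in: round value up to the next multiple of pad_to. */
static uint64_t
pad_to_multiple_sketch(uint64_t pad_to, uint64_t value)
{
	assert(pad_to > 0);
	return ((value + pad_to - 1) / pad_to) * pad_to;
}

/*
 * Build an (n_total + 1)-entry offsets array from element lengths.
 * The buffer is padded to 64 bytes but not zeroed: each of the
 * n_total + 1 entries is assigned below, so zero-filling them first
 * would be redundant work. The padding bytes stay uninitialized.
 */
static uint32_t *
build_offsets_sketch(const uint32_t *lengths, int n_total)
{
	size_t bytes =
		(size_t) pad_to_multiple_sketch(64, sizeof(uint32_t) * ((size_t) n_total + 1));
	uint32_t *offsets = malloc(bytes); /* assumption: plays the role of MemoryContextAlloc */
	if (offsets == NULL)
		return NULL;

	uint32_t offset = 0;
	for (int i = 0; i < n_total; i++)
	{
		offsets[i] = offset;
		offset += lengths[i];
	}
	offsets[n_total] = offset; /* end offset of the last element */
	return offsets;
}
```

Whether the padding bytes are ever read by downstream vectorized code is not visible in this hunk, so the sketch only covers the entries the loop fills in.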
2 changes: 2 additions & 0 deletions tsl/src/import/CMakeLists.txt
@@ -0,0 +1,2 @@
set(SOURCES "")
target_sources(${PROJECT_NAME} PRIVATE ${SOURCES})
208 changes: 208 additions & 0 deletions tsl/src/import/ts_like_match.c
@@ -0,0 +1,208 @@
/*
 * This file and its contents are licensed under the Timescale License.
 * Please see the included NOTICE for copyright information and
 * LICENSE-TIMESCALE for a copy of the license.
 */

/*
 * This file contains source code that was copied and/or modified from
 * the PostgreSQL database, which is licensed under the open-source
 * PostgreSQL License. Please see the NOTICE at the top level
 * directory for a copy of the PostgreSQL License.
 *
 * This is a copy of backend/utils/adt/like_match.c.
 */

/*--------------------
 * Match text and pattern, return LIKE_TRUE, LIKE_FALSE, or LIKE_ABORT.
 *
 * LIKE_TRUE: they match
 * LIKE_FALSE: they don't match
 * LIKE_ABORT: not only don't they match, but the text is too short.
 *
 * If LIKE_ABORT is returned, then no suffix of the text can match the
 * pattern either, so an upper-level % scan can stop scanning now.
 *--------------------
 */

#ifdef MATCH_LOWER
#define GETCHAR(t) MATCH_LOWER(t)
#else
#define GETCHAR(t) (t)
#endif

static int
MatchText(const char *t, int tlen, const char *p, int plen)
{
	/* Fast path for match-everything pattern */
	if (plen == 1 && *p == '%')
		return LIKE_TRUE;

	/* Since this function recurses, it could be driven to stack overflow */
	check_stack_depth();

	/*
	 * In this loop, we advance by char when matching wildcards (and thus on
	 * recursive entry to this function we are properly char-synced). On other
	 * occasions it is safe to advance by byte, as the text and pattern will
	 * be in lockstep. This allows us to perform all comparisons between the
	 * text and pattern on a byte by byte basis, even for multi-byte
	 * encodings.
	 */
	while (tlen > 0 && plen > 0)
	{
		if (*p == '\\')
		{
			/* Next pattern byte must match literally, whatever it is */
			NextByte(p, plen);
			/* ... and there had better be one, per SQL standard */
			if (plen <= 0)
				ereport(ERROR,
						(errcode(ERRCODE_INVALID_ESCAPE_SEQUENCE),
						 errmsg("LIKE pattern must not end with escape character")));
			if (GETCHAR(*p) != GETCHAR(*t))
				return LIKE_FALSE;
		}
		else if (*p == '%')
		{
			char firstpat;

			/*
			 * % processing is essentially a search for a text position at
			 * which the remainder of the text matches the remainder of the
			 * pattern, using a recursive call to check each potential match.
			 *
			 * If there are wildcards immediately following the %, we can skip
			 * over them first, using the idea that any sequence of N _'s and
			 * one or more %'s is equivalent to N _'s and one % (ie, it will
			 * match any sequence of at least N text characters). In this way
			 * we will always run the recursive search loop using a pattern
			 * fragment that begins with a literal character-to-match, thereby
			 * not recursing more than we have to.
			 */
			NextByte(p, plen);

			while (plen > 0)
			{
				if (*p == '%')
					NextByte(p, plen);
				else if (*p == '_')
				{
					/* If not enough text left to match the pattern, ABORT */
					if (tlen <= 0)
						return LIKE_ABORT;
					NextChar(t, tlen);
					NextByte(p, plen);
				}
				else
					break; /* Reached a non-wildcard pattern char */
			}

			/*
			 * If we're at end of pattern, match: we have a trailing % which
			 * matches any remaining text string.
			 */
			if (plen <= 0)
				return LIKE_TRUE;

			/*
			 * Otherwise, scan for a text position at which we can match the
			 * rest of the pattern. The first remaining pattern char is known
			 * to be a regular or escaped literal character, so we can compare
			 * the first pattern byte to each text byte to avoid recursing
			 * more than we have to. This fact also guarantees that we don't
			 * have to consider a match to the zero-length substring at the
			 * end of the text.
			 */
			if (*p == '\\')
			{
				if (plen < 2)
					ereport(ERROR,
							(errcode(ERRCODE_INVALID_ESCAPE_SEQUENCE),
							 errmsg("LIKE pattern must not end with escape character")));
				firstpat = GETCHAR(p[1]);
			}
			else
				firstpat = GETCHAR(*p);

			while (tlen > 0)
			{
				if (GETCHAR(*t) == firstpat)
				{
					int matched = MatchText(t, tlen, p, plen);

					if (matched != LIKE_FALSE)
						return matched; /* TRUE or ABORT */
				}

				NextChar(t, tlen);
			}

			/*
			 * End of text with no match, so no point in trying later places
			 * to start matching this pattern.
			 */
			return LIKE_ABORT;
		}
		else if (*p == '_')
		{
			/* _ matches any single character, and we know there is one */
			NextChar(t, tlen);
			NextByte(p, plen);
			continue;
		}
		else if (GETCHAR(*p) != GETCHAR(*t))
		{
			/* non-wildcard pattern char fails to match text char */
			return LIKE_FALSE;
		}

		/*
		 * Pattern and text match, so advance.
		 *
		 * It is safe to use NextByte instead of NextChar here, even for
		 * multi-byte character sets, because we are not following immediately
		 * after a wildcard character. If we are in the middle of a multibyte
		 * character, we must already have matched at least one byte of the
		 * character from both text and pattern; so we cannot get out-of-sync
		 * on character boundaries. And we know that no backend-legal
		 * encoding allows ASCII characters such as '%' to appear as non-first
		 * bytes of characters, so we won't mistakenly detect a new wildcard.
		 */
		NextByte(t, tlen);
		NextByte(p, plen);
	}

	if (tlen > 0)
		return LIKE_FALSE; /* end of pattern, but not of text */

	/*
	 * End of text, but perhaps not of pattern. Match iff the remaining
	 * pattern can match a zero-length string, ie, it's zero or more %'s.
	 */
	while (plen > 0 && *p == '%')
		NextByte(p, plen);
	if (plen <= 0)
		return LIKE_TRUE;

	/*
	 * End of text with no match, so no point in trying later places to start
	 * matching this pattern.
	 */
	return LIKE_ABORT;
} /* MatchText() */

#ifdef CHAREQ
#undef CHAREQ
#endif

#undef NextChar
#undef CopyAdvChar
#undef MatchText

#undef GETCHAR

#ifdef MATCH_LOWER
#undef MATCH_LOWER

#endif
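`ts_like_match.c` is a template: it expects the including file to define `NextByte`, `NextChar`, the `LIKE_*` result codes, and the name `MatchText` expands to, none of which appear in the file itself. The sketch below shows how the pieces might fit together in a standalone program. It makes several assumptions: the text is UTF-8, `NextChar` skips continuation bytes (bytes of the form `10xxxxxx`), the result codes take the values 1, 0 and -1, and backslash escapes, `MATCH_LOWER` case folding, and the stack depth check are left out. `sketch_like` and `like_sketch` are hypothetical names, not part of the PR.

```c
#include <assert.h>
#include <string.h>

/* Assumed values for the result codes named in the file above. */
enum { LIKE_TRUE = 1, LIKE_FALSE = 0, LIKE_ABORT = -1 };

/* Advance one byte. */
#define NextByte(p, plen) ((p)++, (plen)--)

/* Advance one UTF-8 character: one byte, then any continuation bytes. */
#define NextChar(p, plen) \
	do { (p)++; (plen)--; } while ((plen) > 0 && (*(p) & 0xC0) == 0x80)

/* Simplified matcher: handles %, _ and literal bytes only (no escapes). */
static int
sketch_like(const char *t, int tlen, const char *p, int plen)
{
	while (tlen > 0 && plen > 0)
	{
		if (*p == '%')
		{
			/* Collapse the wildcard run that follows the first %. */
			NextByte(p, plen);
			while (plen > 0)
			{
				if (*p == '%')
					NextByte(p, plen);
				else if (*p == '_')
				{
					if (tlen <= 0)
						return LIKE_ABORT;
					NextChar(t, tlen);
					NextByte(p, plen);
				}
				else
					break;
			}
			if (plen <= 0)
				return LIKE_TRUE; /* trailing % matches the rest */

			/* Recurse only where the first literal byte matches. */
			while (tlen > 0)
			{
				if (*t == *p)
				{
					int matched = sketch_like(t, tlen, p, plen);

					if (matched != LIKE_FALSE)
						return matched; /* TRUE or ABORT */
				}
				NextChar(t, tlen);
			}
			return LIKE_ABORT;
		}
		else if (*p == '_')
		{
			NextChar(t, tlen);
			NextByte(p, plen);
			continue;
		}
		else if (*p != *t)
			return LIKE_FALSE;

		NextByte(t, tlen);
		NextByte(p, plen);
	}

	if (tlen > 0)
		return LIKE_FALSE; /* end of pattern, but not of text */
	while (plen > 0 && *p == '%')
		NextByte(p, plen);
	return plen <= 0 ? LIKE_TRUE : LIKE_ABORT;
}

/* Convenience wrapper for NUL-terminated strings. */
static int
like_sketch(const char *text, const char *pattern)
{
	assert(text != NULL && pattern != NULL);
	return sketch_like(text, (int) strlen(text), pattern, (int) strlen(pattern));
}
```

With these definitions, `like_sketch("hello", "hel")` yields `LIKE_FALSE` (the pattern ends before the text), while `like_sketch("he", "hello")` yields `LIKE_ABORT` (the text ends before the pattern, so no suffix of it can match either). The `_` wildcard consumes a whole multi-byte character because it advances the text with `NextChar` rather than `NextByte`.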
1 change: 1 addition & 0 deletions tsl/src/nodes/decompress_chunk/CMakeLists.txt
@@ -8,6 +8,7 @@ set(SOURCES
   ${CMAKE_CURRENT_SOURCE_DIR}/detoaster.c
   ${CMAKE_CURRENT_SOURCE_DIR}/exec.c
   ${CMAKE_CURRENT_SOURCE_DIR}/planner.c
+  ${CMAKE_CURRENT_SOURCE_DIR}/pred_text.c
   ${CMAKE_CURRENT_SOURCE_DIR}/pred_vector_array.c
   ${CMAKE_CURRENT_SOURCE_DIR}/qual_pushdown.c
   ${CMAKE_CURRENT_SOURCE_DIR}/vector_predicates.c)