You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Do not attempt to reproduce this on the production SmokeDetector instance. It will very likely break stuff and cause problems.
Summary
Attempting to use !!/watch-force to watch a multiline chat message breaks the watched_keywords.txt file by improperly inserting a newline.
Background information
There was a desire to modify a watch in CHQ, but the watch was longer than the limit for a 'normal' chat message. This isn't inherently an issue if it gets modified without the use of chat.
While multiline messages can get around that restriction, that also introduces a newline character (\n). From the regex perspective, that can be addressed (either by adding {0} directly after the newline or by starting a regex comment before the newline and ending it afterwards, as noted by @makyen).
However, it also breaks SmokeDetector as that inserts a newline directly into the watched_keywords.txt file, which breaks it. This was wisely foreseen by Makyen, hence not testing on production.
Details
For reference, here is the string I tested with. The newline (created with Shift + Enter) has been replaced with \n for this GitHub issue. 000000[trimmed for brevity]000000(?# foo\nbar)
SD can receive multiline chat messages, but at least in !!/bisect, it treats it as a \n. It just handles them problematically for watches (and presumably blacklists).
Attempting to watch a multiline string in chat, as I tried to here, breaks stuff. Specifically, it produced this as the end of watched_keywords.txt.
[06:19:34.736] watched_keywords.txt:65275:not enough values to unpack (expected 3, got 1)
File "/home/ubuntu/SmokeDetector/blacklists.py", line 105, in parse
when, by_whom, what = line.rstrip().split('\t')
^^^^^^^^^^^^^^^^^^^
[06:19:40.636] Global blacklists loaded
Here is the relevant part of the console logs after attempting a bisect of some text that would match the regex.
[06:21:42.072] Command received: !!/bisect 00000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000<span>…</span>
[06:21:50.007] Attempted to send status ping but metasmoke_host is undefined. Not sent.
[06:22:10.883] regex._regex_core.error: missing ) at position 975
2024-08-17 06:22:10.878529 UTC
File "/home/ubuntu/SmokeDetector/chatcommunicate.py", line 530, in __call__
result = self.__func__(*processed_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1691, in bisect
results_bookended = bisect_regex_in_n_size_chunks(64, s, bookended_regexes, bookend=True, timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1624, in bisect_regex_in_n_size_chunks
results = bisect_regex(test_text, chunk, bookend=bookend, timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1573, in bisect_regex
compiled = regex_compile_no_cache(formatted_regex, city=findspam.city_list, ignore_unused=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/SmokeDetector/helpers.py", line 491, in regex_compile_no_cache
return regex_raw_compile(regex_text, flags, ignore_unused, kwargs, False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/regex/regex.py", line 542, in _compile
raise error(caught_exception.msg, caught_exception.pattern,
File "/home/ubuntu/SmokeDetector/chatcommunicate.py", line 530, in __call__
result = self.__func__(*processed_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1691, in bisect
results_bookended = bisect_regex_in_n_size_chunks(64, s, bookended_regexes, bookend=True, timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1624, in bisect_regex_in_n_size_chunks
results = bisect_regex(test_text, chunk, bookend=bookend, timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1573, in bisect_regex
compiled = regex_compile_no_cache(formatted_regex, city=findspam.city_list, ignore_unused=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/SmokeDetector/helpers.py", line 491, in regex_compile_no_cache
return regex_raw_compile(regex_text, flags, ignore_unused, kwargs, False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/regex/regex.py", line 542, in _compile
raise error(caught_exception.msg, caught_exception.pattern,
What problem has occurred? What issues has it caused?
It broke. Details are above. As this was on a testing instance, it didn't impact the production instance.
On my testing instance, it broke the watched_keywords.txt file, which would need manual intervention to fix. When I tested locally, it failed to push the broken watched_keywords.txt file to GitHub, but I don't know if that's due to my testing setup or this bug. Various Git-related SD commands might be able to restore it to a functional state, although I did not test that, so I cannot confirm if that would be a remedy if this happened to a production SD instance.
What would you like to happen/not happen?
"like to happen"
Reject a multiline chat watch with a helpful message. Something like Multiline chat watches are not supported. Consider using Git (or pinging someone that can) as an alternative. might work*
Accept the multiline watch and correctly input it into the relevant file. This would also require deciding how to treat the newline (leave it as-is, remove the newline, remove everything after & including the first newline, etc.)
*I don't know what the preference is on the best way to request a manual Git-based change, so if that path is chosen, someone should probably confirm that wording.
"like to not happen"
It should not corrupt files.
The text was updated successfully, but these errors were encountered:
First of all, thanks for testing this and taking the effort not to do so in production. I'm in favour of rejecting the multiline chat watch with a helpful message. We already have a system in place to deal with 'wrong' watch suggestions, adding another detection with message seems like the least complicated fix for this bug.
Consider long watches are not necessarily something we want a lot of on the watchlist. If lists of exclusions need to increase in length (which triggered this test) regularly, it must be understood at some point it may be better to split the watch into multiple watches that are more specific and don't require as many exclusions. Watches with many exclusions are too dangerous to upgrade to the blacklist if you don't know when a new exception will pop up.
For the few cases where this isn't possible, making the change through Git (or requesting the change to be made by someone else) seems like the appropriate solution. This also makes it easier to see how a watch has mutated over time, in my opinion.
If this is considered a complicated process, we can consider amending the wiki with instructions. Working with longer regex are somewhat of an advanced part of Charcoal. Getting people who are interested in that more involved on Git isn't a bad thing.
Caution
Do not attempt to reproduce this on the production SmokeDetector instance. It will very likely break stuff and cause problems.
Summary
Attempting to use
!!/watch-force
to watch a multiline chat message breaks thewatched_keywords.txt
file by improperly inserting a newline.Background information
There was a desire to modify a watch in CHQ, but the watch was longer than the limit for a 'normal' chat message. This isn't inherently an issue if it gets modified without the use of chat.
While multiline messages can get around that restriction, that also introduces a newline character (
\n
). From the regex perspective, that can be addressed (either by adding{0}
directly after the newline or by starting a regex comment before the newline and ending it afterwards, as noted by @makyen).However, it also breaks SmokeDetector as that inserts a newline directly into the
watched_keywords.txt
file, which breaks it. This was wisely foreseen by Makyen, hence not testing on production.Details
\n
for this GitHub issue.000000[trimmed for brevity]000000(?# foo\nbar)
!!/bisect
, it treats it as a\n
. It just handles them problematically for watches (and presumably blacklists).watched_keywords.txt
.What problem has occurred? What issues has it caused?
It broke. Details are above. As this was on a testing instance, it didn't impact the production instance.
On my testing instance, it broke the
watched_keywords.txt
file, which would need manual intervention to fix. When I tested locally, it failed to push the brokenwatched_keywords.txt
file to GitHub, but I don't know if that's due to my testing setup or this bug. Various Git-related SD commands might be able to restore it to a functional state, although I did not test that, so I cannot confirm if that would be a remedy if this happened to a production SD instance.What would you like to happen/not happen?
"like to happen"
*I don't know what the preference is on the best way to request a manual Git-based change, so if that path is chosen, someone should probably confirm that wording.
"like to not happen"
It should not corrupt files.
The text was updated successfully, but these errors were encountered: