bot, irc: in `bot.say()`, fill IRC line if possible & add optional truncation indicator #1958

dgw · 2020-10-05T18:07:14Z

Description

By way of killing (or at least gravely wounding) two birds with one stone, this is my valiant attempt to resolve #1328, and also close #1322—which was left open pending a change like this to make use of the stored hostmask when sending messages. (I say "attempt" mostly because certain @sopel-irc/rockstars might disagree with subtle code style decisions I made; the new logic itself is almost fully covered by new or updated unit tests.)

bot.say() now has the ability to calculate a safe_length for messages, and use that instead of the arbitrary default value of 400 from tools.get_sendable_message(). There is also a new optional parameter, trailing, which replicates a common pattern I've seen in my own and other plugins to append some arbitrary "continuation string" to a message that was too long. (Alternative parameter name suggestions welcome; that's just the best I could think up.)

For plugin authors, use of the new trailing parameter replaces manually calling tools.get_sendable_message() with some arbitrary length value that "should be" short enough to avoid truncation by the server, and leaves truncating the message to Sopel—which knows its own nick, ident, and hopefully hostmask. If Sopel's hostmask is not known, a worst-case maximum length is used instead (thanks to @deathbybandaid for #1322 (comment), which got me 90% of the way there). The worst-case maximum tries to be as comprehensive as possible, taking advantage of the USERLEN token in ISUPPORT if sent by the server, RFC maximum lengths for the hostname and ident, and Sopel's configured ident & nickname values.*

This also kind of addresses part of #1559, regarding calculating the penalty based on only the truncated message body, in the case where (max_messages > 1) or trailing would evaluate to True during execution of bot.say().

The new truncation behavior is explicitly not enabled for the most basic usage of bot.say() yet, as noted in the commit log. We are limiting the potential impact of this behavior change in 7.x to existing plugins that already expressed some concern for over-length messages via use of max_messages, and new/updated plugins that opt-in to the new trailing behavior.

Checklist

I have read CONTRIBUTING.md
I can and do license this contribution under the EFLv2
No issues are reported by make qa (runs make quality and make test)
I have tested the functionality of the things this change touches

Next steps

Future PRs may:

enable the safe_length for all calls to bot.say()
move this logic so it can be used for bot.action(), bot.notice(), and possibly even other commands that take text arguments such as KICK/TOPIC
- I readily admit, I was focused on getting this working for the most common messaging function used by plugin authors; making it more generic was left for future-me (or whoever else decides to do it first)
add the optional max_messages and/or trailing parameters to other messaging functions (e.g. bot.notice() or bot.reply(); only trailing would make sense for bot.action(), IMO)
??? (ideas always welcome)

Notes

* — I have considered using the NICKLEN token from ISUPPORT (if available) and the RFC maximum nickname length, too. But in the nickname's case, it seems more likely to cause problems on servers that support longer-than-RFC nicks but don't send NICKLEN.

Exirel

This is going in the right direction, and I think that shouldn't be a problem for you.

I also want to add a quick note about message tags, from https://ircv3.net/specs/extensions/message-tags.html#size-limit

The size limit for message tags is 8191 bytes, including the leading '@' (0x40) and trailing space ' ' (0x20) characters. The size limit for the rest of the message remains unchanged at 512 bytes.

Which is great, I was worried at first that message tags could have a negative effect on this safe length, but that's not the case!

So: yay! Great PR, well done!

test/test_irc.py

deathbybandaid · 2020-10-05T22:22:20Z

This is some quality effort, and something I've wanted for a long time.

bot.action gets kinda dumb for multi-line in my experience, as each continuation line needs to be bot.say instead

dgw · 2020-10-06T21:09:49Z

@Exirel Changes to address your review nitpicks are running CI now. Good call on the extra Unicode trailing test! 😸

@deathbybandaid Yeah, I really don't think bot.action is gonna get this continuation behavior. It doesn't make sense.

Maybe I should build another PR off of this one to make output prefix behave better on split messages, though? I'm still not sure if the current "first message only" prefixing is good enough, or if it should add the prefix to every line.

Exirel

Well done!

Exirel · 2020-10-07T07:36:01Z

I'm still not sure if the current "first message only" prefixing is good enough, or if it should add the prefix to every line.

I still think that 1. more than one line is an edge case and doesn't matter that much and 2. I prefer the prefix for the first message only.

dgw · 2020-10-14T19:40:25Z

@Exirel Sorry, I noticed an oversight while working on something else and had to amend this PR. It's only a small tweak, though; only 3ce465c added, nothing else changed.

Exirel · 2020-10-14T20:34:59Z

nothing else changed.

Nice, new test case covered nicely. Well done again!

Not quite ready to make this the default behavior, but plugins that make use of `max_messages` are rare enough that we can beta-test it there.

New optional parameter to `bot.say()` allows specifying a suffix that Sopel will append to a message if it's too long to fit in the specified number of `max_messages` (including the default of 1).

Co-authored-by: Exirel <florian.strzelecki@gmail.com>

The new test fails without the change in `irc` itself, proving that the previous algorithm would incorrectly split if the recipient's name contained characters that get sent as multiple bytes in UTF-8.

dgw added the Feature label Oct 5, 2020

dgw added this to the 7.1.0 milestone Oct 5, 2020

dgw requested a review from a team October 5, 2020 18:07

Exirel requested changes Oct 5, 2020

View reviewed changes

test/test_irc.py Outdated Show resolved Hide resolved

test/test_irc.py Show resolved Hide resolved

Exirel approved these changes Oct 7, 2020

View reviewed changes

dgw mentioned this pull request Oct 7, 2020

Output appears mangled with RTL writing systems sopel-irc/sopel-youtube#30

Closed

dgw requested a review from Exirel October 14, 2020 19:39

Exirel approved these changes Oct 14, 2020

View reviewed changes

dgw and others added 4 commits October 14, 2020 15:50

irc: calculate safe message length in bot.say() when splitting

b7bf5e7

Not quite ready to make this the default behavior, but plugins that make use of `max_messages` are rare enough that we can beta-test it there.

bot, irc: implement optional continuation indicator for bot.say()

0a6d1ea

New optional parameter to `bot.say()` allows specifying a suffix that Sopel will append to a message if it's too long to fit in the specified number of `max_messages` (including the default of 1).

test_irc: verify behavior of bot.say()'s trailing parameter

bf7337e

Co-authored-by: Exirel <florian.strzelecki@gmail.com>

irc, test_irc: handle multibyte chars in dest when splitting messages

830fbef

The new test fails without the change in `irc` itself, proving that the previous algorithm would incorrectly split if the recipient's name contained characters that get sent as multiple bytes in UTF-8.

dgw force-pushed the bot.say-sendable-more branch from 3ce465c to 830fbef Compare October 14, 2020 20:51

dgw merged commit 1e89915 into master Nov 30, 2020

dgw deleted the bot.say-sendable-more branch November 30, 2020 00:44

dgw mentioned this pull request Nov 30, 2020

modules: use bot.say()'s trailing parameter #1995

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bot, irc: in `bot.say()`, fill IRC line if possible & add optional truncation indicator #1958

bot, irc: in `bot.say()`, fill IRC line if possible & add optional truncation indicator #1958

dgw commented Oct 5, 2020

Exirel left a comment

deathbybandaid commented Oct 5, 2020

dgw commented Oct 6, 2020

Exirel left a comment

Exirel commented Oct 7, 2020

dgw commented Oct 14, 2020

Exirel commented Oct 14, 2020

bot, irc: in bot.say(), fill IRC line if possible & add optional truncation indicator #1958

bot, irc: in bot.say(), fill IRC line if possible & add optional truncation indicator #1958

Conversation

dgw commented Oct 5, 2020

Description

Checklist

Next steps

Notes

Exirel left a comment

Choose a reason for hiding this comment

deathbybandaid commented Oct 5, 2020

dgw commented Oct 6, 2020

Exirel left a comment

Choose a reason for hiding this comment

Exirel commented Oct 7, 2020

dgw commented Oct 14, 2020

Exirel commented Oct 14, 2020

bot, irc: in `bot.say()`, fill IRC line if possible & add optional truncation indicator #1958

bot, irc: in `bot.say()`, fill IRC line if possible & add optional truncation indicator #1958