Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bot, irc: in bot.say(), fill IRC line if possible & add optional truncation indicator #1958

Merged
merged 4 commits into from
Nov 30, 2020

Conversation

dgw
Copy link
Member

@dgw dgw commented Oct 5, 2020

Description

By way of killing (or at least gravely wounding) two birds with one stone, this is my valiant attempt to resolve #1328, and also close #1322—which was left open pending a change like this to make use of the stored hostmask when sending messages. (I say "attempt" mostly because certain @sopel-irc/rockstars might disagree with subtle code style decisions I made; the new logic itself is almost fully covered by new or updated unit tests.)

bot.say() now has the ability to calculate a safe_length for messages, and use that instead of the arbitrary default value of 400 from tools.get_sendable_message(). There is also a new optional parameter, trailing, which replicates a common pattern I've seen in my own and other plugins to append some arbitrary "continuation string" to a message that was too long. (Alternative parameter name suggestions welcome; that's just the best I could think up.)

For plugin authors, use of the new trailing parameter replaces manually calling tools.get_sendable_message() with some arbitrary length value that "should be" short enough to avoid truncation by the server, and leaves truncating the message to Sopel—which knows its own nick, ident, and hopefully hostmask. If Sopel's hostmask is not known, a worst-case maximum length is used instead (thanks to @deathbybandaid for #1322 (comment), which got me 90% of the way there). The worst-case maximum tries to be as comprehensive as possible, taking advantage of the USERLEN token in ISUPPORT if sent by the server, RFC maximum lengths for the hostname and ident, and Sopel's configured ident & nickname values.*

This also kind of addresses part of #1559, regarding calculating the penalty based on only the truncated message body, in the case where (max_messages > 1) or trailing would evaluate to True during execution of bot.say().

The new truncation behavior is explicitly not enabled for the most basic usage of bot.say() yet, as noted in the commit log. We are limiting the potential impact of this behavior change in 7.x to existing plugins that already expressed some concern for over-length messages via use of max_messages, and new/updated plugins that opt-in to the new trailing behavior.

Checklist

  • I have read CONTRIBUTING.md
  • I can and do license this contribution under the EFLv2
  • No issues are reported by make qa (runs make quality and make test)
  • I have tested the functionality of the things this change touches

Next steps

Future PRs may:

  • enable the safe_length for all calls to bot.say()
  • move this logic so it can be used for bot.action(), bot.notice(), and possibly even other commands that take text arguments such as KICK/TOPIC
    • I readily admit, I was focused on getting this working for the most common messaging function used by plugin authors; making it more generic was left for future-me (or whoever else decides to do it first)
  • add the optional max_messages and/or trailing parameters to other messaging functions (e.g. bot.notice() or bot.reply(); only trailing would make sense for bot.action(), IMO)
  • ??? (ideas always welcome)

Notes

* — I have considered using the NICKLEN token from ISUPPORT (if available) and the RFC maximum nickname length, too. But in the nickname's case, it seems more likely to cause problems on servers that support longer-than-RFC nicks but don't send NICKLEN.

@dgw dgw added the Feature label Oct 5, 2020
@dgw dgw added this to the 7.1.0 milestone Oct 5, 2020
@dgw dgw requested a review from a team October 5, 2020 18:07
Copy link
Contributor

@Exirel Exirel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going in the right direction, and I think that shouldn't be a problem for you.

I also want to add a quick note about message tags, from https://ircv3.net/specs/extensions/message-tags.html#size-limit

The size limit for message tags is 8191 bytes, including the leading '@' (0x40) and trailing space ' ' (0x20) characters. The size limit for the rest of the message remains unchanged at 512 bytes.

Which is great, I was worried at first that message tags could have a negative effect on this safe length, but that's not the case!

So: yay! Great PR, well done!

test/test_irc.py Outdated Show resolved Hide resolved
test/test_irc.py Show resolved Hide resolved
@deathbybandaid
Copy link
Contributor

This is some quality effort, and something I've wanted for a long time.

bot.action gets kinda dumb for multi-line in my experience, as each continuation line needs to be bot.say instead

@dgw
Copy link
Member Author

dgw commented Oct 6, 2020

@Exirel Changes to address your review nitpicks are running CI now. Good call on the extra Unicode trailing test! 😸

@deathbybandaid Yeah, I really don't think bot.action is gonna get this continuation behavior. It doesn't make sense.

Maybe I should build another PR off of this one to make output prefix behave better on split messages, though? I'm still not sure if the current "first message only" prefixing is good enough, or if it should add the prefix to every line.

Copy link
Contributor

@Exirel Exirel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done!

@Exirel
Copy link
Contributor

Exirel commented Oct 7, 2020

I'm still not sure if the current "first message only" prefixing is good enough, or if it should add the prefix to every line.

I still think that 1. more than one line is an edge case and doesn't matter that much and 2. I prefer the prefix for the first message only.

@dgw
Copy link
Member Author

dgw commented Oct 14, 2020

@Exirel Sorry, I noticed an oversight while working on something else and had to amend this PR. It's only a small tweak, though; only 3ce465c added, nothing else changed.

@Exirel
Copy link
Contributor

Exirel commented Oct 14, 2020

nothing else changed.

Nice, new test case covered nicely. Well done again!

dgw and others added 4 commits October 14, 2020 15:50
Not quite ready to make this the default behavior, but plugins that make
use of `max_messages` are rare enough that we can beta-test it there.
New optional parameter to `bot.say()` allows specifying a suffix that
Sopel will append to a message if it's too long to fit in the specified
number of `max_messages` (including the default of 1).
Co-authored-by: Exirel <florian.strzelecki@gmail.com>
The new test fails without the change in `irc` itself, proving that the
previous algorithm would incorrectly split if the recipient's name
contained characters that get sent as multiple bytes in UTF-8.
@dgw dgw force-pushed the bot.say-sendable-more branch from 3ce465c to 830fbef Compare October 14, 2020 20:51
@dgw dgw merged commit 1e89915 into master Nov 30, 2020
@dgw dgw deleted the bot.say-sendable-more branch November 30, 2020 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants