Add support for on-demand involvement
Fixes #15
spantaleev committed Oct 1, 2024
1 parent eae6472 commit 9908512
Showing 23 changed files with 828 additions and 325 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -41,7 +41,7 @@ It's influenced by [chaz](https://github.com/arcuru/chaz), but does **not** use

![Introduction and general usage](./docs/screenshots/introduction-and-general-usage.webp)

You can find more screenshots on the the [🌟 Features](./docs/features.md) and other [📚 Documentation](./docs/README.md) pages, as well as in the [docs/screenshots](./docs/screenshots) directory.
You can find more screenshots on the [🌟 Features](./docs/features.md) and other [📚 Documentation](./docs/README.md) pages, as well as in the [docs/screenshots](./docs/screenshots) directory.


## 🚀 Getting Started
1 change: 1 addition & 0 deletions docs/access.md
@@ -16,6 +16,7 @@ Users:

- ✅ can **invite the bot to rooms**
- ✅ can **use all the bot's [features](./features.md)** ([💬 Text Generation](./features.md#-text-generation), [🦻 Speech-to-Text](./features.md#-speech-to-text), etc.) by sending room messages
- ✅ can **mention the bot** in threads and reply chains to provoke it to respond to non-user messages (see [📖 Usage / 💬 Text Generation / On-demand involvement](./usage.md#on-demand-involvement))
- ✅ can **change the bot's configuration in a room** (e.g. `!bai config room ...` commands)
- ❌ cannot **change the bot's global configuration** (e.g. `!bai config global ...` commands)
- ❌ cannot **create new [🤖 Agents](./agents.md)** (neither in rooms, nor globally). See [💼 Room-local agent managers](#-room-local-agent-managers) for controlling which users can create agents.
7 changes: 5 additions & 2 deletions docs/configuration/text-generation.md
@@ -13,7 +13,7 @@ You may also wish to see:

In Direct Message rooms with the bot (1:1 rooms), it most usually makes sense for the bot to respond to **all** of your messages, as shown on this [🖼️ screenshot](../screenshots/text-generation.webp).

In group rooms (with multiple users), it may be more appropriate for the bot to only respond to messages that are **prefixed** with the command prefix (e.g. `!bai`), so that other chat exchange in the room will not trigger it. Such a setup is shown on this [🖼️ screenshot](../screenshots/text-generation-prefix-requirement.webp).
In group rooms (with multiple users), it may be more appropriate for the bot to only respond to messages that are **prefixed** with the command prefix (e.g. `!bai`) or which are [mentioning](https://spec.matrix.org/latest/client-server-api/#user-and-room-mentions) the bot (e.g. `@baibot`), so that other chat exchange in the room will not trigger it. Such a setup is shown on the [🖼️ On-demand involvement in the room](../screenshots/text-generation-prefix-requirement.webp) screenshot.

There are exceptions to these rules, and you can configure the bot to respond only to prefixed messages in a 1:1 room, or to respond to all messages even in a multi-user group room.

@@ -27,7 +27,10 @@ By default, the bot is **auto-configured (upon joining a new room)** to use the

Example: `!bai config room text-generation set-prefix-requirement-type command_prefix` (this can also be set globally, see [🛠️ Room Settings](./README.md#room-settings))

Regardless of this configuration, **the bot will also respond to messages which directly [mention](https://spec.matrix.org/latest/client-server-api/#user-and-room-mentions) the bot** (e.g. `@baibot`), even if they are not prefixed. An example of this can be seen on this [🖼️ screenshot](../screenshots/text-generation-prefix-requirement.webp).
Regardless of this configuration, **the bot will also respond to messages by allowed [👥 Users](../access.md#-users) which directly [mention](https://spec.matrix.org/latest/client-server-api/#user-and-room-mentions) the bot** (e.g. `@baibot`), even if they are not prefixed. An example of this can be seen on these screenshots:

- [🖼️ On-demand involvement in a thread](../screenshots/text-generation-on-demand-thread-involvement.webp)
- [🖼️ On-demand involvement in a reply chain](../screenshots/text-generation-on-demand-reply-involvement.webp)
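
The rule described in this hunk (the prefix requirement applies, but a direct mention always triggers a response) can be sketched roughly as follows. This is a minimal illustration and not the bot's actual code: the `PrefixRequirement` enum, its variant names, and the `should_respond` function are hypothetical, and mention detection is reduced to a substring check, whereas real Matrix mentions are carried in event metadata.

```rust
/// Hypothetical, simplified model of the "prefix requirement" room setting.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PrefixRequirement {
    /// Respond to every message (typical for 1:1 rooms).
    NotRequired,
    /// Respond only to messages that start with the command prefix.
    CommandPrefix,
}

/// Decides whether a message from an allowed user should trigger a response.
///
/// Assumption: a mention is detected with a plain substring check. The real
/// bot may inspect Matrix mention metadata instead.
fn should_respond(
    body: &str,
    requirement: PrefixRequirement,
    command_prefix: &str,
    bot_mention: &str,
) -> bool {
    // A direct mention triggers a response regardless of the setting.
    if body.contains(bot_mention) {
        return true;
    }

    match requirement {
        PrefixRequirement::NotRequired => true,
        PrefixRequirement::CommandPrefix => body.trim_start().starts_with(command_prefix),
    }
}
```

Under these assumptions, `should_respond("@baibot Hello", PrefixRequirement::CommandPrefix, "!bai", "@baibot")` is true even though the message lacks the prefix, while an unprefixed `"Hello everyone"` is ignored in the same room.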


### 🪄 Auto Usage
2 changes: 2 additions & 0 deletions docs/features.md
@@ -28,6 +28,8 @@ Text Generation is the bot's ability to **respond to users' text messages with t

In multi-user (group) rooms, to avoid disturbing the normal conversation between people, the bot is auto-configured to only respond to messages starting with the command prefix (`!bai`) or direct mentions via the [💬 Text Generation / 🗟 Prefix Requirement Type](./configuration/text-generation.md#-prefix-requirement-type) setting.

Normally, the bot only responds to allowed [👥 Users](./access.md#-users). In certain cases, it's useful for an allowed user to provoke the bot to respond even in foreign threads or reply chains. You can learn more about this feature in the [📖 Usage / 💬 Text Generation / On-demand involvement](./usage.md#on-demand-involvement) section.

A few other features (like [🗣️ Text-to-Speech](#️-text-to-speech) and [🦻 Speech-to-Text](#-speech-to-text)) combine well with Text Generation, so you **don't necessarily need to communicate with the bot via text** (with [Seamless voice interaction](#seamless-voice-interaction), you can communicate only with voice).

You may also wish to see:
Binary file not shown.
Binary file not shown.
24 changes: 20 additions & 4 deletions docs/usage.md
@@ -11,10 +11,11 @@ This is related to the [💬 Text Generation](./features.md#-text-generation) fe

If there's a text-generation handler agent configured, the bot **may** respond to messages sent in the room.

🖼️ See screenshots of:
See screenshots of:

- the [default Text Generation flow](./screenshots/text-generation.webp) for 1:1 rooms
- the [Text Generation flow in multi-user rooms](./screenshots/text-generation-prefix-requirement.webp) (where the [🗟 Prefix Requirement](./configuration/text-generation.md#-prefix-requirement-type) setting is auto-configured to "required")
- 🖼️ [the default Text Generation flow](./screenshots/text-generation.webp) in 1:1 rooms
- 🖼️ [the Text Generation flow in multi-user rooms](./screenshots/text-generation-prefix-requirement.webp) (where the [🗟 Prefix Requirement](./configuration/text-generation.md#-prefix-requirement-type) setting is auto-configured to "required")
- [on-demand involvement](#on-demand-involvement)

Whether the bot responds depends on:

@@ -24,12 +25,27 @@ Whether the bot responds depends on:

- (🎨 agent capabilities) whether the configured `text-generation` (or `catch-all`) handler agent actually supports text-generation. The provider may lack support for this feature or it may be disabled in the [🤖 agents](./agents.md) configuration

- (the [🗟 Prefix Requirement](./configuration/text-generation.md#-prefix-requirement-type) setting) whether a prefix (e.g. `!bai`) is required in front of messages sent to the room. For multi-user rooms, this setting defaults to "required"
- (the [🗟 Prefix Requirement](./configuration/text-generation.md#-prefix-requirement-type) setting) whether a prefix (e.g. `!bai`) or user mention (e.g. `@baibot`) is required for messages sent to the room. For multi-user rooms, this setting defaults to "required". See [on-demand involvement](#on-demand-involvement) for details.

Room messages start a threaded conversation where you can continue back-and-forth communication with the bot.

Unless you've enabled the [♻️ Context Management](./features.md#️-context-management) feature, all messages will be sent to the agent's API each time. If the context management feature is enabled, older messages may be dropped.

#### On-demand involvement

In the following 2 cases, it's useful to involve the bot in conversations on-demand:

1. For multi-user rooms (with the [🗟 Prefix Requirement](./configuration/text-generation.md#-prefix-requirement-type) setting set to "required")
2. In rooms with foreign users (users that are not authorized bot [👥 users](./access.md#-users))

In these instances, an allowed [👥 user](./access.md#-users) can also provoke the bot to respond to **any** thread or reply chain by [mentioning](https://spec.matrix.org/latest/client-server-api/#user-and-room-mentions) the bot (e.g. `@baibot Hello!`). The following screenshots demonstrate this behavior:

- [🖼️ On-demand involvement in the room](./screenshots/text-generation-prefix-requirement.webp)
- [🖼️ On-demand involvement in a thread](./screenshots/text-generation-on-demand-thread-involvement.webp) (the Alice user in this example is not an allowed user, yet her messages are still considered as part of the conversation context)
- [🖼️ On-demand involvement in a reply chain](./screenshots/text-generation-on-demand-reply-involvement.webp) (the Alice user in this example is not an allowed user, yet her messages are still considered as part of the conversation context)

💡 **NOTE**: Normally, the bot **only considers messages from allowed [👥 Users](./access.md#-users)** and ignores all other messages when responding. However, **when the bot is explicitly invoked (via mention)** in a thread or reply chain, **it will consider all messages** in the thread and reply chain (even those from foreign users) as part of the conversation context.
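
The note above can be illustrated with a small sketch of how the conversation context might be assembled. It is only an approximation under stated assumptions: the `Invocation` and `Message` types and the `conversation_context` function are hypothetical, and allowed users are matched by exact ID here, whereas the bot itself matches users against configured patterns.

```rust
/// Hypothetical: how the bot was brought into the conversation.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Invocation {
    /// Regular text generation (prefix, direct message, and so on).
    Regular,
    /// Explicit mention inside an existing thread or reply chain.
    ExplicitMention,
}

/// Hypothetical, minimal stand-in for a room message.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    sender: String,
    body: String,
}

/// Returns the messages that become part of the conversation context.
///
/// Assumption: allowed users are matched by exact user ID.
fn conversation_context<'a>(
    messages: &'a [Message],
    allowed_users: &[&str],
    invocation: Invocation,
) -> Vec<&'a Message> {
    messages
        .iter()
        .filter(|message| match invocation {
            // When explicitly invoked, every message counts as context,
            // including messages from foreign users.
            Invocation::ExplicitMention => true,
            // Otherwise only messages from allowed users are considered.
            Invocation::Regular => allowed_users.contains(&message.sender.as_str()),
        })
        .collect()
}
```

With Alice as a foreign user and Bob as an allowed user, a regular invocation would keep only Bob's messages, while an explicit mention by Bob keeps Alice's messages too.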


### 🗣️ Text-to-Speech

40 changes: 11 additions & 29 deletions src/bot/messaging.rs
@@ -11,7 +11,7 @@ use mxlink::{CallbackError, MessageResponseType};
use tracing::Instrument;

use crate::{
conversation::matrix::determine_thread_context_for_room_event,
conversation::matrix::determine_interaction_context_for_room_event,
entity::{MessageContext, MessagePayload, RoomConfigContext, TriggerEventInfo},
};

@@ -239,7 +239,7 @@ impl Messaging {
}
};

let thread_context = determine_thread_context_for_room_event(
let interaction_context = determine_interaction_context_for_room_event(
self.bot.user_id(),
&room,
&event,
@@ -248,16 +248,18 @@ impl Messaging {
)
.await;

let thread_context = match thread_context {
let interaction_context = match interaction_context {
Ok(value) => value,
Err(err) => {
tracing::error!(?err, "Failed to determine thread context for event");
tracing::error!(?err, "Failed to determine interaction context for event");
return Ok(());
}
};

let Some(thread_context) = thread_context else {
tracing::debug!("Ignoring message with unknown thread context (likely not a threaded message or a top-level message)");
let Some(interaction_context) = interaction_context else {
tracing::debug!(
"Ignoring message with unknown interaction context (likely not a message for us)"
);
return Ok(());
};

@@ -276,41 +278,21 @@ impl Messaging {
room_config_context,
self.bot.admin_pattern_regexes().clone(),
trigger_event_info,
thread_context.info.clone(),
interaction_context.thread_info.clone(),
);

let bot_display_name = self
.bot
.room_display_name_fetcher()
.own_display_name_in_room(message_context.room())
.await;

let bot_display_name = match bot_display_name {
Ok(value) => value,
Err(err) => {
tracing::warn!(
?err,
"Failed to fetch bot display name. Proceeding without it"
);
None
}
};

// The first event in the thread determines which handler processes the current event.
let controller_type = crate::controller::determine_controller(
self.bot.command_prefix(),
&thread_context.first_message,
&interaction_context.trigger,
&message_context,
self.bot.user_id(),
&bot_display_name,
);

tracing::info!(?controller_type, "Determined controller");

let _ = room
.send_single_receipt(
ReceiptType::Read,
thread_context.info.clone().into(),
interaction_context.thread_info.clone().into(),
event.event_id.clone(),
)
.await;
129 changes: 105 additions & 24 deletions src/controller/chat_completion/mod.rs
@@ -18,13 +18,28 @@ use crate::entity::roomconfig::{
use crate::entity::MessagePayload;
use crate::strings;
use crate::utils::text_to_speech::create_transcribed_message_text;
use crate::{conversation::create_llm_conversation_for_matrix_thread, entity::MessageContext, Bot};
use crate::{
conversation::{
create_llm_conversation_for_matrix_reply_chain, create_llm_conversation_for_matrix_thread,
matrix::create_list_of_bot_user_prefixes_to_strip,
},
entity::MessageContext,
Bot,
};

#[derive(Debug, PartialEq)]
pub enum ChatCompletionControllerType {
ViaText { prefixes_to_strip: Vec<String> },
// Invoked via a command prefix (e.g. `!bai Hello!`)
TextCommand,
// Invoked via a mention (e.g. `@baibot Hello!`)
TextMention,
// Invoked via a direct message (e.g. `Hello!`)
TextDirect,

Audio,

ViaAudio,
ThreadMention,
ReplyMention,
}
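
As a reading aid, the way the later hunks in this file appear to use the new variants can be condensed into one standalone sketch. The `ControllerType` enum mirrors the new variant names, but `ContextSource`, `Plan`, and `plan_for` are hypothetical helpers that do not exist in the commit; the mapping reflects only what the hunks below show.

```rust
/// Standalone mirror of the new controller type variants.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ControllerType {
    TextCommand,
    TextMention,
    TextDirect,
    Audio,
    ThreadMention,
    ReplyMention,
}

/// Hypothetical: where the conversation context is read from.
#[derive(Debug, PartialEq)]
enum ContextSource {
    Thread,
    ReplyChain,
}

/// Hypothetical summary of the decisions derived from the controller type.
#[derive(Debug, PartialEq)]
struct Plan {
    context: ContextSource,
    only_allowed_users: bool,
    strip_command_prefix: bool,
}

fn plan_for(controller_type: ControllerType) -> Plan {
    Plan {
        // Only a reply mention walks the reply chain; everything else is threaded.
        context: match controller_type {
            ControllerType::ReplyMention => ContextSource::ReplyChain,
            _ => ContextSource::Thread,
        },
        // Explicit thread/reply mentions consider messages from all senders.
        only_allowed_users: !matches!(
            controller_type,
            ControllerType::ThreadMention | ControllerType::ReplyMention
        ),
        // The command prefix is stripped only when it was used to invoke the bot.
        strip_command_prefix: controller_type == ControllerType::TextCommand,
    }
}
```

For example, `plan_for(ControllerType::ReplyMention)` yields a reply-chain context with no allowed-user filtering, which matches the response type and `allowed_users` logic shown further down.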

struct TextToSpeechEligiblePayload {
@@ -125,7 +140,15 @@ pub async fn handle(
None
};

let response_type = MessageResponseType::InThread(message_context.thread_info().clone());
let response_type = match controller_type {
// When we're triggered via a reply mention, we reply to the message that triggered us.
ChatCompletionControllerType::ReplyMention => {
MessageResponseType::Reply(message_context.thread_info().last_event_id.clone())
}

// In all other cases, we're dealing with a threaded conversation, so we reply in the thread.
_ => MessageResponseType::InThread(message_context.thread_info().clone()),
};

let text_to_speech_eligible_payload = handle_stage_text_generation(
bot,
@@ -353,24 +376,78 @@ async fn handle_stage_text_generation(
)
.await?;

let prefixes_to_strip = match controller_type {
ChatCompletionControllerType::ViaText { prefixes_to_strip } => prefixes_to_strip.clone(),
ChatCompletionControllerType::ViaAudio => vec![],
// We only strip text from the first message if we're invoked via a command prefix.
// Otherwise, we do bot-user mentions stripping on all messages below.
let first_message_prefixes_to_strip = match controller_type {
ChatCompletionControllerType::TextCommand => vec![bot.command_prefix().to_owned()],
_ => vec![],
};

let params = MatrixMessageProcessingParams::new(
bot.user_id().as_str().to_owned(),
message_context.combined_admin_and_user_regexes(),
)
.with_first_message_stripped_prefixes(prefixes_to_strip);
let bot_display_name = bot
.room_display_name_fetcher()
.own_display_name_in_room(message_context.room())
.await;

let conversation = create_llm_conversation_for_matrix_thread(
matrix_link.clone(),
message_context.room(),
message_context.thread_info().root_event_id.clone(),
&params,
)
.await;
let bot_display_name = match bot_display_name {
Ok(value) => value,
Err(err) => {
tracing::warn!(
?err,
"Failed to fetch bot display name. Proceeding without it"
);
None
}
};

let bot_user_prefixes_to_strip =
create_list_of_bot_user_prefixes_to_strip(bot.user_id(), &bot_display_name);

let allowed_users = match controller_type {
// Regular chat completion only operates on messages from allowed users.
ChatCompletionControllerType::TextCommand
| ChatCompletionControllerType::TextMention
| ChatCompletionControllerType::TextDirect
| ChatCompletionControllerType::Audio => {
Some(message_context.combined_admin_and_user_regexes())
}

// When we're triggered via an explicit mention (thread or reply), we wish to operate against the mention's whole context
// (the whole thread or the whole reply chain upward of the message that triggered us).
//
// This is to allow admins and users to trigger text-generation for other users' messages.
// When we're dragged into a conversation by a known (to us) user, we'd like to process all messages in the conversation,
// not just those from allowed users.
ChatCompletionControllerType::ThreadMention
| ChatCompletionControllerType::ReplyMention => None,
};

let params = MatrixMessageProcessingParams::new(bot.user_id().to_owned(), allowed_users)
.with_first_message_prefixes_to_strip(first_message_prefixes_to_strip)
.with_bot_user_prefixes_to_strip(bot_user_prefixes_to_strip);

let conversation = match controller_type {
// When we're triggered via a reply mention, the context is the whole reply chain upward of the message that triggered us.
ChatCompletionControllerType::ReplyMention => {
create_llm_conversation_for_matrix_reply_chain(
&bot.room_event_fetcher().clone(),
message_context.room(),
message_context.thread_info().last_event_id.clone(),
&params,
)
.await
}

// Everything else is happening in a thread, so the context is the whole thread.
_ => {
create_llm_conversation_for_matrix_thread(
matrix_link.clone(),
message_context.room(),
message_context.thread_info().root_event_id.clone(),
&params,
)
.await
}
};

let conversation = match conversation {
Ok(conversation) => conversation,
@@ -565,11 +642,15 @@ async fn handle_stage_speech_to_text_actual_transcribing(
//
// Regardless of how we post this message, it will be posted as a notice,
// which can indicate to the bot (for potential future text-generation purposes) that this message is not a bot message.
let (transcribed_text, annotate_message_with_reaction) = if let MessageResponseType::InThread(_) = response_type {
(create_transcribed_message_text(&speech_to_text_result.text), false)
} else {
(speech_to_text_result.text, true)
};
let (transcribed_text, annotate_message_with_reaction) =
if let MessageResponseType::InThread(_) = response_type {
(
create_transcribed_message_text(&speech_to_text_result.text),
false,
)
} else {
(speech_to_text_result.text, true)
};

let result = bot
.messaging()