Unexpected behavior in ChatSession.ChatAsync methods #261

Open
philippjbauer opened this issue Nov 7, 2023 · 3 comments

Labels
bug Something isn't working

Comments

philippjbauer (Contributor) commented Nov 7, 2023

I've been trying to create a chatbot with the ChatSession class and an IHistoryTransform implementation, in order to use the correct template for the LLM I am testing LLamaSharp with. I ran into unexpected behavior and could not get good responses from the model.

The only way I got a decent response was when I sent the whole history as a prompt with the correct formatting. Saving and loading the session would lead to an ever-growing context and significant slowdown.

I noticed that the use of the IHistoryTransform interface in the ChatSession class is what leads to these unexpected results. When the ChatSession class was refactored to use the IHistoryTransform interface, the behavior changed dramatically: the text in the prompt argument is now parsed, and the prompt (which may be the whole history) is added into the history of the session.

It's a little hard to untangle. I forked the repo and changed the behavior to be more straightforward: the overload that accepts a string prompt manages the internal history for the user, while the overload that accepts a ChatHistory argument lets the user manage the history themselves.

See here: master...philippjbauer:LLamaSharp:master
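
For illustration, this is roughly how the two overloads would be used after that change (a sketch only; signatures are simplified, and I'm assuming ChatAsync streams tokens as IAsyncEnumerable<string>, with session and inferenceParams already constructed):

using System;
using LLama;
using LLama.Common;

// Overload 1: the session owns the history; the caller passes only the new user message.
await foreach (var token in session.ChatAsync("What is a span in C#?", inferenceParams))
	Console.Write(token);

// Overload 2: the caller owns the history and passes the full ChatHistory explicitly.
var history = new ChatHistory();
history.AddMessage(AuthorRole.System, "You are a helpful assistant.");
history.AddMessage(AuthorRole.User, "What is a span in C#?");

await foreach (var token in session.ChatAsync(history, inferenceParams))
	Console.Write(token);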

I see many more possible improvements here, especially in how the IHistoryTransform interface could be changed to accommodate better templating support: for example, making use of public properties for role tokens and end tokens, so that these tokens can be replaced internally before the generated message is added to the history instead of relying on the InferenceParams.AntiPrompts property.

If this is missing the mark completely let me know, but I couldn't figure out how this should be used after reading the documentation and looking at the code.

Edit: It's a little lengthy, but here is my IHistoryTransform implementation:

using System;
using System.Linq;
using System.Text.RegularExpressions;
using LLama.Abstractions;
using LLama.Common;

public class HistoryTransform : IHistoryTransform
{
	private readonly string _nl = Environment.NewLine;
	private string _roleTemplate = "<|{0}|>";
	private string _endToken = "</s>";
	// \r?\n so the pattern also matches when Environment.NewLine is "\r\n" (Windows).
	private string _regexPattern = @"(?:<\|([a-z]+)\|>\r?\n)([^<]*)(?:</s>)";

	public string RoleTemplate => _roleTemplate;
	public string EndToken => _endToken;

	public HistoryTransform(
		string? roleTemplate = null,
		string? endToken = null,
		string? regexPattern = null)
	{
		_roleTemplate = roleTemplate ?? _roleTemplate;
		_endToken = endToken ?? _endToken;
		_regexPattern = regexPattern ?? _regexPattern;
	}

	// Renders the whole history as prompt text and appends the role header
	// for whichever side speaks next. Assumes the history is non-empty.
	public string HistoryToText(ChatHistory history)
	{
		string text = "";

		foreach (var message in history.Messages)
		{
			text += FormatMessage(message);
		}

		AuthorRole nextRole = history.Messages.Last().AuthorRole == AuthorRole.User
			? AuthorRole.Assistant
			: AuthorRole.User;

		text += $"{FormatRole(nextRole)}{_nl}";

		return text;
	}

	// Parses a single generated message out of the model output; only the
	// first match is used. FullTextToHistory handles multi-message text.
	public ChatHistory TextToHistory(AuthorRole role, string text)
	{
		ChatHistory history = new();

		Match match = Regex.Match(text, _regexPattern);
		string message = match.Groups[2].Value.Trim();

		history.AddMessage(role, message);

		return history;
	}

	// Not part of the interface; used in the program to load saved history.
	public ChatHistory FullTextToHistory(string text)
	{
		ChatHistory history = new();

		MatchCollection matches = Regex.Matches(text, _regexPattern);

		foreach (Match match in matches.Cast<Match>())
		{
			AuthorRole role = match.Groups[1].Value switch
			{
				"system" => AuthorRole.System,
				"user" => AuthorRole.User,
				"assistant" => AuthorRole.Assistant,
				_ => AuthorRole.System
			};

			string message = match.Groups[2].Value.Trim();

			history.AddMessage(role, message);
		}

		return history;
	}

	public string FormatRole(AuthorRole role)
	{
		return string.Format(_roleTemplate, role.ToString().ToLower());
	}

	// Strips stray end tokens from the content, then re-appends exactly one.
	public string FormatMessage(ChatHistory.Message message)
	{
		return $"{FormatRole(message.AuthorRole)}{_nl}{message.Content.Replace(_endToken, "")}{_endToken}{_nl}";
	}
}

The FullTextToHistory method is a crutch to load the history I save alongside the session data. I think this should be handled by the ChatSession class and saved as a JSON-formatted file or similar, for easy portability when necessary.
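
As a sketch of what that persistence could look like (the helper below is hypothetical, not part of LLamaSharp; it round-trips through a plain DTO so the file format stays stable across library versions):

using System;
using System.IO;
using System.Linq;
using System.Text.Json;
using LLama.Common;

public record ChatMessageDto(string Role, string Content);

public static class ChatHistoryPersistence
{
	// Hypothetical helper: writes the history as an indented JSON array of role/content pairs.
	public static void Save(ChatHistory history, string path)
	{
		var dtos = history.Messages.Select(m => new ChatMessageDto(m.AuthorRole.ToString(), m.Content));
		File.WriteAllText(path, JsonSerializer.Serialize(dtos, new JsonSerializerOptions { WriteIndented = true }));
	}

	// Rebuilds a ChatHistory from the saved JSON.
	public static ChatHistory Load(string path)
	{
		var history = new ChatHistory();
		var dtos = JsonSerializer.Deserialize<ChatMessageDto[]>(File.ReadAllText(path)) ?? Array.Empty<ChatMessageDto>();
		foreach (var dto in dtos)
			history.AddMessage(Enum.Parse<AuthorRole>(dto.Role), dto.Content);
		return history;
	}
}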

martindevans (Member) commented Nov 7, 2023

This kind of confusion with ChatSession/History/various executors is actually exactly what got me started contributing to LLamaSharp! If you're interested in making any PRs to improve the current behaviour (even if it requires breaking changes), I'd be very interested to review them.

In the longer term I think we're going to need to do a complete redesign of the higher level features at some point. To me it's very confusing to have 3 types of executors (two of which seem very similar) and the ChatSession built on top of that (which seems fairly similar again).

To kick off that discussion: what do you think would be the "ideal" high-level API for LLamaSharp (completely ignoring all backwards compatibility problems for now)?

philippjbauer (Contributor, Author)

Good, I'm not alone in my confusion :)

I do have some ideas regarding the ChatSession design. It's good to know that you are not averse to breaking changes, that will make it a lot easier to bring this part of LLamaSharp forward.

Long-term it would be great to have the following:

  • Context Overflow Policy (see LM Studio; a sketch follows this list)
    • Stop at limit
    • Keep system prompt and first user message, truncate the middle
    • Maintain a rolling window and truncate the past (I'd say keep the system prompt as well, though)
  • Robust templating configuration
    • Reading the template from GGUF (I think they may be saved in there nowadays?)
  • Saving the session with history restore
    • History stored as a portable JSON structure
    • Inference parameters saved as a portable structure
  • ChatAsync overloads that allow the system to handle the history and the Context Overflow Policy, and that allow the user to handle the history themselves (exposing the lower-level method, so to speak)
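
To make the overflow policies concrete, here is a minimal sketch of what they could look like. Every type and method below is hypothetical (nothing here exists in LLamaSharp), and a real implementation would budget tokens rather than counting messages:

using System.Linq;
using LLama.Common;

public enum ContextOverflowPolicy
{
	StopAtLimit,     // stop generating once the context is full
	TruncateMiddle,  // keep system prompt + first user message, drop the middle
	RollingWindow,   // keep system prompt + the most recent messages
}

public static class ContextOverflow
{
	// Message-count limit is a stand-in; a real version would count tokens.
	public static ChatHistory Apply(ChatHistory history, ContextOverflowPolicy policy, int maxMessages)
	{
		var messages = history.Messages;
		if (messages.Count <= maxMessages || policy == ContextOverflowPolicy.StopAtLimit)
			return history;

		var trimmed = new ChatHistory();

		if (policy == ContextOverflowPolicy.TruncateMiddle)
		{
			// Keep system prompt (index 0) and first user message (index 1) ...
			trimmed.AddMessage(messages[0].AuthorRole, messages[0].Content);
			trimmed.AddMessage(messages[1].AuthorRole, messages[1].Content);
			// ... then fill the remaining budget with the newest messages.
			foreach (var m in messages.Skip(messages.Count - (maxMessages - 2)))
				trimmed.AddMessage(m.AuthorRole, m.Content);
		}
		else // RollingWindow
		{
			// Keep the system prompt plus a rolling window of the newest messages.
			trimmed.AddMessage(messages[0].AuthorRole, messages[0].Content);
			foreach (var m in messages.Skip(messages.Count - (maxMessages - 1)))
				trimmed.AddMessage(m.AuthorRole, m.Content);
		}

		return trimmed;
	}
}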

Just off the top of my head. There's probably more that can be added, but I'd have to fiddle around and spend some time thinking about what makes sense to expose to the user.

Short-term, I think the change I linked in my fork would greatly improve the utility of the ChatSession (make it usable at all really). Maybe this could be merged soon?

martindevans (Member)

Short-term, I think the change I linked in my fork would greatly improve the utility of the ChatSession (make it usable at all really). Maybe this could be merged soon?

If you want to open up a PR with this change I'll be happy to review it :)

Long-term it would be great to have the following...

Context Overflow Policy (see LM Studio)

This one is tricky, because there are several different "layers" you can do it at. You might want to rebuild a new chunk of text (e.g. keep the prompt, summarise the rest). Or you might want to go a level "lower" and just rearrange tokens (e.g. keep the prompt tokens and the most recent output tokens). Or you might want to do some magic with shifting around the KV cache.
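
For the token-level variant, a rough sketch of the "keep prompt tokens plus most recent tokens" idea (a hypothetical helper operating on a plain token list; real code would also have to shift or rebuild the KV cache to match):

using System.Collections.Generic;
using System.Linq;

public static class TokenTruncation
{
	// Keeps the first promptTokenCount tokens (the fixed prompt) and as many
	// of the most recent tokens as still fit within the context size.
	public static List<int> Truncate(IReadOnlyList<int> tokens, int promptTokenCount, int contextSize)
	{
		if (tokens.Count <= contextSize)
			return tokens.ToList();

		var kept = new List<int>(contextSize);
		kept.AddRange(tokens.Take(promptTokenCount));                                // fixed prompt head
		kept.AddRange(tokens.Skip(tokens.Count - (contextSize - promptTokenCount))); // most recent tail
		return kept;
	}
}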

Robust templating configuration

Looks like there's some discussion of embedding jinja templates into gguf (here).
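
If that lands, a session could read the template straight from the model's metadata instead of requiring a hand-written IHistoryTransform. A hypothetical sketch, assuming the key ends up as llama.cpp's "tokenizer.chat_template" and that the Metadata accessor below exists:

// Hypothetical: pull the chat template out of the GGUF metadata, if present.
// "model.Metadata" is an assumed accessor returning the GGUF key/value pairs.
IReadOnlyDictionary<string, string> metadata = model.Metadata;
if (metadata.TryGetValue("tokenizer.chat_template", out var template))
{
	// Hand the jinja template to whatever templating engine the session uses.
}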

Just off the top of my head. There's probably more that can be added, but I'd have to fiddle around and spend some time thinking about what makes sense to expose to the user.

If you come up with anything interesting: I opened a discussion the other day around thoughts for an entirely new executor in LLamaSharp. I'd love to hear your thoughts.
