Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

main: replace --no-special with --special (and also set control token output to stdout to off by default) #7534

Merged
merged 1 commit into from
May 26, 2024

Conversation

mofosyne
Copy link
Collaborator

@mofosyne mofosyne commented May 25, 2024

This is to solve a pain point for those not expecting output control tokens but who have written scripts already that relied on an output that has no control tokens.

I have also been made aware that the inclusion of control tokens as a default behavior is actually a relatively recent feature in last few week when --conversation and --interactive-specials was included.

On checking so far, I'm not aware of the practical purpose of control tokens being active by default. So I think it makes sense to reverse the current default state of outputting control tokens to the original behavior of not outputting control tokens.

For those who do need control tokens in stdout, they can use --special to enable this feature.

In terms of those relying on --no-token, this was merged in today and only a single release 2hr ago was done so far. So the impact of reversing this feature is minimal in my opinion.

However the regressive change of the control tokens being included by default has been present for a few weeks now, so we will need to alert users of this change back to original behavior in the readme.


Regarding overall learnability of using --special over --no-special

On studying the arguments I think adding --special and changing the output default behavior is actually a more consistent behavior as these two commands are symmetrical in that they 'enable' special token in input or the output.

--special              special tokens output enabled
--interactive-specials allow special tokens in user text, in interactive mode

@mofosyne mofosyne requested review from ggerganov and ngxson May 25, 2024 13:13
@mofosyne mofosyne added Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix bugfix fixes an issue or bug labels May 25, 2024
This also flips the default behavior of the output to not include control token by default.
@mofosyne mofosyne force-pushed the special-token-setting-reverse branch from c4a7eea to 5ca93f7 Compare May 25, 2024 13:47
@mofosyne mofosyne requested a review from cebtenzzre May 26, 2024 12:32
Copy link
Owner

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On checking so far, I'm not aware of the practical purpose of control tokens being active by default.

For me it makes sense to be default, because I use the examples to debug and it's always better to see what are the actual tokens that are produced. The examples are not meant to be used as production applications, so that's why I thought the default behaviour should be suitable for development purposes. But having the --special flag available is fine and I can start using it instead

@mofosyne mofosyne added the merge ready indicates that this may be ready to merge soon and is just holding out in case of objections label May 26, 2024
@mofosyne mofosyne merged commit d298382 into ggerganov:master May 26, 2024
65 checks passed
@mofosyne mofosyne deleted the special-token-setting-reverse branch May 26, 2024 14:10
Bryan-Roe added a commit to Bryan-Roe/llama.cpp that referenced this pull request May 26, 2024
This also flips the default behavior of the output to not include control token by default.
@jart
Copy link
Contributor

jart commented May 26, 2024

@ggerganov I'm not sure I'd call shell scripting a production use case. I am willing to budge on this though, since I don't want you to think I'm putting my use cases before yours. For example, the shell scripts that are broken by the tokenization change will almost certainly be specifying --no-display-prompt or --log-disable or --grammar. I didn't think of this until now, but it would also be acceptable to me if we got rid of --special and --no-special and instead we had --no-display-prompt have the effect of meaning "no special tokens please".

@mofosyne
Copy link
Collaborator Author

mofosyne commented May 27, 2024

Ah something like?

const std::string token_str = llama_token_to_piece(ctx, id, !params.display_prompt && !params.conversation);

My opinion here is to revisit this PR: Add a simpler main example and make a stripped down version of main that is focused purely on testing (and grokkability). Since it sounds like gg use of main is purely to focus on the underlying engine (and I think it's a not too bad of an idea in terms of enabling future automated testing as well). It was closed because it got too stale but I think the author of that PR would appreciate that there is a real case for it's existence (in my opinion at least).

@ggerganov
Copy link
Owner

I didn't think of this until now, but it would also be acceptable to me if we got rid of --special and --no-special and instead we had --no-display-prompt have the effect of meaning "no special tokens please".

Yes, that's an option. I don't have strong opinion on that and as long as there is a way to turn on the special tokens, it's all good.

Adding more examples increases the maintenance effort. I think a bigger advantage would be to do some refactoring in the existing examples and "hide" some of the state for sampling and KV cache management that we expose behind the common API

@ngxson
Copy link
Collaborator

ngxson commented May 27, 2024

Just want to add my 2 cents: Since the number of arguments for main example has grown quite a lot now, does it make sense to have a better help message in gpt_params_print_usage? This will not change the behavior, but will make the usage and documentation a bit more clear.

My first suggest would be re-group some arguments into groups:

  • Model params: -m, -model-draft, --model-url, etc.
  • Runtime params: -ngl, --threads, --batch-size, --ctx-size, --mlock, etc.
  • Sampling params: --temperature, --top-k, --top-p, etc.
  • "Chat / instruction" params: --interactive, --chatml, etc.
  • Display params: --simple-io, --special, etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix fixes an issue or bug examples merge ready indicates that this may be ready to merge soon and is just holding out in case of objections Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants