Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Llama server - Update doc for slot states #10053

Closed
wants to merge 1 commit into from

Conversation

PyroGenesis
Copy link

Updates the documentation for the state property of slots returned by the /slots endpoint of llama-server. The documentation stated that the possible values were 0 and 1 but I was getting different values which led to me to the change in #9283

Let me know if the explanations need to be changed :)

@ngxson
Copy link
Collaborator

ngxson commented Oct 28, 2024

I think it's better to report just the is_processing state instead of exposing this internal state. The problem is that we don't even know if in the future we will add or remove states (I already added one state in #10023)

In other words, slot.state is pretty much a WIP and I don't think we should expose it as a public API

@PyroGenesis
Copy link
Author

PyroGenesis commented Oct 28, 2024

Do you suggest /slots[i].state should give 0 when slot_state is 0 (internally), and 1 otherwise? Or to remove the elaboration of the other values in the documentation and just have any value > 1 represent is_processing?

Right now, I use slot.state to identify idle slots which can be used to scale up the number of client requests when the number of clients is low.

@ngxson
Copy link
Collaborator

ngxson commented Oct 29, 2024

I'd suggest removing slot[i].state from the REST API and replace it with is_processing. Otherwise, it will be quite messy if slot[i].state does not match the internal slot->state.

We should also add a notice in the docs to inform that the slot data is mostly used for debugging and may have breaking changes quite often.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants