[Feature]: Return hidden states (in progress?) #6165

Open
Elanmarkowitz opened this issue Jul 6, 2024 · 9 comments

@Elanmarkowitz commented Jul 6, 2024

🚀 The feature, motivation and pitch

I know this feature request sort of already exists: #5950
(and there are older, semi-related requests: #3594, #1857)

This is a similar pitch, but I am creating a new issue because I noticed newer developments in the codebase. The pitch is to support returning hidden states when generating sequences. This enables many potential use cases such as output classification, guardrails, etc. Whereas #5950 suggested a separate step for embedding, I would suggest building it in as an option to EngineArgs, or as an option that can be passed in with each generation request.

I see that in v0.5.1 there is already some new code in ModelDriverBase to support return_hidden_states. However, I don't see it supported in the LLM engine yet (it is not an input to EngineArgs). Basically, it seems like this feature is under development. I am mainly wondering what the timeline for it is, and what approach is being taken, so that I and the community can develop accordingly.
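
To make the proposal concrete, here is a rough sketch of how such an option could look from the offline LLM API. The return_hidden_states argument and the hidden_states output field are hypothetical; they do not exist in vLLM today and only illustrate the shape of the proposed feature.

```python
from vllm import LLM, SamplingParams

# Hypothetical: `return_hidden_states` is NOT an existing EngineArgs/LLM option;
# it only illustrates how the proposed feature could be switched on.
llm = LLM(model="meta-llama/Llama-2-7b-hf", return_hidden_states=True)

outputs = llm.generate(
    ["The capital of France is"],
    SamplingParams(max_tokens=8),
)

for request_output in outputs:
    completion = request_output.outputs[0]
    # Hypothetical field: e.g. one final-layer hidden-state vector per
    # generated token, shape (num_generated_tokens, hidden_size).
    hidden = getattr(completion, "hidden_states", None)
    print(completion.text, None if hidden is None else hidden.shape)
```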

Alternatives

No response

Additional context

No response

@LiuXiaoxuanPKU
Collaborator

Thanks for the question! We currently use return_hidden_states for speculative decoding. You just need to pass it as a config, as shown here. Feel free to mimic the behavior there.
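
For anyone trying to follow that pointer before first-class support exists: the idea is that the model runner is constructed with a return_hidden_states flag and, when it is set, attaches the last-layer hidden states to the output it returns. A minimal toy sketch of that pattern in plain PyTorch follows (this is not the actual vLLM worker code; class and field names are illustrative):

```python
import torch
import torch.nn as nn


class TinyRunner(nn.Module):
    """Toy stand-in for a model runner that can optionally expose hidden states."""

    def __init__(self, vocab_size: int = 100, hidden_size: int = 16,
                 return_hidden_states: bool = False):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size)
        self.return_hidden_states = return_hidden_states

    def forward(self, token_ids: torch.Tensor) -> dict:
        hidden = self.embed(token_ids)      # (seq_len, hidden_size)
        logits = self.lm_head(hidden)       # (seq_len, vocab_size)
        out = {"next_tokens": logits.argmax(dim=-1)}
        if self.return_hidden_states:
            # Attach the last-layer hidden states so downstream code
            # (classification, guardrails, interpretability) can consume them.
            out["hidden_states"] = hidden
        return out


runner = TinyRunner(return_hidden_states=True)
result = runner(torch.tensor([1, 2, 3]))
print(result["hidden_states"].shape)  # torch.Size([3, 16])
```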

@Hambaobao

Hi, I also have the same need. I hope to store the hidden_states during model inference so that I can conduct some interpretability research.

@PeterAdam2015

Same need; hope we can get this as an option to return embeddings.

@ummagumm-a

same need!

@freesunshine0316

> Thanks for the question! We currently use return_hidden_states for speculative decoding. You just need to pass it as a config, as shown here. Feel free to mimic the behavior there.

Hi, can you further specify, e.g. with demo code?

@J0hnArren commented Aug 1, 2024

same need

@Gxy-2001

same need

@zkwhandan

same need

@LiuXiaoxuanPKU self-assigned this Sep 24, 2024
@jvlinsta

Same need, to generate some attention heatmaps, akin to: [attention heatmap image]
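
As a rough illustration of the heatmap use case, assuming per-token attention weights (or a similarity matrix derived from hidden states) could be obtained from the engine, plotting is straightforward. The matrix below is random placeholder data, since vLLM does not currently expose attention weights:

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder causal attention matrix (query tokens x key tokens); in practice
# this would come from the model rather than a random generator.
rng = np.random.default_rng(0)
attn = np.tril(rng.random((12, 12)))
attn = attn / attn.sum(axis=-1, keepdims=True)  # row-normalize like a softmax output

plt.imshow(attn, cmap="viridis")
plt.xlabel("key position")
plt.ylabel("query position")
plt.colorbar(label="attention weight")
plt.title("Toy attention heatmap")
plt.show()
```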
