Generate control vector using llama.cpp #6880
This looks to just be computing the single eigenvector associated with the dominant eigenvalue? If so, then this can be computed very easily in about 10 lines of C: https://en.m.wikipedia.org/wiki/Power_iteration If for any reason this doesn't work, you can also compute it via the SVD, which can easily be found using gradient descent: minimise the Frobenius norm of the difference between the matrix and the outer product of two vectors, then standardise the vectors to have an L2 norm of 1 (this method is often used in recommendation systems).
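The gradient-descent route mentioned above can be sketched in a few lines of C. This is an illustrative sketch, not llama.cpp code: it minimises the squared Frobenius norm of `A - u v^T` by taking alternating gradient steps on `u` and `v`, under the assumption that a plain fixed learning rate is good enough for a "nice" matrix. The function name and argument layout are made up for this example.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical sketch: rank-1 SVD via gradient descent on
 * f(u, v) = ||A - u v^T||_F^2, as suggested in the comment above.
 * A is m x n, row-major; u (length m) and v (length n) should be
 * initialised to small non-zero values. */
void rank1_svd_gd(const float *A, float *u, float *v,
                  size_t m, size_t n, int iters, float lr) {
    for (int it = 0; it < iters; ++it) {
        /* Gradient step on u: grad_u f = -2 (A - u v^T) v */
        for (size_t i = 0; i < m; ++i) {
            float g = 0.0f;
            for (size_t j = 0; j < n; ++j)
                g += (A[i * n + j] - u[i] * v[j]) * v[j];
            u[i] += lr * g;
        }
        /* Gradient step on v: grad_v f = -2 (A - u v^T)^T u */
        for (size_t j = 0; j < n; ++j) {
            float g = 0.0f;
            for (size_t i = 0; i < m; ++i)
                g += (A[i * n + j] - u[i] * v[j]) * u[i];
            v[j] += lr * g;
        }
    }
}
```

After convergence you would rescale `u` and `v` to unit L2 norm (the singular value is then the product of the original norms), exactly the standardisation step described above.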
I'm just here to comment that this would be a really useful feature! Having a non-Python way of doing this would be ideal. I am willing to pay a minimum of 100 USD for a working solution I can apply. (Sorry if that's not much, I am just a hobbyist paying out of my own pocket; I hope it's not an insultingly small amount.)
+1 on this, would be useful. I can try my hand at implementing something in a little while, but I'm new to C++ and can't guarantee anything.
@jukofyork Thanks for the info. Unfortunately I'm not very good at math, so I'm struggling to understand it. It would be nice if someone could implement something equivalent to PCA in cpp. @christianazinn FYI, this example from eval-callback could be a good start if you want to give it a try.
Power iteration is one of the simplest algorithms imaginable:

1. Multiply the current vector by the matrix.
2. Rescale the result to have an L2 norm of 1.
3. Repeat until the vector stops changing.
The only reason for step 2 is to stop it blowing up and overflowing the machine precision. If the matrix is "nice" then this will converge quite quickly: you'll find the vector hardly changes at all, and you have found your principal eigenvector (or equivalently, the first principal component).

If you try this on a 2D example you'll see how it works: it looks like the needle of a compass settling towards north after you jolt it. Step 2 above makes the tip of the needle always touch the edge of the circle with radius 1 (and in 3D+, the shell of a unit (hyper-)sphere). If you don't bother with step 2 then it will still end up pointing in the right direction, but it will just be composed of huge numbers...

The 2D example shows how the algorithm can struggle for "non-nice" matrices too: if the 1st and 2nd eigenvectors are at right angles to each other it will converge quickly, but if they are pointing in approximately the same direction, it will take much longer. In the compass analogy, this is like having a strong magnet somewhere nearby that is almost north but not quite north.

If the matrix isn't square then you need to use another technique called "Singular Value Decomposition", which is more involved, but not hard to implement if all you care about is getting a very low-rank approximation (you can just use gradient descent).

EDIT: Here's a nice video showing it in 2D: https://www.youtube.com/watch?v=wRhYfAObXzY and 3D: https://www.youtube.com/watch?v=AtmpkYYSMk4 (those aren't "nice" matrices, hence why it struggles because of the large off-diagonal values!).
@jukofyork Thanks for the direction. It's not the easiest thing for me to understand, but I'll give it a try. @Yorizuka @christianazinn I have a draft PR just to get some directions. I'm not doing this for bug-bounty motivation, so if someone can help me out, you can take the bounty if you want. Thank you.
Thanks, will move discussion there. Not doing this for bounty motivation either. Currently have PCA working-ish, but am stuck getting vector normalization to work.
Did you manage to get any further with this? |
@jukofyork FYI, I've already got a working version with good performance. Here is the last result: #7514 (comment) |
Motivation
Support for control vectors was added in #5970, but to generate the vectors, users must run Python code (which uses the Hugging Face API instead of llama.cpp).
Possible Implementation
By looking at https://github.com/vgel/repeng/blob/main/repeng/extract.py , we can see some key steps that need to be adapted to cpp:

- `batched_get_hiddens` returns a list of embedding vectors from hidden layers. This can be done in llama.cpp using the eval callback.