
Generate control vector using llama.cpp #6880

Closed
ngxson opened this issue Apr 24, 2024 · 9 comments · Fixed by #7514
Labels
enhancement New feature or request


@ngxson
Collaborator

ngxson commented Apr 24, 2024

Motivation

Support for control vectors was added in #5970, but to generate a vector, users must currently run Python code (which uses the Hugging Face API instead of llama.cpp).

Possible Implementation

By looking at https://github.com/vgel/repeng/blob/main/repeng/extract.py , we can see some key steps that need to be adapted to C++:

  • batched_get_hiddens returns a list of embedding vectors from hidden layers. This can be done in llama.cpp using the eval callback (see the sketch after this list)
  • For the moment, we can't find any lightweight implementation of Principal Component Analysis (PCA) in C++. An idea would be to replace it with this UMAP C++ implementation
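
A rough sketch (untested) of what the hidden-state capture could look like. It assumes the `ggml_backend_sched_eval_callback` signature used by the eval-callback example, and that layer outputs are named `l_out-<n>` in the compute graph; both details may differ between versions:

```cpp
#include <cstring>
#include <vector>

#include "ggml.h"
#include "ggml-backend.h"

// Collects the output tensor of each transformer layer during evaluation,
// similar to what repeng's batched_get_hiddens does in Python.
struct hidden_state_collector {
    std::vector<std::vector<float>> layer_outputs; // one flat F32 buffer per layer
};

static bool collect_hiddens_cb(struct ggml_tensor * t, bool ask, void * user_data) {
    auto * collector = (hidden_state_collector *) user_data;
    // Heuristic (assumption): layer outputs are named "l_out-<n>" in the graph.
    const bool is_layer_output = strncmp(ggml_get_name(t), "l_out", 5) == 0;
    if (ask) {
        // The scheduler asks whether we want to observe this tensor's data.
        return is_layer_output;
    }
    // Assumes F32 tensors; a real implementation would convert other types.
    std::vector<float> data(ggml_nelements(t));
    ggml_backend_tensor_get(t, data.data(), 0, ggml_nbytes(t));
    collector->layer_outputs.push_back(std::move(data));
    return true; // continue graph execution
}

// Attached via llama_context_params before creating the context, e.g.:
//   cparams.cb_eval           = collect_hiddens_cb;
//   cparams.cb_eval_user_data = &collector;
```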
ngxson added the enhancement label Apr 24, 2024
@jukofyork
Contributor

* For the moment, we can't find any lightweight implementation of Principal Component Analysis (PCA) in C++. An idea would be to replace it with [this UMAP C++ implementation](https://github.com/LTLA/umappp)
pca_model = PCA(n_components=1, whiten=False).fit(train)

Parameters:

    n_components : int, float or 'mle', default=None

        Number of components to keep. If n_components is not set, all components are kept.

This looks like it's just computing the single eigenvector associated with the dominant eigenvalue? If so, it can be computed very easily in about 10 lines of C:

https://en.m.wikipedia.org/wiki/Power_iteration
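
For example, a minimal (untested) sketch of power iteration for a square matrix; the names and the fixed iteration count are just illustrative:

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

// Estimates the dominant eigenvector of a square n x n matrix W (row-major)
// by repeatedly normalizing v and multiplying it by W.
std::vector<float> power_iteration(const std::vector<float> & W, int n, int iters = 100) {
    std::vector<float> v(n), Wv(n);
    for (int i = 0; i < n; i++) {
        v[i] = (float) rand() / RAND_MAX; // random starting vector
    }
    for (int it = 0; it < iters; it++) {
        // normalize v to unit L2 norm so the values don't blow up
        float norm = 0.0f;
        for (int i = 0; i < n; i++) norm += v[i] * v[i];
        norm = std::sqrt(norm);
        for (int i = 0; i < n; i++) v[i] /= norm;
        // v = W . v_norm
        for (int i = 0; i < n; i++) {
            Wv[i] = 0.0f;
            for (int j = 0; j < n; j++) Wv[i] += W[i * n + j] * v[j];
        }
        v.swap(Wv);
    }
    return v; // converges towards the principal eigenvector for "nice" matrices
}
```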

If for any reason this doesn't work, you can also compute it via the SVD, which can easily be found using gradient descent: minimise the Frobenius norm of the difference between the matrix and the outer product of two vectors, then standardise the vectors to have an L2 norm of 1 (this method is often used in recommendation systems).
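
Something like this (again untested, just a sketch; the learning rate and iteration count are arbitrary):

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

// Finds the best rank-1 approximation W ~ u * v^T of an m x n row-major
// matrix W by stochastic gradient descent on the squared Frobenius norm of
// the residual, then rescales u and v to unit L2 norm.
void rank1_approx(const std::vector<float> & W, int m, int n,
                  std::vector<float> & u, std::vector<float> & v,
                  float lr = 0.01f, int iters = 1000) {
    u.resize(m); v.resize(n);
    for (int i = 0; i < m; i++) u[i] = (float) rand() / RAND_MAX;
    for (int j = 0; j < n; j++) v[j] = (float) rand() / RAND_MAX;
    for (int it = 0; it < iters; it++) {
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                const float ui = u[i];
                const float r  = W[i * n + j] - ui * v[j]; // residual of the rank-1 fit
                u[i] += lr * r * v[j];                     // gradient step on u
                v[j] += lr * r * ui;                       // gradient step on v
            }
        }
    }
    auto normalize = [](std::vector<float> & x) {          // standardise to L2 norm of 1
        float s = 0.0f;
        for (float xi : x) s += xi * xi;
        s = std::sqrt(s);
        for (float & xi : x) xi /= s;
    };
    normalize(u);
    normalize(v);
}
```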

@Yorizuka

Yorizuka commented May 17, 2024

I'm just here to comment that this would be a really useful feature! Having a non-Python way of doing this would be ideal.
I am willing to put out a small bounty on this if that will motivate someone to do it!

I am willing to pay a minimum of 100 USD for a working solution I can apply. (Sorry if that's not much, I am just a hobbyist paying out of my own pocket; I hope it's not an insultingly small amount.)

@christianazinn
Contributor

+1 on this, would be useful. I can try my hand at implementing something in a little while, but I'm new to C++ and can't guarantee anything.

@ngxson
Collaborator Author

ngxson commented May 22, 2024

@jukofyork Thanks for the info. Unfortunately I'm not very good at math, so I struggle to understand it. It would be nice if someone could implement something equivalent to PCA in C++.

@christianazinn FYI, this example from eval-callback can be a good start if you want to give it a try.

@jukofyork
Contributor

jukofyork commented May 22, 2024

@jukofyork Thanks for the info. Unfortunately I'm not very good at math, so I struggle to understand it. It would be nice if someone could implement something equivalent to PCA in C++.

Power Iteration is one of the simplest algorithms imaginable:

  1. Pick a random vector, v.
  2. Divide all the elements of v by the square root of their sum of squared values (i.e. normalise v to unit L2 norm) to create v_norm.
  3. Multiply this vector by the target matrix, v = W . v_norm.
  4. Goto step 2.
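
Equivalently, in equation form, steps 2-4 just iterate:

$$v_{k+1} = \frac{W\, v_k}{\lVert W\, v_k \rVert_2}$$

and $v_k$ converges to the eigenvector of $W$ with the largest-magnitude eigenvalue.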

The only reason for step 2 is to stop it blowing up and overflowing the machine precision.

If the matrix is "nice" then this will converge quite quickly: you'll find the vector hardly changes at all, and you have found your principal eigenvector (or equivalently, the first principal component).

If you try this on a 2D example you'll see how it works: it looks like the needle of a compass settling towards north after you jolt it. Step 2 above makes the tip of the needle always touch the edge of the circle with radius 1 (and in 3D+, the shell of a unit (hyper-)sphere). If you don't bother with step 2, it will still end up pointing in the right direction, but will be composed of huge numbers.

The 2D example also shows how the algorithm can struggle with "non-nice" matrices: if the 1st and 2nd eigenvectors are at right angles to each other it will converge quickly, but if they point in approximately the same direction, it will take much longer. In the compass analogy, this would be like having a strong magnet nearby that is almost, but not quite, north.

If the matrix isn't square then you need to use another technique called "Singular Value Decomposition" (SVD), which is more involved, but not hard to implement if all you care about is getting a very low-rank approximation (you can just use gradient descent, as in the sketch above).

EDIT: Here's a nice video showing it in 2D: https://www.youtube.com/watch?v=wRhYfAObXzY and 3D: https://www.youtube.com/watch?v=AtmpkYYSMk4

(Those aren't "nice" matrices, which is why it struggles: the large off-diagonal values!)

@ngxson
Collaborator Author

ngxson commented May 24, 2024

@jukofyork Thanks for the direction. It's not the easiest thing for me to understand, but I'll give it a try.

@Yorizuka @christianazinn I have a draft PR just to get some direction. I'm not doing this for the bounty, so if someone can help me out, they can take the bounty if they want. Thank you.

@christianazinn
Contributor

I have a draft PR just to get some direction. I'm not doing this for the bounty, so if someone can help me out, they can take the bounty if they want. Thank you.

Thanks, will move discussion there. Not doing this for the bounty either. Currently have PCA working-ish, but I'm stuck getting vector normalization to work (ggml_norm isn't working how I expect it to, and there are no docs :/).
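
For what it's worth, ggml_norm appears to compute a layernorm-style normalization (zero mean and unit variance along each row), not an L2 normalization, which might be the mismatch. A hedged sketch of building unit-L2 row normalization out of ggml primitives, assuming ggml_div can broadcast its second argument (as in recent ggml versions):

```cpp
#include "ggml.h"

// Normalizes each row of v to unit L2 norm: v / sqrt(sum(v^2)).
// Note: ggml_norm is layernorm-style (zero mean / unit variance), so the
// L2 norm is built here from ggml_sqr / ggml_sum_rows / ggml_sqrt / ggml_div.
static struct ggml_tensor * l2_normalize_rows(struct ggml_context * ctx, struct ggml_tensor * v) {
    struct ggml_tensor * sq  = ggml_sqr(ctx, v);       // element-wise v^2
    struct ggml_tensor * ssq = ggml_sum_rows(ctx, sq); // sum of squares per row
    struct ggml_tensor * nrm = ggml_sqrt(ctx, ssq);    // L2 norm per row
    return ggml_div(ctx, v, nrm);                      // broadcast divide each row by its norm
}
```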

@jukofyork
Contributor

Did you manage to get any further with this?

@ngxson
Collaborator Author

ngxson commented Jun 11, 2024

@jukofyork FYI, I've already got a working version with good performance. Here is the last result: #7514 (comment)
