Question: How to use Stanford HELM for local model setup? #1858
Comments
That is correct.
Yes, you will eventually be able to change the model name to whatever you want, and also add multiple model names. Support will be added in #1861, which will be merged in the next couple of days. If you'd like to try it out first, you can use that branch.
Currently it only requires the model to be at
Multiple LLMs will be supported by #1861.
Yes, we support a few other ways. The most relevant ones are:
Currently HELM requires an internet connection, but we do want to support running it without one eventually. If this is a high-priority issue for you, please file new issues for any problems you find.
Thank you so much @yifanmai for the descriptive answers. This is really helpful. I will file an issue for the offline setup if it becomes a priority for us as well.
Hello @yifanmai, I came across the response where you mentioned:
I have checked the documentation at https://crfm-helm.readthedocs.io/en/latest/huggingface_models/ on evaluating a Hugging Face checkpoint. However, I wanted to confirm whether there are any additional instructions or recommendations beyond what is covered in that documentation. I would greatly appreciate any further guidance on this matter. Thank you for your assistance and the excellent work on HELM!
Regarding the Hugging Face Hub model integration, the documentation should cover all of the main functionality. Additionally, there is a hidden experimental flag
@yifanmai Also, what are the expected outputs (by HELM) when it makes calls to the /process and /tokenize endpoints? I tried to infer this by looking at the code, but the server-side code currently looks slightly buggy (issue: llm-efficiency-challenge/neurips_llm_efficiency_challenge#47). cc @drisspg
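For reference, the request/response shapes I am currently assuming for those two endpoints look roughly like the sketch below. The field names come from my reading of the challenge's sample submission and are an assumption on my part, not something confirmed by HELM:

```python
# Assumed request/response shapes for the /tokenize and /process endpoints.
# Field names follow my reading of the sample submission; they are not verified against HELM.
from typing import Dict, List
from pydantic import BaseModel


class ProcessRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 100
    temperature: float = 1.0
    echo_prompt: bool = False


class Token(BaseModel):
    text: str
    logprob: float
    top_logprob: Dict[str, float]


class ProcessResponse(BaseModel):
    text: str            # generated completion
    tokens: List[Token]  # per-token text and log probabilities
    logprob: float       # sum of the token log probabilities
    request_time: float


class TokenizeRequest(BaseModel):
    text: str
    truncation: bool = True
    max_length: int = 2048


class TokenizeResponse(BaseModel):
    tokens: List[int]
    request_time: float
```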
Prologue
Here is my understanding of the high-level architecture of HELM (please correct me if I am wrong). HELM acts as a client that sends text (prompts from the benchmark dataset) to some server (Hugging Face, OpenAI, etc.), and the string returned to the client is then evaluated by HELM. The models we are interested in are written down in the spec file. For a local model, we provide `neurips/local` so that HELM knows we are using a local model.
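To make that mental model concrete, here is a rough, purely illustrative sketch of the loop I imagine HELM running. None of these names are real HELM APIs; the URL, payload, and metric are placeholders I made up:

```python
# Purely illustrative sketch of my mental model of HELM as a client.
# The URL, payload, and metric below are placeholders, not actual HELM internals.
import requests

SERVER_URL = "http://localhost:8080/process"  # hypothetical local model server


def query_model(prompt: str) -> str:
    """Send one benchmark prompt to the model server and return its completion."""
    response = requests.post(SERVER_URL, json={"prompt": prompt}, timeout=60)
    response.raise_for_status()
    return response.json()["text"]


def exact_match(prediction: str, reference: str) -> float:
    """Toy stand-in for HELM's real metrics."""
    return float(prediction.strip() == reference.strip())


def evaluate(instances: list[tuple[str, str]]) -> float:
    """Run every (prompt, reference) pair through the server and average the metric."""
    scores = [exact_match(query_model(prompt), reference) for prompt, reference in instances]
    return sum(scores) / len(scores)
```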
NeurIPS Efficiency Challenge
The NeurIPS LLM Efficiency Challenge has recently become very popular. There are implementations in the sample submission folder by Lightning AI and by llama-recipes; I am using Lightning AI's implementation in this case. They implemented a simple FastAPI server with two endpoints, `/tokenize` and `/process`.
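For illustration, a heavily stripped-down version of such a server might look like the following. This is my own minimal stub, not the Lightning AI code; the tokenizer and model calls are faked, and the port is my assumption:

```python
# Minimal sketch of a FastAPI server exposing /tokenize and /process.
# This is not the Lightning AI submission: the tokenizer and model are stubbed out.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class TokenizeRequest(BaseModel):
    text: str


class ProcessRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 100


@app.post("/tokenize")
def tokenize(request: TokenizeRequest) -> dict:
    # Replace with a real tokenizer, e.g. one loaded from Hugging Face transformers.
    fake_token_ids = [ord(character) for character in request.text]
    return {"tokens": fake_token_ids}


@app.post("/process")
def process(request: ProcessRequest) -> dict:
    # Replace with a real model.generate(...) call.
    completion = "stubbed completion for: " + request.prompt[:20]
    return {"text": completion}

# Run with, e.g.: uvicorn server:app --port 8080
```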
My questions
1. Instead of `neurips/local`, can I do it through `<myusername>/local`? I skimmed through the documentation, and it seems like we can add more models, but this was not clear to me.
2. Once `neurips/local` is set, how does HELM know that a localhost server has been set up and that it needs to send the requests there? Does it look for some specific port?

The HELM package is super helpful and an awesome initiative by Stanford CRFM for evaluating LLMs with such a huge number of scenarios, benchmarks, and metrics.