v0.4.0-rc2 Android Demo App for Llama crashing on Pixel 8 when first prompt submitted #5582
Comments
Hi @grisaitis, may I ask which phone you are using and how much RAM it has? It's not related to the app/JNI, but to the runtime itself.
Sure - Google Pixel 8, with 8GB. Confirmed from my system settings. Thanks so much for helping troubleshoot this! This library is amazing.
@grisaitis Thank you for the info! Do you have any
Looks like it is a low-memory issue, but I'm not sure. These are the
Any additional tips appreciated!
PS: here are all the logcat messages. I asked Claude to analyze the output; it suggested the error was originating in the
Any additional tips appreciated. Perhaps I should build the ExecuTorch library with logging enabled? E.g. with
Thank you! Seems like a low-memory issue. For logging, please
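For reference, a rebuild with logging enabled might look like the sketch below. The option name `EXECUTORCH_ENABLE_LOGGING`, the toolchain file path, and the `EXECUTORCH_ROOT` variable are all assumptions recalled from the ExecuTorch build docs, not confirmed in this thread; verify them against your checkout's CMakeLists before running.

```shell
# Sketch (assumptions): rebuild the ExecuTorch native library for Android
# with runtime logging on, so ET_LOG messages appear in logcat.
# Guarded so this is a no-op when no ExecuTorch checkout is present.
if [ -d "${EXECUTORCH_ROOT:-}" ] && [ -n "${ANDROID_NDK:-}" ]; then
  cmake "$EXECUTORCH_ROOT" -B cmake-android-out \
    -DCMAKE_TOOLCHAIN_FILE="$ANDROID_NDK/build/cmake/android.toolchain.cmake" \
    -DANDROID_ABI=arm64-v8a \
    -DCMAKE_BUILD_TYPE=Debug \
    -DEXECUTORCH_ENABLE_LOGGING=ON   # assumed option name; check CMakeLists
  cmake --build cmake-android-out -j8
fi
```

After rebuilding, `adb logcat` should show the runtime's own log lines, which is usually enough to confirm or rule out an out-of-memory failure.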
@grisaitis The good news is you can try with the new Llama 3.2 1B model: https://github.com/pytorch/executorch/pull/5640/files#diff-9a76949fefe609e9fcad45db3f4c6cb18d7913054d0122f672b30f523cadbc78R70
Thanks! I'll try this. Clarification: should I simply do the edit like this, removing the line for
I will probably have to make these changes not in

As for other models, I downloaded Llama 3 8B and quantized it. I also started downloading 3.1 8B and 3.2 3B; I will try those out later if Llama 3 8B doesn't work.
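The export-and-quantize step mentioned above can be sketched roughly as follows. The flag names follow the v0.4.0-era Llama tutorial from memory and should be treated as assumptions; `CHECKPOINT` and `PARAMS` are hypothetical paths to the downloaded weights. Check `examples/models/llama2` in your checkout before running.

```shell
# Sketch (assumptions): export Llama 3 8B to a .pte with 8da4w quantization
# (8-bit dynamic activations, 4-bit weights), KV cache, and XNNPACK delegation.
# CHECKPOINT: consolidated.00.pth from the Llama download (hypothetical path)
# PARAMS: the params.json that ships next to it
if [ -f "${CHECKPOINT:-}" ] && [ -f "${PARAMS:-}" ]; then
  python -m examples.models.llama2.export_llama \
    --checkpoint "$CHECKPOINT" \
    --params "$PARAMS" \
    -kv --use_sdpa_with_kv_cache -X \
    -qmode 8da4w --group_size 128 -d fp32 \
    --output_name "llama3_8b_8da4w.pte"
fi
```

Even with 4-bit weights, an 8B model's working set is large; on an 8GB phone the smaller 3.2 1B/3B exports are the safer starting point, which matches the suggestion below.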
That's correct. Remove

Which env variables are missing? If it's related to copying artifacts for upload, please disregard :) I will clean it up later.
Maybe try 3.2 1B? It has a lower RAM requirement.
Thanks! I did this, along with the diff
I then rebuilt the app and tried loading Llama 3 8B again. Goal: see if memory runs out, according to the ExecuTorch logs.
Interestingly, it no longer crashes. Rather, when I enter a prompt like "cheese", it responds with an empty string. Screenshot: https://github.com/user-attachments/assets/66ef6138-d217-4197-b542-d3f600ba8ef5

Here is the logcat output from the seconds before and after entering the prompt "cheese". There appears to be some issue with loading the tokenizer. I built this with

Relatedly, however, I see that my settings for "model" and "tokenizer" are still from when I ran Llama 2 7B. I have no way of clearing the "tokenizer" setting. Maybe this is causing a problem? Do I need to also push

Screenshot: https://github.com/user-attachments/assets/814a5b46-0e70-4050-a7b5-215ddf2c2ccd
Thanks for the update @grisaitis. Yes, you need to push the tokenizer file and select it. Please make sure to use the Llama 3 tokenizer (tiktoken) for a Llama 3 model.
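Pushing both artifacts might look like the sketch below. The device directory `/data/local/tmp/llama` and the file names are assumptions taken from the tutorial's conventions, not confirmed here; adjust them to wherever the app expects its files.

```shell
# Sketch (assumptions): push the exported model and the matching tiktoken
# tokenizer to the device path the demo app reads from, then select both
# in the app's settings screen.
MODEL=llama3_2_1b.pte        # hypothetical exported model file name
TOKENIZER=tokenizer.model    # Llama 3's tiktoken tokenizer file
if command -v adb >/dev/null 2>&1; then
  adb shell mkdir -p /data/local/tmp/llama
  adb push "$MODEL" /data/local/tmp/llama/
  adb push "$TOKENIZER" /data/local/tmp/llama/
fi
```

A stale tokenizer selection left over from an earlier model (as described above with Llama 2 7B) would explain empty responses: the runtime decodes tokens with the wrong vocabulary, so re-selecting the freshly pushed tiktoken file in the app is the fix.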
Thanks so much for your help! Sorry for not following up sooner. I indeed got the demo to work with Llama 3.2 3B. It works beautifully. My only feedback is to modify the documentation so that, also in the Android section (or perhaps before anything platform-specific?), it is mentioned to rename
In the latest (lines 345 to 346 in c48d867):
🐛 Describe the bug
I find that when I submit my first prompt on the demo app, the app crashes.
See this screen recording (screen-20240924-174804.mp4): I type "hello" and press the enter button.
This is the logcat output:
Any idea why this is happening?
Versions
PyTorch version: 2.5.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 14.4.1 (arm64)
GCC version: Could not collect
Clang version: 15.0.0 (clang-1500.3.9.4)
CMake version: version 3.30.3
Libc version: N/A
Python version: 3.10.0 (default, Mar 3 2022, 03:54:28) [Clang 12.0.0 ] (64-bit runtime)
Python platform: macOS-14.4.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Apple M1
Versions of relevant libraries:
[pip3] executorch==0.4.0a0+759e0c8
[pip3] numpy==1.26.4
[pip3] torch==2.5.0
[pip3] torchaudio==2.5.0
[pip3] torchsr==1.0.4
[pip3] torchvision==0.20.0
[conda] executorch 0.4.0a0+759e0c8 pypi_0 pypi
[conda] numpy 1.26.4 pypi_0 pypi
[conda] torch 2.5.0 pypi_0 pypi
[conda] torchaudio 2.5.0 pypi_0 pypi
[conda] torchsr 1.0.4 pypi_0 pypi
[conda] torchvision 0.20.0 pypi_0 pypi
Build settings for the app
I'm following these tutorial pages (note: these are from the v0.4.0-rc2 release).