
Repeated Junk output generated using gemma-2b-it-gpu-int4.bin on Mobile device #5534

Open
KosuriSireesha opened this issue Jul 17, 2024 · 5 comments
Assignees
Labels
platform:android (Issues with Android as Platform) · stat:awaiting googler (Waiting for Google Engineer's Response) · task:LLM inference (Issues related to MediaPipe LLM Inference Gen AI setup) · type:bug (Bug in the Source Code of MediaPipe Solution)

Comments

@KosuriSireesha

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

No

OS Platform and Distribution

Android 14

Mobile device if the issue happens on mobile device

Android Mobile device (Motorola edge 50 Ultra)

Browser and version if the issue happens on browser

No response

Programming Language and version

Kotlin

MediaPipe version

0.10.14

Bazel version

No response

Solution

LLMInference

Android Studio, NDK, SDK versions (if issue is related to building in Android environment)

No response

Xcode & Tulsi version (if issue is related to building for iOS)

No response

Describe the actual behavior

The Gemma GPU model produces junk output from the second query onward.

Describe the expected behaviour

The model should return a relevant response to each user query.

Standalone code/steps you may have used to try to get what you need

Tested on an Android mobile device, following the steps at https://ai.google.dev/edge/mediapipe/solutions/genai/llm_inference/android:

1. Used the LlmInference example app from https://github.com/googlesamples/mediapipe.
2. Added the Maven dependency in the build.gradle (https://mvnrepository.com/artifact/com.google.mediapipe/tasks-genai):

    dependencies {
        implementation 'com.google.mediapipe:tasks-genai:0.10.14'
    }

3. GPU model used in the LlmInference example: gemma-2b-it-gpu-int4.bin (downloaded from https://www.kaggle.com/models/google/gemma/tfLite).
4. Ran the LlmInference app on the mobile device (a sketch of the task setup follows these steps).
5. Entered a query; the model returned a response.
6. Entered a second query, either related to the context of the first query or unrelated: continuous junk responses stream without `done` ever being sent, and the app has to be restarted before it can be used again.

I get a response only for the first query; from the second query onward the output is junk.
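For reference, here is a minimal sketch of the task setup the sample app uses, assuming the Kotlin API documented for tasks-genai 0.10.14 (options builder with setResultListener, plus generateResponseAsync); the model path and token limit are placeholder values, not the exact sample code:

    import android.content.Context
    import com.google.mediapipe.tasks.genai.llminference.LlmInference

    // Minimal sketch (assumed values) of the LlmInference setup.
    fun createLlmInference(context: Context): LlmInference {
        val options = LlmInference.LlmInferenceOptions.builder()
            .setModelPath("/data/local/tmp/llm/gemma-2b-it-gpu-int4.bin") // placeholder path
            .setMaxTokens(1024) // placeholder limit
            .setResultListener { partialResult, done ->
                // Partial results stream here; `done` flips to true when the
                // response completes. In this bug, after the first query every
                // partial result is repeated junk and `done` never arrives.
            }
            .build()
        return LlmInference.createFromOptions(context, options)
    }

    // Each query is submitted asynchronously; results arrive on the listener.
    fun ask(llm: LlmInference, prompt: String) {
        llm.generateResponseAsync(prompt)
    }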

Other info / Complete Logs

Output of the partial results from LlmInference (a hypothetical sketch of the listener emitting these lines follows the excerpt):

07-15 16:42:41.352 24038 24072 D ChatViewModel: partialresult first message: катеринакатеринакатерина
07-15 16:42:41.455 24038 24072 D ChatViewModel: partialresult append message: катеринакатеринакатерина
07-15 16:42:41.565 24038 24071 D ChatViewModel: partialresult append message: катеринакатеринакатерина
07-15 16:42:41.666 24038 24073 D ChatViewModel: partialresult append message: катеринакатеринакатерина
07-15 16:42:41.771 24038 24308 D ChatViewModel: partialresult append message: катеринакатеринакатерина
07-15 16:42:41.873 24038 24073 D ChatViewModel: partialresult append message: катеринакатеринакатерина
07-15 16:42:41.977 24038 24072 D ChatViewModel: partialresult append message: катеринакатеринакатерина
07-15 16:42:42.081 24038 24071 D ChatViewModel: partialresult append message: катеринакатеринакатерина
07-15 16:42:42.186 24038 24071 D ChatViewModel: partialresult append message: катеринакатеринакатерина
07-15 16:42:42.286 24038 24073 D ChatViewModel: partialresult append message: катеринакатеринакатерина
07-15 16:42:42.387 24038 24071 D ChatViewModel: partialresult append message: катеринакатеринакатерина
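For context, these lines come from the sample app's ChatViewModel. A hypothetical reconstruction of the logging that would produce them (the tag and message text are taken from the log above; the surrounding function is assumed, not the actual sample code):

    import android.util.Log

    private const val TAG = "ChatViewModel"

    // Hypothetical listener body matching the log lines above.
    fun logPartialResult(partialResult: String, isFirstChunk: Boolean) {
        if (isFirstChunk) {
            Log.d(TAG, "partialresult first message: $partialResult")
        } else {
            Log.d(TAG, "partialresult append message: $partialResult")
        }
    }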
@KosuriSireesha KosuriSireesha added the type:bug Bug in the Source Code of MediaPipe Solution label Jul 17, 2024
@kuaashish
Collaborator

Hi @KosuriSireesha,

Could you please try running the sample app from here: https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/llm_inference/android and let us know if you encounter the same issue there as well?

Thank you!!

@kuaashish kuaashish added platform:android Issues with Android as Platform task:LLM inference Issues related to MediaPipe LLM Inference Gen AI setup stat:awaiting response Waiting for user response labels Jul 17, 2024
@KosuriSireesha
Author

Hi @kuaashish,
Yes, I tested the sample app from https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/llm_inference/android.

The issue is still reproducible.

Note: on com.google.mediapipe:tasks-genai:0.10.11 the GPU model works fine; it fails on version 0.10.14 (pinned dependency shown below).
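As a stopgap, pinning the dependency back to the last known-good version (the same block as in the report above, with only the version changed) avoids the regression:

    dependencies {
        // Workaround: the GPU model works on 0.10.11; 0.10.14 shows the
        // repeated-junk regression described in this issue.
        implementation 'com.google.mediapipe:tasks-genai:0.10.11'
    }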

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Waiting for user response label Jul 17, 2024

@KosuriSireesha
Author

Reopened the issue.

@kuaashish
Collaborator

Hi @PaulTR,

Could you please look into this issue? We currently do not have a real device to reproduce it, and we are not sure whether the issue is specific to a particular device. Based on the input received, the sample app appears to behave the same way.

Thank you!!

@kuaashish kuaashish assigned PaulTR and unassigned kuaashish Jul 18, 2024
@kuaashish kuaashish added the stat:awaiting googler Waiting for Google Engineer's Response label Jul 18, 2024