[beam search] add output for manually checking the correctness #8684

Merged 2 commits on Sep 21, 2024
13 changes: 10 additions & 3 deletions tests/samplers/test_beam_search.py
@@ -11,7 +11,7 @@
 # 3. Use the model "huggyllama/llama-7b".
 MAX_TOKENS = [128]
 BEAM_WIDTHS = [4]
-MODELS = ["facebook/opt-125m"]
Member Author:
I find that facebook/opt-125m only repeats itself. The output of TinyLlama/TinyLlama-1.1B-Chat-v1.0 is better, although it is still not very sensible.

[2024-09-21T01:57:27Z] >>>0-th hf output:
[2024-09-21T01:57:27Z] vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 2. OpenAI GPT-3: OpenAI GPT-3 is a language model pre-trained on a large corpus of text. It can generate human-like text and has been used in various applications such as chatbots, text generation, and translation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 3. GPT-Neo: GPT-Neo is an improved version of GPT-3 that has been trained on a larger corpus of text. It has been used in various applications such as chatbots, text generation, and translation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 4. GPT-J: GPT-J is a
[2024-09-21T01:57:27Z] >>>0-th vllm output:
[2024-09-21T01:57:27Z] vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 2. OpenAI GPT-3: OpenAI GPT-3 is a language model pre-trained on a large corpus of text. It can generate human-like text and has been used in various applications such as chatbots, text generation, and translation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 3. GPT-Neo: GPT-Neo is an improved version of GPT-3 that has been trained on a larger corpus of text. It has been used in various applications such as chatbots, text generation, and translation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 4. GPT-J: GPT-J is a
[2024-09-21T01:57:27Z] >>>1-th hf output:
[2024-09-21T01:57:27Z] vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 2. OpenAI GPT-3: OpenAI GPT-3 is a language model pre-trained on a large corpus of text. It can generate human-like text and has been used in various applications such as chatbots, text generation, and translation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 3. GPT-Neo: GPT-Neo is an improved version of GPT-3 that has been trained on a larger corpus of text. It has been used in various applications such as chatbots, text generation, and translation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 4. GPT-J: GPT-J is an
[2024-09-21T01:57:27Z] >>>1-th vllm output:
[2024-09-21T01:57:27Z] vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 2. OpenAI GPT-3: OpenAI GPT-3 is a language model pre-trained on a large corpus of text. It can generate human-like text and has been used in various applications such as chatbots, text generation, and translation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 3. GPT-Neo: GPT-Neo is an improved version of GPT-3 that has been trained on a larger corpus of text. It has been used in various applications such as chatbots, text generation, and translation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 4. GPT-J: GPT-J is an
[2024-09-21T01:57:27Z] >>>2-th hf output:
[2024-09-21T01:57:27Z] vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 2. OpenAI GPT-3: OpenAI GPT-3 is a language model pre-trained on a large corpus of text. It can generate human-like text and has been used in various applications, such as chatbots, language translation, and text generation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 3. GPT-Neo: GPT-Neo is an improved version of GPT-3 that has been trained on a larger corpus of text. It has been used in various applications, such as chatbots, language translation, and text generation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 4. GPT-J: GPT
[2024-09-21T01:57:27Z] >>>2-th vllm output:
[2024-09-21T01:57:27Z] vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 2. OpenAI GPT-3: OpenAI GPT-3 is a language model pre-trained on a large corpus of text. It can generate human-like text and has been used in various applications, such as chatbots, language translation, and text generation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 3. GPT-Neo: GPT-Neo is an improved version of GPT-3 that has been trained on a larger corpus of text. It has been used in various applications, such as chatbots, language translation, and text generation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 4. GPT-J: GPT
[2024-09-21T01:57:27Z] >>>3-th hf output:
[2024-09-21T01:57:27Z] vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 2. OpenAI GPT-3: OpenAI GPT-3 is a language model pre-trained on a large corpus of text. It can generate human-like text and has been used in various applications such as chatbots, text generation, and translation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 3. GPT-Neo: GPT-Neo is an improved version of GPT-3 that has been trained on a larger corpus of text. It has been used in various applications such as chatbots, text generation, and translation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 4. T5: T5 is a language model pre-
[2024-09-21T01:57:27Z] >>>3-th vllm output:
[2024-09-21T01:57:27Z] vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 2. OpenAI GPT-3: OpenAI GPT-3 is a language model pre-trained on a large corpus of text. It can generate human-like text and has been used in various applications such as chatbots, text generation, and translation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 3. GPT-Neo: GPT-Neo is an improved version of GPT-3 that has been trained on a larger corpus of text. It has been used in various applications such as chatbots, text generation, and translation.
[2024-09-21T01:57:27Z]
[2024-09-21T01:57:27Z] 4. T5: T5 is a language model pre-
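For reference, here is a minimal standalone sketch (not the test's actual harness, which drives both engines through the repository's test fixtures) of how the HF side of such output can be reproduced for manual inspection with the plain transformers API. The prompt and decoding settings below are assumptions for illustration only:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model and prompt for illustration; the test takes its prompts from fixtures.
model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs."
inputs = tokenizer(prompt, return_tensors="pt")

# Beam search mirroring BEAM_WIDTHS = [4] and MAX_TOKENS = [128] from the test.
outputs = model.generate(
    **inputs,
    num_beams=4,
    num_return_sequences=4,
    max_new_tokens=128,
    do_sample=False,
)

# Print each beam in the same ">>>i-th hf output:" style used by the test.
for i, text in enumerate(tokenizer.batch_decode(outputs, skip_special_tokens=True)):
    print(f">>>{i}-th hf output:")
    print(text)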


MODELS = ["TinyLlama/TinyLlama-1.1B-Chat-v1.0"]


@pytest.mark.parametrize("model", MODELS)
@@ -37,8 +37,15 @@ def test_beam_search_single_input(
                                                       beam_width, max_tokens)

     for i in range(len(example_prompts)):
-        hf_output_ids, _ = hf_outputs[i]
-        vllm_output_ids, _ = vllm_outputs[i]
+        hf_output_ids, hf_output_texts = hf_outputs[i]
+        vllm_output_ids, vllm_output_texts = vllm_outputs[i]
+        for i, (hf_text,
+                vllm_text) in enumerate(zip(hf_output_texts,
+                                            vllm_output_texts)):
+            print(f">>>{i}-th hf output:")
+            print(hf_text)
+            print(f">>>{i}-th vllm output:")
+            print(vllm_text)
         assert len(hf_output_ids) == len(vllm_output_ids)
         for j in range(len(hf_output_ids)):
             assert hf_output_ids[j] == vllm_output_ids[j], (
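A brief usage note: pytest captures stdout by default, so the new print statements are only visible when capture is disabled, or in the captured output pytest shows for a failing test. For example:

pytest -s tests/samplers/test_beam_search.py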