Commit b6cfe69
1 parent: a646a31
Co-authored-by: Mikyo King <mikyo@arize.com>
Showing 5 changed files with 157 additions and 22 deletions.
```diff
@@ -11,6 +11,7 @@
   "instrumentator",
   "Instrumentor",
   "langchain",
+  "litellm",
   "llms",
   "nextjs",
   "openinference",
```
...entation/openinference-instrumentation-litellm/examples/litellm-proxy/README.md (63 additions, 0 deletions)
# LiteLLM Proxy Server

Use [LiteLLM Proxy](https://docs.litellm.ai/docs/simple_proxy) to log OpenAI, Azure, Vertex, and Bedrock calls (100+ LLMs) to Arize.

Use LiteLLM Proxy for:

- Calling 100+ LLMs (OpenAI, Azure, Vertex, Bedrock, etc.) in the OpenAI ChatCompletions & Completions format
- Automatically logging all requests to Arize AI
- Providing a central self-hosted server for calling LLMs and logging to Arize

## Step 1. Create a config for the LiteLLM proxy

LiteLLM requires a config with all your models defined. We will call this file `litellm_config.yaml`.

[Detailed docs on how to set up the LiteLLM config are here.](https://docs.litellm.ai/docs/proxy/configs)

```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/

litellm_settings:
  success_callback: ["arize"] # 👈 Set Arize AI as a callback

environment_variables: # 👈 Set Arize AI env vars
  ARIZE_SPACE_KEY: "d0*****"
  ARIZE_API_KEY: "141a****"
```
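The same config can also be generated from Python, which is convenient when scripting setup. A minimal sketch, not part of this commit, assuming PyYAML is installed:

```python
# Hypothetical helper: write litellm_config.yaml from Python.
# The keys mirror the YAML shown above.
import yaml

config = {
    "model_list": [
        {
            "model_name": "gpt-4",
            "litellm_params": {
                "model": "openai/fake",
                "api_key": "fake-key",
                "api_base": "https://exampleopenaiendpoint-production.up.railway.app/",
            },
        }
    ],
    "litellm_settings": {"success_callback": ["arize"]},
    "environment_variables": {
        "ARIZE_SPACE_KEY": "d0*****",
        "ARIZE_API_KEY": "141a****",
    },
}

with open("litellm_config.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)
```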
## Step 2. Start the LiteLLM proxy

```shell
docker run \
    -v $(pwd)/litellm_config.yaml:/app/config.yaml \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-latest \
    --config /app/config.yaml --detailed_debug
```
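Before testing, you can check that the proxy came up. A quick sketch, assuming the proxy exposes the OpenAI-compatible `/v1/models` endpoint on `localhost:4000` and that `requests` is installed:

```python
# Hypothetical liveness check against the proxy started in Step 2.
import requests

resp = requests.get(
    "http://localhost:4000/v1/models",
    headers={"Authorization": "Bearer sk-1234"},  # any token works if no master key is set
)
print(resp.status_code, resp.json())
```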
## Step 3. Test it: make a `/chat/completions` request to the LiteLLM proxy

```shell
curl -i http://localhost:4000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer sk-1234" \
    -d '{
        "model": "gpt-4",
        "messages": [
            {"role": "user", "content": "Hello, Claude gm!"}
        ]
    }'
```
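Because the proxy speaks the OpenAI API, the same request works from the OpenAI Python SDK. A minimal sketch, assuming `openai>=1.0` is installed and the proxy from Step 2 is running locally:

```python
# Point the OpenAI client at the LiteLLM proxy instead of api.openai.com.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000/v1", api_key="sk-1234")

response = client.chat.completions.create(
    model="gpt-4",  # routed by the proxy according to litellm_config.yaml
    messages=[{"role": "user", "content": "Hello, Claude gm!"}],
)
print(response.choices[0].message.content)
```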
## Expected output on Arize AI

<img width="1283" alt="Xnapper-2024-07-23-17 07 34" src="https://github.com/user-attachments/assets/7460bc2b-7f4f-4ec4-b966-2bf33a26ded5">

## Additional Resources

- [LiteLLM Arize AI docs](https://docs.litellm.ai/docs/observability/arize_integration)
python/instrumentation/openinference-instrumentation-litellm/examples/litellm.py (69 additions, 0 deletions)
```python
import asyncio
import os

import litellm
import phoenix as px

# Set the OpenAI API key (replace with your actual key)
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

# Launch Phoenix app
session = px.launch_app()

# Import OpenTelemetry components
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter  # noqa: E402
from opentelemetry.sdk.trace import TracerProvider  # noqa: E402
from opentelemetry.sdk.trace.export import SimpleSpanProcessor  # noqa: E402

from openinference.instrumentation.litellm import LiteLLMInstrumentor  # noqa: E402

# Set up OpenTelemetry tracing
endpoint = "http://127.0.0.1:6006/v1/traces"
tracer_provider = TracerProvider()
tracer_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))
LiteLLMInstrumentor().instrument(tracer_provider=tracer_provider)

# Simple single-message completion call
litellm.completion(
    model="gpt-3.5-turbo", messages=[{"content": "What's the capital of China?", "role": "user"}]
)

# Multi-message conversation completion call with an added param
litellm.completion(
    model="gpt-3.5-turbo",
    messages=[
        {"content": "Hello, I want to bake a cake", "role": "user"},
        {"content": "Hello, I can pull up some recipes for cakes.", "role": "assistant"},
        {"content": "No actually I want to make a pie", "role": "user"},
    ],
    temperature=0.7,
)

# Multi-message conversation acompletion call with added params
# (wrapped in asyncio.run so the async call works at module level)
asyncio.run(
    litellm.acompletion(
        model="gpt-3.5-turbo",
        messages=[
            {"content": "Hello, I want to bake a cake", "role": "user"},
            {"content": "Hello, I can pull up some recipes for cakes.", "role": "assistant"},
            {"content": "No actually I want to make a pie", "role": "user"},
        ],
        temperature=0.7,
        max_tokens=20,
    )
)

# Completion with retries
litellm.completion_with_retries(
    model="gpt-3.5-turbo",
    messages=[{"content": "What's the highest grossing film ever", "role": "user"}],
)

# Embedding call
litellm.embedding(model="text-embedding-ada-002", input=["good morning from litellm"])

# Asynchronous embedding call
asyncio.run(litellm.aembedding(model="text-embedding-ada-002", input=["good morning from litellm"]))

# Image generation call
litellm.image_generation(model="dall-e-2", prompt="cute baby otter")

# Asynchronous image generation call
asyncio.run(litellm.aimage_generation(model="dall-e-2", prompt="cute baby otter"))
```
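After the calls above run, the traces land in the local Phoenix app. A short follow-up sketch, not part of the committed example, assuming the launched `session` exposes a `url` attribute:

```python
# Open the Phoenix UI to inspect the collected LiteLLM spans.
print(f"View traces at {session.url}")

# Tear the instrumentation back down when finished.
LiteLLMInstrumentor().uninstrument()
```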