docs: litellm examples (#681)
Co-authored-by: Mikyo King <mikyo@arize.com>
harrisonchu and mikeldking authored Aug 28, 2024
1 parent a646a31 commit b6cfe69
Showing 5 changed files with 157 additions and 22 deletions.
33 changes: 17 additions & 16 deletions README.md
@@ -45,22 +45,23 @@ OpenInference provides a set of instrumentations for popular machine learning SDKs

### Examples

| Name | Description | Complexity Level |
|------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------|------------------|
| [OpenAI SDK](python/instrumentation/openinference-instrumentation-openai/examples/) | OpenAI Python SDK, including chat completions and embeddings | Beginner |
| [MistralAI SDK](python/instrumentation/openinference-instrumentation-mistralai/examples/) | MistralAI Python SDK | Beginner |
| [VertexAI SDK](python/instrumentation/openinference-instrumentation-vertexai/examples/) | VertexAI Python SDK | Beginner |
| [LlamaIndex](python/instrumentation/openinference-instrumentation-llama-index/examples/) | LlamaIndex query engines | Beginner |
| [DSPy](python/instrumentation/openinference-instrumentation-dspy/examples/) | DSPy primitives and custom RAG modules | Beginner |
| [Boto3 Bedrock Client](python/instrumentation/openinference-instrumentation-bedrock/examples/) | Boto3 Bedrock client | Beginner |
| [LangChain](python/instrumentation/openinference-instrumentation-langchain/examples/) | LangChain primitives and simple chains | Beginner |
| [LiteLLM](python/instrumentation/openinference-instrumentation-litellm/) | A lightweight LiteLLM framework | Beginner |
| [LiteLLM Proxy](python/instrumentation/openinference-instrumentation-litellm/examples/litellm-proxy/) | LiteLLM Proxy to log OpenAI, Azure, Vertex, and Bedrock calls | Beginner |
| [Groq](python/instrumentation/openinference-instrumentation-groq/examples/) | Groq and AsyncGroq chat completions | Beginner |
| [Anthropic](python/instrumentation/openinference-instrumentation-anthropic/examples/) | Anthropic Messages client | Beginner |
| [LlamaIndex + Next.js Chatbot](python/examples/llama-index/) | A fully functional chatbot using Next.js and a LlamaIndex FastAPI backend | Intermediate |
| [LangServe](python/examples/langserve/) | A LangChain application deployed with LangServe using custom metadata on a per-request basis | Intermediate |
| [DSPy](python/examples/dspy-rag-fastapi/) | A DSPy RAG application using FastAPI, Weaviate, and Cohere | Intermediate |
| [Haystack](python/instrumentation/openinference-instrumentation-haystack/examples/) | A Haystack QA RAG application | Intermediate |

## JavaScript

1 change: 1 addition & 0 deletions cspell.json
@@ -11,6 +11,7 @@
"instrumentator",
"Instrumentor",
"langchain",
"litellm",
"llms",
"nextjs",
"openinference",
@@ -1,8 +1,8 @@
# OpenInference LiteLLM Instrumentation

[LiteLLM](https://github.com/BerriAI/litellm) allows developers to call all LLM APIs using the OpenAI format. [LiteLLM Proxy](https://docs.litellm.ai/docs/simple_proxy) is a proxy server for calling 100+ LLMs in the OpenAI format. Both are supported by this auto-instrumentation.

This package implements OpenInference tracing for the following LiteLLM functions:
- completion()
- acompletion()
- completion_with_retries()
@@ -11,7 +11,7 @@
- image_generation()
- aimage_generation()
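
For example, once instrumented, a plain `completion()` call in the OpenAI message format is traced automatically. A minimal sketch, mirroring the example script added later in this commit:

```python
import litellm

# OpenAI-format chat call routed through LiteLLM; with the instrumentor
# active, this emits an OpenInference span.
response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What's the capital of China?"}],
)
print(response.choices[0].message.content)
```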

These traces are fully OpenTelemetry compatible and can be sent to an OpenTelemetry collector for viewing, such as [Arize Phoenix](https://github.com/Arize-ai/phoenix).


## Installation
@@ -58,13 +58,13 @@

```python
import os
os.environ["OPENAI_API_KEY"] = "PASTE_YOUR_API_KEY_HERE"
```

Instrumenting `LiteLLM` is simple:

```python
LiteLLMInstrumentor().instrument(tracer_provider=tracer_provider)
```

Now, all calls to `LiteLLM` functions are instrumented and can be viewed in the `phoenix` UI.

```python
completion_response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"content": "What's the capital of China?", "role": "user"}],
)
```

@@ -102,6 +102,7 @@ Now any liteLLM function calls you make will not send traces to Phoenix until instrumented again.
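
The uninstrumentation call itself is collapsed in this diff; a minimal sketch, assuming the standard `uninstrument()` method that OpenTelemetry instrumentors expose:

```python
# Detach the instrumentation; subsequent litellm calls stop emitting spans
LiteLLMInstrumentor().uninstrument()
```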

## More Info

* Details on how to set up a [LiteLLM Proxy](https://docs.litellm.ai/docs/observability/arize_integration)
* [More info on OpenInference and Phoenix](https://docs.arize.com/phoenix)
* [How to customize spans to track sessions, metadata, etc.](https://github.com/Arize-ai/openinference/tree/main/python/openinference-instrumentation#customizing-spans)
* [How to account for private information and span payload customization](https://github.com/Arize-ai/openinference/tree/main/python/openinference-instrumentation#tracing-configuration)
@@ -0,0 +1,63 @@
# LiteLLM Proxy Server

Use [LiteLLM Proxy](https://docs.litellm.ai/docs/simple_proxy) to log calls to OpenAI, Azure, Vertex, Bedrock, and 100+ other LLMs to Arize.

Use LiteLLM Proxy for:
- Calling 100+ LLMs (OpenAI, Azure, Vertex, Bedrock, etc.) in the OpenAI ChatCompletions & Completions format
- Automatically logging all requests to Arize AI
- Providing a central, self-hosted server for calling LLMs and logging to Arize


## Step 1. Create a Config for LiteLLM proxy

LiteLLM requires a config with all your models defined; we will call this file `litellm_config.yaml`.

[Detailed docs on how to set up a LiteLLM config](https://docs.litellm.ai/docs/proxy/configs)

```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/

litellm_settings:
  success_callback: ["arize"] # 👈 Set Arize AI as a callback

environment_variables: # 👈 Set Arize AI env vars
  ARIZE_SPACE_KEY: "d0*****"
  ARIZE_API_KEY: "141a****"
```

## Step 2. Start LiteLLM proxy

```shell
docker run \
  -v $(pwd)/litellm_config.yaml:/app/config.yaml \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml --detailed_debug
```

## Step 3. Test it: make a /chat/completions request to the LiteLLM proxy

```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Hello, Claude gm!"}
    ]
  }'
```

## Expected output on Arize AI

<img width="1283" alt="Xnapper-2024-07-23-17 07 34" src="https://github.com/user-attachments/assets/7460bc2b-7f4f-4ec4-b966-2bf33a26ded5">


## Additional Resources
- [LiteLLM Arize AI docs](https://docs.litellm.ai/docs/observability/arize_integration)
@@ -0,0 +1,69 @@
import os

import litellm
import phoenix as px

# Set the OpenAI API key as an environment variable
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

# Launch Phoenix app
session = px.launch_app()

# Import OpenTelemetry components
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter # noqa: E402
from opentelemetry.sdk.trace import TracerProvider # noqa: E402
from opentelemetry.sdk.trace.export import SimpleSpanProcessor # noqa: E402

from openinference.instrumentation.litellm import LiteLLMInstrumentor # noqa: E402

# Set up OpenTelemetry tracing
endpoint = "http://127.0.0.1:6006/v1/traces"
tracer_provider = TracerProvider()
tracer_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))
LiteLLMInstrumentor().instrument(tracer_provider=tracer_provider)

# Simple single message completion call
litellm.completion(
model="gpt-3.5-turbo", messages=[{"content": "What's the capital of China?", "role": "user"}]
)

# Multiple message conversation completion call with added param
litellm.completion(
model="gpt-3.5-turbo",
messages=[
{"content": "Hello, I want to bake a cake", "role": "user"},
{"content": "Hello, I can pull up some recipes for cakes.", "role": "assistant"},
{"content": "No actually I want to make a pie", "role": "user"},
],
temperature=0.7,
)
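
# Note: bare `await` at module level only runs in async-aware environments
# such as Jupyter/IPython (hence the noqa: F704 markers below); in a plain
# script, wrap the async calls with asyncio.run(...) instead.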

# Multiple message conversation acompletion call with added params
await litellm.acompletion( # noqa: F704
model="gpt-3.5-turbo",
messages=[
{"content": "Hello, I want to bake a cake", "role": "user"},
{"content": "Hello, I can pull up some recipes for cakes.", "role": "assistant"},
{"content": "No actually I want to make a pie", "role": "user"},
],
temperature=0.7,
max_tokens=20,
)

# Completion with retries
litellm.completion_with_retries(
model="gpt-3.5-turbo",
messages=[{"content": "What's the highest grossing film ever", "role": "user"}],
)

# Embedding call
litellm.embedding(model="text-embedding-ada-002", input=["good morning from litellm"])

# Asynchronous embedding call
await litellm.aembedding(model="text-embedding-ada-002", input=["good morning from litellm"]) # noqa: F704

# Image generation call
litellm.image_generation(model="dall-e-2", prompt="cute baby otter")

# Asynchronous image generation call
await litellm.aimage_generation(model="dall-e-2", prompt="cute baby otter") # noqa: F704
