diff --git a/docs/docs/observability/01-overview.mdx b/docs/docs/observability/01-overview.mdx
new file mode 100644
index 0000000000..4fe0a2a962
--- /dev/null
+++ b/docs/docs/observability/01-overview.mdx
@@ -0,0 +1,124 @@
---
title: Overview
description: Learn how to instrument your application with Agenta for enhanced observability. This guide covers the benefits of observability, how Agenta helps, and how to get started.
---

```mdx-code-block
import DocCard from '@theme/DocCard';
import clsx from 'clsx';
import Image from "@theme/IdealImage";
```

## Why Observability?

Observability is the practice of monitoring and understanding the behavior of your LLM application. With Agenta, you can add a few lines of code to start tracking all inputs, outputs, and metadata of your application.
This allows you to:

- **Debug Effectively**: View the exact prompts sent and the contexts retrieved. For complex workflows like agents, you can trace the data flow and quickly identify root causes.
- **Bootstrap Test Sets**: Capture real-world inputs and outputs and use them to bootstrap test sets that you can continuously iterate on.
- **Find Edge Cases**: Identify latency spikes and cost increases. Understand performance bottlenecks to optimize your app's speed and cost-effectiveness.
- **Track Costs and Latency Over Time**: Monitor how your app's expenses and response times change.
- **Compare App Versions**: Compare the behavior of different versions of your application in production to see which performs better.

Illustration of observability

## Observability in Agenta

Agenta's observability features are built on **OpenTelemetry (OTel)**, an open-source standard for application observability. This provides several advantages:

- **Wide Library Support**: Use many supported libraries right out of the box.
- **Vendor Neutrality**: Send your traces to platforms like New Relic or Datadog without code changes. Switch vendors at will.
- **Proven Reliability**: Use a mature and actively maintained SDK that's trusted in the industry.
- **Ease of Integration**: If you're familiar with OTel, you already know how to instrument your app with Agenta. There are no new concepts or syntax to learn—Agenta uses familiar OTel concepts like traces and spans.

## Key Concepts

**Traces**: A trace represents the complete journey of a request through your application. In our context, a trace corresponds to a single request to your LLM application.

**Spans**: A span is a unit of work within a trace. Spans can be nested, forming a tree-like structure. The root span represents the overall operation, while child spans represent sub-operations. Agenta enriches each span with cost information and metadata when you make LLM calls. The short sketch at the end of this page makes these concepts concrete.

## Next Steps
- [Quick Start](/observability/quickstart)
- [Observability SDK](/observability/observability-sdk)
### Integrations
- [OpenAI](/observability/integrations/openai)
- [LangChain](/observability/integrations/langchain)
- [Instructor](/observability/integrations/instructor)
- [LiteLLM](/observability/integrations/litellm)
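To make traces and spans concrete, here is a minimal sketch using the SDK decorator covered in the following pages. Calling `answer()` produces one trace whose root span is `answer` and whose child span is `summarize`; the function names and span kinds are illustrative:

```python
import agenta as ag

ag.init()  # reads AGENTA_API_KEY / AGENTA_HOST from the environment

# Child span: a sub-operation nested under the root span
@ag.instrument(spankind="chain")
def summarize(question: str) -> str:
    return f"(summary of: {question})"

# Root span: the overall operation, one trace per call
@ag.instrument(spankind="workflow")
def answer(question: str) -> str:
    return summarize(question)

answer("What is observability?")
```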
diff --git a/docs/docs/observability/02-quickstart.mdx b/docs/docs/observability/02-quickstart.mdx
new file mode 100644
index 0000000000..1e00398f51
--- /dev/null
+++ b/docs/docs/observability/02-quickstart.mdx
@@ -0,0 +1,107 @@
---
title: Quick Start
---

```mdx-code-block
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import Image from "@theme/IdealImage";
```

Agenta enables you to capture all inputs, outputs, and metadata from your LLM applications, **whether they're hosted within Agenta or running in your own environment**.

This guide walks you through setting up observability for an OpenAI application running locally.

:::note
If you create an application through the Agenta UI, tracing is enabled by default. No additional setup is required—simply go to the observability view to see all your requests.
:::

## Step-by-Step Guide

### 1. Install Required Packages

First, install the Agenta SDK, OpenAI, and the OpenTelemetry instrumentor for OpenAI:

```bash
pip install -U agenta openai opentelemetry-instrumentation-openai
```

### 2. Configure Environment Variables

<Tabs>
<TabItem value="cloud" label="Agenta Cloud / Enterprise">

If you're using Agenta Cloud or Enterprise Edition, you'll need an API key:

1. Visit the [Agenta API Keys page](https://cloud.agenta.ai/settings?tab=apiKeys).
2. Click on **Create New API Key** and follow the prompts.

```python
import os

os.environ["AGENTA_API_KEY"] = "YOUR_AGENTA_API_KEY"
os.environ["AGENTA_HOST"] = "https://cloud.agenta.ai"
```

</TabItem>
<TabItem value="oss" label="Agenta OSS (self-hosted)">

```python
import os

os.environ["AGENTA_HOST"] = "http://localhost"
```

</TabItem>
</Tabs>

### 3. Instrument Your Application

Below is a sample script to instrument an OpenAI application:

```python
# highlight-start
import agenta as ag
from opentelemetry.instrumentation.openai import OpenAIInstrumentor
from openai import OpenAI
# highlight-end

# highlight-start
ag.init()
# highlight-end

# highlight-start
OpenAIInstrumentor().instrument()
# highlight-end

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a short story about AI Engineering."},
    ],
)

print(response.choices[0].message.content)
```

**Explanation**:

- **Import Libraries**: Import Agenta, the OpenAI client, and the OpenTelemetry instrumentor.
- **Initialize Agenta**: Call `ag.init()` to initialize the Agenta SDK.
- **Instrument OpenAI**: Use `OpenAIInstrumentor().instrument()` to enable tracing for OpenAI calls.

### 4. View Traces in the Agenta UI

After running your application, you can view the captured traces in Agenta:

1. Log in to your Agenta dashboard.
2. Navigate to the **Observability** section.
3. You'll see a list of traces corresponding to your application's requests.

Illustration of observability
diff --git a/docs/docs/observability/03-observability-sdk.mdx b/docs/docs/observability/03-observability-sdk.mdx
new file mode 100644
index 0000000000..2ebe100ff7
--- /dev/null
+++ b/docs/docs/observability/03-observability-sdk.mdx
@@ -0,0 +1,189 @@
---
title: Observability SDK
---

```mdx-code-block
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import Image from "@theme/IdealImage";
```

This guide shows you how to use the Agenta observability SDK to instrument workflows in your application.
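As a quick orientation, here is the overall shape of an instrumented workflow, a minimal sketch whose pieces are explained in the sections below:

```python
import agenta as ag

ag.init()  # uses AGENTA_API_KEY / AGENTA_HOST from the environment

# Each call to generate() produces one trace with a "workflow" root span
@ag.instrument(spankind="workflow")
def generate(country: str) -> str:
    return f"The capital of {country} is ..."

generate("France")
```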
If you're exclusively using a framework like [LangChain](/observability/integrations/langchain), you can use the auto-instrumentation packages to automatically instrument your application.

However, if you need more flexibility, you can use the SDK to:

- Instrument custom functions in your workflow
- Add additional **metadata** to the spans
- Link traces to **applications**, **variants**, and **environments** in Agenta

## Setup

**1. Install the Agenta SDK**

```bash
pip install -U agenta
```

**2. Set environment variables**

<Tabs>
<TabItem value="cloud" label="Agenta Cloud / Enterprise">

1. Visit the [Agenta API Keys page](https://cloud.agenta.ai/settings?tab=apiKeys).
2. Click on **Create New API Key** and follow the prompts.

```python
import os

os.environ["AGENTA_API_KEY"] = "YOUR_AGENTA_API_KEY"
os.environ["AGENTA_HOST"] = "https://cloud.agenta.ai"
```

</TabItem>
<TabItem value="oss" label="Agenta OSS (self-hosted)">

```python
import os

os.environ["AGENTA_HOST"] = "http://localhost"
```

</TabItem>
</Tabs>

## Instrumenting functions with the decorator

To instrument a function, add the `@ag.instrument()` decorator. This automatically captures all input and output data.

The decorator has a `spankind` argument to categorize each span in the UI. Available types are:

`agent`, `chain`, `workflow`, `task`, `tool`, `embedding`, `query`, `completion`, `chat`, `rerank`

:::caution
The instrument decorator should be the top-most decorator on a function (i.e., written above any other decorators, so that it is applied last).
:::

```python
import agenta as ag
from openai import OpenAI

client = OpenAI()

@ag.instrument(spankind="task")
def myllmcall(country: str):
    prompt = f"What is the capital of {country}?"
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content

@ag.instrument(spankind="workflow")
def generate(country: str):
    return myllmcall(country)
```

Agenta automatically determines the parent span based on the function call and nests the spans accordingly.

## Modify a span's metadata

You can add additional information to a span's metadata using `ag.tracing.store_meta()`. This function accesses the active span from the context and adds the key-value pairs to the metadata.

```python
@ag.instrument(spankind="task")
def compile_prompt(country: str):
    prompt_template = "What is the capital of {country}?"

    # highlight-next-line
    ag.tracing.store_meta({"prompt_template": prompt_template})

    formatted_prompt = prompt_template.format(country=country)
    return formatted_prompt
```

Metadata is displayed in the span's raw data view.

## Linking spans to applications, variants, and environments

You can link a span to an application, variant, and environment by calling `ag.tracing.store_refs()`.

Applications, variants, and environments can be referenced by their slugs, versions, and commit IDs (for specific versions).
You can link a span to an application and an environment like this:

```python
@ag.instrument(spankind="workflow")
def generate(country: str):
    prompt_template = "What is the capital of {country}?"

    formatted_prompt = prompt_template.format(country=country)

    completion = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "user", "content": formatted_prompt},
        ],
    )

    # highlight-start
    ag.tracing.store_refs(
        {
            "application.slug": "capital-app",
            "environment.slug": "production",
        }
    )
    # highlight-end
    return completion.choices[0].message.content
```

`ag.tracing.store_refs()` takes a dict whose keys can be `application.slug`, `application.id`, `application.version`, `variant.slug`, `variant.id`, `variant.version`, `environment.slug`, `environment.id`, and `environment.commit_id`, and whose values are the corresponding slug, ID, version, or commit ID.

## Storing Internals

Internals are additional data stored in the span. Compared to metadata, internals have the following differences:

- Internals are saved within the span data and are searchable with plain text queries.
- Internals are shown by default in the span view in a collapsible section, while metadata is only shown as part of the JSON file with the raw data (i.e., internals are more visible).
- **Internals can be used for evaluation**. For instance, you can save the retrieved context in the internals and then use it to evaluate the factuality of the response.

As a rule of thumb, use metadata for additional information that is not used for evaluation and not elementary to understanding the span; otherwise, use internals.

Internals can be stored similarly to metadata:

```python
@ag.instrument(spankind="workflow")
def rag_workflow(query: str):

    context = retrieve_context(query)  # your retrieval logic

    # highlight-start
    ag.tracing.store_internals({"context": context})
    # highlight-end

    prompt = f"Answer the following question: {query} based on the context: {context}"

    completion = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "user", "content": prompt},
        ],
    )
    return completion.choices[0].message.content
```

## Excluding Inputs/Outputs from Capture

In some cases, you may want to exclude parts of the inputs or outputs due to privacy concerns or because the data is too large to be stored in the span.

You can achieve this by setting the `ignore_inputs` and `ignore_outputs` arguments to `True` in the instrument decorator.

```python
@ag.instrument(spankind="workflow", ignore_inputs=True, ignore_outputs=True)
def rag_workflow(query: str):
    ...
```
diff --git a/docs/docs/observability/integrations/01-openai.mdx b/docs/docs/observability/integrations/01-openai.mdx
new file mode 100644
index 0000000000..84000d407b
--- /dev/null
+++ b/docs/docs/observability/integrations/01-openai.mdx
@@ -0,0 +1,198 @@
---
title: OpenAI
---

```mdx-code-block
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
```

OpenAI calls can be automatically instrumented with Agenta using the `opentelemetry-instrumentation-openai` package. This guide shows how to set it up.

## Installation

Install the required packages:

```bash
pip install -U agenta openai opentelemetry-instrumentation-openai
```

---

## Instrument OpenAI API Calls

### 1. Configure Environment Variables
<Tabs>
<TabItem value="cloud" label="Agenta Cloud / Enterprise">

```python
import os

os.environ["AGENTA_API_KEY"] = "YOUR_AGENTA_API_KEY"
os.environ["AGENTA_HOST"] = "https://cloud.agenta.ai"
```

</TabItem>
<TabItem value="oss" label="Agenta OSS (self-hosted)">

```python
import os

os.environ["AGENTA_HOST"] = "http://localhost"
```

</TabItem>
</Tabs>

### 2. Initialize Agenta and Instrument OpenAI

```python
# highlight-start
import agenta as ag
from opentelemetry.instrumentation.openai import OpenAIInstrumentor
# highlight-end
from openai import OpenAI

# highlight-start
ag.init()
OpenAIInstrumentor().instrument()
# highlight-end

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Write a short story about AI."},
    ],
)

print(response.choices[0].message.content)
```

After running this code, Agenta will automatically capture the details of this API call.

## Instrumenting a Workflow with a Parent Span

If you have a function or workflow with multiple calls you want to monitor as a single trace, you can use the `@ag.instrument` decorator.

### Example

```python
# highlight-start
import agenta as ag
from opentelemetry.instrumentation.openai import OpenAIInstrumentor
# highlight-end

from openai import OpenAI
import asyncio

# highlight-start
ag.init()
OpenAIInstrumentor().instrument()
# highlight-end

client = OpenAI()

# highlight-next-line
@ag.instrument(spankind="TASK")
async def generate_story_prompt(topic: str, genre: str):
    return f"Write a {genre} story about {topic}."

# highlight-next-line
@ag.instrument(spankind="WORKFLOW")
async def generate_story(topic: str, genre: str):
    prompt = await generate_story_prompt(topic, genre)
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "user",
                "content": prompt,
            },
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    asyncio.run(generate_story(topic="the future", genre="sci-fi"))
```

## Associating Traces with Applications

In the previous examples, the traces are instrumented in the global project scope. They are not associated with a specific application, variant, or environment.
To link traces to specific parts of your Agenta projects, you can store references inside your instrumented functions.

```python
# highlight-next-line
@ag.instrument(spankind="WORKFLOW")
async def generate_story(topic: str, genre: str):
    # highlight-start
    # Associate with a specific application and variant
    ag.tracing.store_refs({
        "application.id": "YOUR_APPLICATION_ID",
        "variant.id": "YOUR_VARIANT_ID",
        "environment.slug": "production",
    })
    # highlight-end

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "user",
                "content": f"Write a {genre} story about {topic}.",
            },
        ],
    )
    return response.choices[0].message.content
```

**Note**: `ag.tracing.store_refs` must be called from within a function decorated with `@ag.instrument`.
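References can be combined with free-form metadata on the same span, for instance to record which inputs were used. Here is a short sketch (reusing the function above; `ag.tracing.store_meta` is covered in the [Observability SDK](/observability/observability-sdk) guide):

```python
@ag.instrument(spankind="WORKFLOW")
async def generate_story(topic: str, genre: str):
    # Link the trace to Agenta entities
    ag.tracing.store_refs({"application.id": "YOUR_APPLICATION_ID"})
    # Attach additional metadata to the active span
    ag.tracing.store_meta({"topic": topic, "genre": genre})
    ...
```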
## Complete Example

Here's the full code putting it all together:

```python
import os
import agenta as ag
from opentelemetry.instrumentation.openai import OpenAIInstrumentor
from openai import OpenAI
import asyncio

os.environ["AGENTA_API_KEY"] = "YOUR_AGENTA_API_KEY"  # Skip if using OSS locally
os.environ["AGENTA_HOST"] = "https://cloud.agenta.ai"  # Use "http://localhost" for OSS

# highlight-next-line
ag.init()

# highlight-next-line
OpenAIInstrumentor().instrument()

# Set your OpenAI API key
client = OpenAI(api_key="YOUR_OPENAI_API_KEY")

# highlight-next-line
@ag.instrument(spankind="WORKFLOW")
async def generate_story(topic: str, genre: str):
    # highlight-start
    # Associate with application and variant
    ag.tracing.store_refs({
        "application.id": "YOUR_APPLICATION_ID",
        "variant.id": "YOUR_VARIANT_ID",
    })
    ag.tracing.store_refs({"environment.slug": "production"})
    # highlight-end
    # Make the OpenAI API call
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "user",
                "content": f"Write a {genre} story about {topic}.",
            },
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    asyncio.run(generate_story(topic="the future", genre="sci-fi"))
```
diff --git a/docs/docs/observability/integrations/02-langchain.mdx b/docs/docs/observability/integrations/02-langchain.mdx
new file mode 100644
index 0000000000..3a196a52b0
--- /dev/null
+++ b/docs/docs/observability/integrations/02-langchain.mdx
@@ -0,0 +1,116 @@
---
title: LangChain
description: Learn how to instrument LangChain traces with Agenta for enhanced LLM observability. This guide covers setup, configuration, and best practices for monitoring structured data extraction using LangChain and OpenAI models.
---

```mdx-code-block
import Image from "@theme/IdealImage";
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
```

[LangChain](https://python.langchain.com/) **is a framework for developing applications powered by large language models (LLMs)**. By instrumenting LangChain with Agenta, you can monitor and debug your applications more effectively, gaining insights into each step of your workflows.

This guide shows you how to instrument LangChain applications using Agenta's observability features.
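At its core, the integration comes down to two calls: initialize Agenta, then instrument LangChain. Here is a minimal sketch, shown in full context in the example below:

```python
import agenta as ag
from opentelemetry.instrumentation.langchain import LangchainInstrumentor

ag.init()
LangchainInstrumentor().instrument()
# From this point on, chains, LLM calls, and tools invoked through LangChain are traced.
```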
## Installation

Install the required packages:

```bash
pip install -U agenta openai opentelemetry-instrumentation-langchain langchain langchain_community
```

## Configure Environment Variables

<Tabs>
<TabItem value="cloud" label="Agenta Cloud / Enterprise">

```python
import os

os.environ["AGENTA_API_KEY"] = "YOUR_AGENTA_API_KEY"
os.environ["AGENTA_HOST"] = "https://cloud.agenta.ai"
```

</TabItem>
<TabItem value="oss" label="Agenta OSS (self-hosted)">

```python
import os

os.environ["AGENTA_HOST"] = "http://localhost"
```

</TabItem>
</Tabs>

## Code Example

```python
# highlight-next-line
import agenta as ag
# highlight-next-line
from opentelemetry.instrumentation.langchain import LangchainInstrumentor
from langchain.schema import SystemMessage, HumanMessage
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate
from langchain_community.chat_models import ChatOpenAI
from langchain.chains import LLMChain, SequentialChain, TransformChain

# highlight-next-line
ag.init()
# highlight-next-line
LangchainInstrumentor().instrument()


def langchain_app():
    # Initialize the chat model
    chat = ChatOpenAI(temperature=0)

    # Define a transformation chain to create the prompt
    transform = TransformChain(
        input_variables=["subject"],
        output_variables=["prompt"],
        transform=lambda inputs: {"prompt": f"Tell me a joke about {inputs['subject']}."},
    )

    # Define the first LLM chain to generate a joke
    first_prompt_messages = [
        SystemMessage(content="You are a funny sarcastic nerd."),
        HumanMessagePromptTemplate.from_template("{prompt}"),
    ]
    first_prompt_template = ChatPromptTemplate.from_messages(first_prompt_messages)
    first_chain = LLMChain(llm=chat, prompt=first_prompt_template, output_key="joke")

    # Define the second LLM chain to translate the joke
    second_prompt_messages = [
        SystemMessage(content="You are an Elf."),
        HumanMessagePromptTemplate.from_template(
            "Translate the joke below into Sindarin language:\n{joke}"
        ),
    ]
    second_prompt_template = ChatPromptTemplate.from_messages(second_prompt_messages)
    second_chain = LLMChain(llm=chat, prompt=second_prompt_template)

    # Chain everything together in a sequential workflow
    workflow = SequentialChain(
        chains=[transform, first_chain, second_chain],
        input_variables=["subject"],
    )

    # Execute the workflow and print the result
    result = workflow({"subject": "OpenTelemetry"})
    print(result)

# Run the LangChain application
langchain_app()
```

## Explanation

- **Initialize Agenta**: `ag.init()` sets up the Agenta SDK.
- **Instrument LangChain**: `LangchainInstrumentor().instrument()` instruments LangChain for tracing. This must be called **before** running your application to ensure all components are traced.

## Using Workflows

You can use the `@ag.instrument(spankind="WORKFLOW")` decorator to create a parent span for your workflow. This is optional, but it's good practice to instrument the main function of your application.
diff --git a/docs/docs/observability/integrations/04-instructor.mdx b/docs/docs/observability/integrations/04-instructor.mdx
new file mode 100644
index 0000000000..20fe4f129f
--- /dev/null
+++ b/docs/docs/observability/integrations/04-instructor.mdx
@@ -0,0 +1,104 @@
---
title: Instructor
description: Learn how to implement and monitor Instructor traces with Agenta for enhanced LLM observability. This comprehensive guide covers setup, configuration, and best practices for tracking structured data extraction using Instructor and OpenAI models.
+--- + +```mdx-code-block +import Image from "@theme/IdealImage"; +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; +``` + +[Instructor](https://python.useinstructor.com/) is a library that helps you extract structured data from natural language using LLMs. By instrumenting Instructor with Agenta, you can monitor and debug your applications more effectively. + +This guide shows you how to instrument Instructor when using OpenAI models, you can use the same approach for other LLM providers like Anthropic, Google, etc. You just need to use the correct instrumentation library for the LLM provider. + +Illustration of instructor instrumented trace + +## Installation + +Install the required packages: + +```bash +pip install -U agenta openai opentelemetry-instrumentation-openai instructor +``` + +## Configure Environment Variables + + + + +```python +import os + +os.environ["AGENTA_API_KEY"] = "YOUR_AGENTA_API_KEY" +os.environ["AGENTA_HOST"] = "https://cloud.agenta.ai" +``` + + + + +```python +import os + +os.environ["AGENTA_HOST"] = "http://localhost" +``` + + + + +## Code Example + +```python +# highlight-next-line +import agenta as ag +import openai +import instructor +from pydantic import BaseModel +# highlight-next-line +from opentelemetry.instrumentation.openai import OpenAIInstrumentor + +# highlight-next-line +ag.init() + +# highlight-start +# Instrument OpenAI before creating the Instructor client +OpenAIInstrumentor().instrument() +# highlight-end + +class UserInfo(BaseModel): + name: str + age: int + +# highlight-next-line +@ag.instrument(spankind="WORKFLOW") +def instructor_workflow(): + # Create an Instructor client using the instrumented OpenAI model + client = instructor.from_openai(openai.OpenAI()) + + # Extract structured data from natural language + user_info = client.chat.completions.create( + model="gpt-3.5-turbo", + response_model=UserInfo, + messages=[{"role": "user", "content": "John Doe is 30 years old."}], + ) + return user_info + +user_info = instructor_workflow() +print(user_info) +``` + +:::warning +**Order Matters**: Call `OpenAIInstrumentor().instrument()` **before** creating the Instructor client with `instructor.from_openai(openai.OpenAI())`. Both modify the OpenAI library, so the order ensures proper instrumentation. +::: + +## Explanation + +- **Initialize Agenta**: `ag.init()` sets up the Agenta SDK. +- **Instrument OpenAI**: `OpenAIInstrumentor().instrument()` instruments the OpenAI library for tracing. This must come **before** creating the Instructor client. +- **Instrument the Workflow**: The `@ag.instrument(spankind="WORKFLOW")` decorator creates a parent span. This is optional, but it's a good practice to instrument the main function of your application. diff --git a/docs/docs/observability/integrations/05-litellm.mdx b/docs/docs/observability/integrations/05-litellm.mdx new file mode 100644 index 0000000000..3c02e7f797 --- /dev/null +++ b/docs/docs/observability/integrations/05-litellm.mdx @@ -0,0 +1,90 @@ +--- +title: LiteLLM +description: Learn how to instrument LiteLLM traces with Agenta for enhanced LLM observability. This guide covers setup, configuration, and best practices for monitoring API calls and performance using LiteLLM and OpenAI models. 
+--- + +```mdx-code-block +import Image from "@theme/IdealImage"; +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; +``` + +[LiteLLM](https://www.litellm.ai/) is Python SDK that allows you to **call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]**. + +This guide shows you how to instrument LiteLLM applications using Agenta's observability features. + +## Installation + +Install the required packages: + +```bash +pip install -U agenta litellm +``` + +## Configure Environment Variables + + + + +```python +import os + +os.environ["AGENTA_API_KEY"] = "YOUR_AGENTA_API_KEY" +os.environ["AGENTA_HOST"] = "https://cloud.agenta.ai" +``` + + + + +```python +import os + +os.environ["AGENTA_HOST"] = "http://localhost" +``` + + + + +## Code Example + +```python +# highlight-next-line +import agenta as ag +import litellm +import asyncio + +# highlight-next-line +ag.init() + +# highlight-next-line +@ag.instrument() +async def agenerate_completion(): + # highlight-next-line + litellm.callbacks = [ag.callbacks.litellm_handler()] + + # Define the messages for the chat completion + messages = [ + {"role": "system", "content": "You are a helpful assistant."}, + {"role": "user", "content": "Write a short story about AI Engineering."}, + ] + temperature = 0.2 + max_tokens = 100 + + # Make an asynchronous chat completion request + chat_completion = await litellm.acompletion( + model="gpt-3.5-turbo", + messages=messages, + temperature=temperature, + max_tokens=max_tokens, + ) + print(chat_completion) + +# Run the asynchronous function +asyncio.run(agenerate_completion()) +``` + +## Explanation + +- **Initialize Agenta**: `ag.init()` sets up the Agenta SDK. +- **Instrument the Function**: The `@ag.instrument()` decorator wraps the `agenerate_completion` function, creating a parent span in Agenta. +- **Set Up Callback**: `litellm.callbacks = [ag.callbacks.litellm_handler()]` sets the Agenta callback handler for LiteLLM. This enables LiteLLM to send trace data to Agenta. diff --git a/docs/docs/observability/integrations/_category_.json b/docs/docs/observability/integrations/_category_.json new file mode 100644 index 0000000000..10e5757b16 --- /dev/null +++ b/docs/docs/observability/integrations/_category_.json @@ -0,0 +1,4 @@ +{ + "position": 4, + "label": "Integrations" +} diff --git a/docs/docs/observability/quickstart.mdx b/docs/docs/observability/quickstart.mdx deleted file mode 100644 index 35d3d2b378..0000000000 --- a/docs/docs/observability/quickstart.mdx +++ /dev/null @@ -1,165 +0,0 @@ ---- -title: Quick Start ---- - -# Setting up telemetry - -You can configure Agenta to capture all inputs, outputs, and other metadata from your LLM applications, regardless of whether they are hosted in Agenta or in your environment. - -Post instrumentation, Agenta provides a dashboard that offers an overview of your app's performance metrics over time, including request counts, average latency, and costs. - -We also provide a table detailing all the requests made to your LLM application. This table can be filtered and used to enrich your test sets, debug your applications, or fine-tune them. - -:::tip -Concepts of Telemetry: - -**Traces:** A trace represents the entire journey of a request or operation as it moves through a system. In our context, a trace represents one request to the LLM application. - -**Spans:** A span represents a unit of work within a trace. 
Spans are nested to form a tree-like structure, with the root span representing the overall operation, and child spans representing sub-operations. In Agenta, we enrich each span with cost information and metadata in the event of an LLM call.
:::

:::note
When creating an application from the UI, tracing is enabled by default. No setup is required. Simply navigate to the observability view to see all requests.
:::

## 1. Create an application in **agenta**

To start, we need to create an application in **agenta**. You can do this using the command from the CLI:

```python
agenta init
```

This command creates a new application in **agenta** and a `config.toml` file with all the information about the application.

## 2. Initialize **agenta**

```python
import agenta as ag
# Option 1

ag.init(api_key="", app_id="")

# Option 2
os.environ["AGENTA_API_KEY"] = ""
os.environ["AGENTA_APP_ID"] = ""
ag.init()

# Option 3
ag.init(config_fname="config.toml") # using the config.toml generated by agenta init
```

You can find the API Key under the Setting view in **agenta**.

The app id can be found in the `config.toml` file if you have created the application from the CLI.

Note that if you are serving your application to the **agenta** cloud, **agenta** will automatically populate all the information in the environment variable. Therefore, you only need to use `ag.init()`.

## 3. Instrument with the decorator

Add the `@ag.instrument()` decorator to the functions you want to instrument. This decorator will trace all input and output information for the functions.

:::caution
Make sure the `instrument` decorator is the first decorator in the function.
:::

```python
@ag.instrument(spankind="llm")
def myllmcall(country:str):
    prompt = f"What is the capital of {country}"
    response = client.chat.completions.create(
        model='gpt-4',
        messages=[
            {'role': 'user', 'content': prompt},
        ],
    )
    return response.choices[0].text

@ag.instrument()
def generate(country:str):
    return myllmcall(country)
```

## 4. Modify a span's metadata

You can modify a span's metadata to add additional information using `ag.tracing.set_span_attributes()`. This function will access the active span and add the key-value pairs to the metadata:

```python
@ag.instrument(spankind="llm")
def myllmcall(country:str):
    prompt = f"What is the capital of {country}"
    response = client.chat.completions.create(
        model='gpt-4',
        messages=[
            {'role': 'user', 'content': prompt},
        ],
    )
    ag.tracing.set_span_attributes({"model": "gpt-4"})
    return response.choices[0].text
```

## 5. Putting it all together

Here's how our code would look if we combine everything:

```python
import agenta as ag

os.environ["AGENTA_API_KEY"] = ""
os.environ["AGENTA_APP_ID"] = ""
ag.init()

@ag.instrument(spankind="llm")
def myllmcall(country:str):
    prompt = f"What is the capital of {country}"
    response = client.chat.completions.create(
        model='gpt-4',
        messages=[
            {'role': 'user', 'content': prompt},
        ],
    )
    ag.tracing.set_span_attributes({"model": "gpt-4"})
    return response.choices[0].text

@ag.instrument()
def generate(country:str):
    return myllmcall(country)
```

# Setting up telemetry for apps hosted in agenta

If you're creating an application to serve to agenta, not much changes. You just need to add the entrypoint decorator, ensuring it comes _before_ the instrument decorator.

```python
import agenta as ag

ag.init()
ag.config.register_default(prompt=ag.TextParam("What is the capital of {country}"))

@ag.instrument(spankind="llm")
def myllmcall(country:str):
    response = client.chat.completions.create(
        model='gpt-4',
        messages=[
            {'role': 'user', 'content': ag.config.prompt.format(country=country)},
        ],
    )
    ag.tracing.set_span_attributes({"model": "gpt-4"})
    return response.choices[0].text

@ag.entrypoint
@ag.instrument()
def generate(country:str):
    return myllmcall(country)
```

The advantage of this approach is that the configuration you use is automatically instrumented along with the other data.
diff --git a/docs/static/images/observability/instructor.png b/docs/static/images/observability/instructor.png
new file mode 100644
index 0000000000..eea75d6e6e
Binary files /dev/null and b/docs/static/images/observability/instructor.png differ
diff --git a/docs/static/images/observability/observability.png b/docs/static/images/observability/observability.png
new file mode 100644
index 0000000000..9fb4f1b858
Binary files /dev/null and b/docs/static/images/observability/observability.png differ