
🐛 Bug Report: Not getting token counts for Databricks (both foundation & external models) #1297

Closed

tkanhe opened this issue Jun 11, 2024 · 1 comment

tkanhe commented Jun 11, 2024

Which component is this bug for?

Langchain Instrumentation

📜 Description

Databricks supports the OpenAI client for querying LLM models (both foundation and external models), and I am using it with LangChain. I am getting the traces, but not the token counts.

Ref. https://docs.databricks.com/en/machine-learning/model-serving/score-foundation-models.html
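
For context, here is a minimal sketch of the pattern that doc describes, with placeholder workspace URL, token, and endpoint name (my illustration, not code copied from the doc): Databricks serving endpoints expose an OpenAI-compatible API, so the stock OpenAI client can query foundation and external models directly.

from openai import OpenAI

# All values below are placeholders, not real credentials.
client = OpenAI(
    api_key="dapi...",  # Databricks personal access token
    base_url="https://<workspace>.cloud.databricks.com/serving-endpoints",
)

response = client.chat.completions.create(
    model="databricks-dbrx-instruct",
    messages=[{"role": "user", "content": "What is AWS?"}],
)

# Non-streaming responses include a usage block with token counts.
print(response.usage)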

👟 Reproduction steps

Code:

import asyncio

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.chains.question_answering import load_qa_chain
from langchain.docstore.document import Document
from langchain_openai import ChatOpenAI
from pydantic import BaseModel
from traceloop.sdk import Traceloop
from traceloop.sdk.decorators import workflow

from prompt import prompt_template, variables

# Export traces to a local OTLP endpoint; disable batching so spans flush immediately
Traceloop.init(app_name="tk test", disable_batch=True, api_endpoint="http://localhost:4318")


app = FastAPI()


class Message(BaseModel):
    content: str


@workflow(name="send_message")
async def send_message(question: str):
    callback = AsyncIteratorCallbackHandler()

    # Point LangChain's OpenAI-compatible client at the Databricks serving endpoint
    model = ChatOpenAI(
        model_name="databricks-dbrx-instruct",
        api_key="dapi51b9336b04**********************e",
        base_url="https://dbc-***********.cloud.databricks.com/serving-endpoints",
        streaming=True,
        callbacks=[callback],
    )

    chain = load_qa_chain(model, chain_type="stuff", prompt=prompt_template, verbose=False)

    task = asyncio.create_task(chain.ainvoke({"input_documents": [Document(page_content=variables["context"])], "question": question}))

    try:
        # Relay streamed tokens to the HTTP response as they arrive
        async for token in callback.aiter():
            yield token
    except Exception as e:
        print(f"Caught exception: {e}")
    finally:
        callback.done.set()
    await task


@app.post("/stream_chat")
def stream_chat(message: Message):
    generator = send_message(message.content)
    return StreamingResponse(generator, media_type="text/event-stream")
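
A possible contributing factor (my assumption, not something confirmed in this thread): for OpenAI-compatible chat completions, streaming responses omit the usage block unless the request opts in, so the instrumentation may have nothing to read. With the raw client, the opt-in looks like the sketch below (stream_options requires a recent openai package, and whether the Databricks endpoint honors it is something I have not verified):

stream = client.chat.completions.create(
    model="databricks-dbrx-instruct",
    messages=[{"role": "user", "content": "What is AWS?"}],
    stream=True,
    # Ask the server to append a final chunk that carries token usage.
    stream_options={"include_usage": True},
)
for chunk in stream:
    # Only the final chunk carries usage; intermediate chunks have usage=None.
    if chunk.usage is not None:
        print(chunk.usage)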

👍 Expected behavior

The spans should include the prompt and completion token counts along with the traces.
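
For reference, this is roughly what I would expect to see on the span, using the usage attribute names from opentelemetry-semantic-conventions-ai (a sketch with placeholder values, assuming I have the attribute names right; not output from a real trace):

"SpanAttributes": {
    ...
    "gen_ai.usage.prompt_tokens": "<prompt token count>",
    "gen_ai.usage.completion_tokens": "<completion token count>",
    "llm.usage.total_tokens": "<total token count>",
},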

👎 Actual Behavior with Screenshots

Traces I'm getting (note that SpanAttributes contains no token-usage attributes, even though the prompt and completion content are captured):

{
        "Timestamp": datetime.datetime(2024, 6, 11, 11, 41, 7, 762861),
        "TraceId": "81d12193976454e2d9bdbdd0944c3c06",
        "SpanId": "aba567ee86a445e8",
        "ParentSpanId": "55c0ac37410e2496",
        "TraceState": "",
        "SpanName": "openai.chat",
        "SpanKind": "SPAN_KIND_CLIENT",
        "ServiceName": "llm",
        "ResourceAttributes": {"service.name": "llm"},
        "ScopeName": "opentelemetry.instrumentation.openai.v1",
        "ScopeVersion": "0.22.0",
        "SpanAttributes": {
            "traceloop.association.properties.endpoint_id": "6605069e8357cc62d841c9cd",
            "gen_ai.system": "OpenAI",
            "llm.is_streaming": "true",
            "gen_ai.prompt.0.role": "user",
            "gen_ai.completion.0.finish_reason": "stop",
            "traceloop.association.properties.node_id": "NA",
            "gen_ai.response.model": "dbrx-instruct-032724",
            "llm.headers": "None",
            "traceloop.association.properties.node_label": "NA",
            "gen_ai.prompt.0.content": "You are an AWS expert. Use the provided context related AWS re:Invent content to gather detailed information about AWS re:Invent launches. If a question is not related to the topic of AWS re:Invent, refrain from answering it. Instead, encourage the inquirer to pose questions that are relevant to AWS re:Invent topics.\n\nWhen responding to user queries, particularly those asking for statistical data or specific details about the launches, extract and summarize the relevant information. The response should be concise, accurate, and directly address the user's question.\n\nFor example, if a user asks, 'What are all the AI/ML related launches from re:Invent?', the system should:\n\nExtract key details about each AI/ML launch, such as the name of the service or feature, its purpose, and any significant attributes or innovations it brings.\nCompile this information into a coherent, comprehensive summary that answers the user's query clearly.\nAdditionally, ensure the information is up-to-date and reflects the latest re:Invent announcements. In cases where the query is ambiguous or too broad, request more specific details from the user to refine the search and provide the most relevant answer.\n###\nCONTEXT: Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI. Using Amazon Bedrock, you can easily experiment with and evaluate top FMs for your use case, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources. Since Amazon Bedrock is serverless, you don't have to manage any infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services.\n###\nQUESTION: what is aws ?\n###  \n",
            "gen_ai.completion.0.content": "AWS, or Amazon Web Services, is a comprehensive cloud computing platform offered by Amazon. It includes a variety of services, such as computing power, storage options, networking, and databases, that can be used to run applications and services in the cloud. AWS offers a broad set of tools and services, including those related to AI/ML, data analytics, and security, that can help organizations of all sizes scale and grow their operations.\n\nFor more information about AWS re:Invent launches related to AI/ML, please refer to the context provided in the prompt.",
            "traceloop.association.properties.request_id": "1718106066413861406",
            "traceloop.association.properties.service": "get_llm_chain_streaming",
            "gen_ai.request.model": "databricks-dbrx-instruct",
            "gen_ai.request.temperature": "0",
            "gen_ai.openai.api_base": "https://dbc-*********.cloud.databricks.com/serving-endpoints/",
            "gen_ai.completion.0.role": "assistant",
            "llm.request.type": "chat",
        },
        "Duration": 1987758731,
        "StatusCode": "STATUS_CODE_OK",
        "StatusMessage": "",
        "Events.Timestamp": [
            datetime.datetime(2024, 6, 11, 11, 41, 8, 57539),
            datetime.datetime(2024, 6, 11, 11, 41, 8, 68653),
            datetime.datetime(2024, 6, 11, 11, 41, 8, 83035),
....
            datetime.datetime(2024, 6, 11, 11, 41, 9, 721325),
            datetime.datetime(2024, 6, 11, 11, 41, 9, 738651),
        ],
        "Events.Name": [
            "llm.content.completion.chunk",
            "llm.content.completion.chunk",
            "llm.content.completion.chunk",
            "llm.content.completion.chunk",
....
        ],
        "Events.Attributes": [
            {},
            {},
            {},
            {},
.....
        ],
        "Links.TraceId": [],
        "Links.SpanId": [],
        "Links.TraceState": [],
        "Links.Attributes": [],
    }

🤖 Python Version

3.10

📃 Provide any additional context for the Bug.

traceloop-sdk==0.22.0
langchain==0.1.20
langchain-anthropic==0.1.13
langchain-aws==0.1.6
langchain-community==0.0.38
langchain-core==0.1.52
langchain-openai==0.0.8
opentelemetry-api==1.25.0
opentelemetry-contrib-instrumentations==0.41b0
opentelemetry-distro==0.45b0
opentelemetry-exporter-otlp-proto-common==1.25.0
opentelemetry-exporter-otlp-proto-grpc==1.25.0
opentelemetry-exporter-otlp-proto-http==1.25.0
opentelemetry-instrumentation==0.46b0
opentelemetry-instrumentation-aio-pika==0.41b0
opentelemetry-instrumentation-aiohttp-client==0.41b0
opentelemetry-instrumentation-aiopg==0.41b0
opentelemetry-instrumentation-alephalpha==0.22.0
opentelemetry-instrumentation-anthropic==0.22.0
opentelemetry-instrumentation-asgi==0.46b0
opentelemetry-instrumentation-asyncpg==0.41b0
opentelemetry-instrumentation-aws-lambda==0.41b0
opentelemetry-instrumentation-bedrock==0.22.0
opentelemetry-instrumentation-boto==0.41b0
opentelemetry-instrumentation-boto3sqs==0.41b0
opentelemetry-instrumentation-botocore==0.41b0
opentelemetry-instrumentation-cassandra==0.41b0
opentelemetry-instrumentation-celery==0.41b0
opentelemetry-instrumentation-chromadb==0.22.0
opentelemetry-instrumentation-cohere==0.22.0
opentelemetry-instrumentation-confluent-kafka==0.41b0
opentelemetry-instrumentation-dbapi==0.41b0
opentelemetry-instrumentation-django==0.41b0
opentelemetry-instrumentation-elasticsearch==0.41b0
opentelemetry-instrumentation-falcon==0.41b0
opentelemetry-instrumentation-fastapi==0.46b0
opentelemetry-instrumentation-flask==0.41b0
opentelemetry-instrumentation-google-generativeai==0.22.0
opentelemetry-instrumentation-grpc==0.41b0
opentelemetry-instrumentation-haystack==0.22.0
opentelemetry-instrumentation-httpx==0.41b0
opentelemetry-instrumentation-jinja2==0.41b0
opentelemetry-instrumentation-kafka-python==0.41b0
opentelemetry-instrumentation-langchain==0.22.0
opentelemetry-instrumentation-llamaindex==0.22.0
opentelemetry-instrumentation-logging==0.41b0
opentelemetry-instrumentation-milvus==0.22.0
opentelemetry-instrumentation-mistralai==0.22.0
opentelemetry-instrumentation-mysql==0.41b0
opentelemetry-instrumentation-mysqlclient==0.41b0
opentelemetry-instrumentation-ollama==0.22.0
opentelemetry-instrumentation-openai==0.22.0
opentelemetry-instrumentation-pika==0.41b0
opentelemetry-instrumentation-pinecone==0.22.0
opentelemetry-instrumentation-psycopg2==0.41b0
opentelemetry-instrumentation-pymemcache==0.41b0
opentelemetry-instrumentation-pymongo==0.41b0
opentelemetry-instrumentation-pymysql==0.41b0
opentelemetry-instrumentation-pyramid==0.41b0
opentelemetry-instrumentation-qdrant==0.22.0
opentelemetry-instrumentation-redis==0.41b0
opentelemetry-instrumentation-remoulade==0.41b0
opentelemetry-instrumentation-replicate==0.22.0
opentelemetry-instrumentation-requests==0.46b0
opentelemetry-instrumentation-sklearn==0.41b0
opentelemetry-instrumentation-sqlalchemy==0.46b0
opentelemetry-instrumentation-sqlite3==0.41b0
opentelemetry-instrumentation-starlette==0.41b0
opentelemetry-instrumentation-system-metrics==0.41b0
opentelemetry-instrumentation-together==0.22.0
opentelemetry-instrumentation-tornado==0.41b0
opentelemetry-instrumentation-tortoiseorm==0.41b0
opentelemetry-instrumentation-transformers==0.22.0
opentelemetry-instrumentation-urllib==0.41b0
opentelemetry-instrumentation-urllib3==0.46b0
opentelemetry-instrumentation-vertexai==0.22.0
opentelemetry-instrumentation-watsonx==0.22.0
opentelemetry-instrumentation-weaviate==0.22.0
opentelemetry-instrumentation-wsgi==0.41b0
opentelemetry-propagator-aws-xray==1.0.1
opentelemetry-proto==1.25.0
opentelemetry-sdk==1.25.0
opentelemetry-semantic-conventions==0.46b0
opentelemetry-semantic-conventions-ai==0.3.1
opentelemetry-util-http==0.46b0

👀 Have you spent some time to check if this bug has been raised before?

  • I checked and didn't find a similar issue

Are you willing to submit PR?

None

nirga (Member) commented Jul 19, 2024

Fixed with #1452

nirga closed this as completed Jul 19, 2024