Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ASGI instrumentation to better handle websockets communication #505

Open
owais opened this issue May 24, 2021 · 3 comments
Open

Update ASGI instrumentation to better handle websockets communication #505

owais opened this issue May 24, 2021 · 3 comments
Labels
bug Something isn't working triaged

Comments

@owais
Copy link
Contributor

owais commented May 24, 2021

Describe your environment
ASGI instrumentation today supports both HTTP and websockets requests but treats them the same. I think it works correctly for HTTP but needs to be changed when ti comes to websockets.

HTTP

Like all other instrumentations, ASGI instrumentation correctly generates one trace per HTTP request. Due to it's async nature, it generates 3 spans instead of 1 span that other "sync" frameworks generate. We could in theory generate a single span and update the same span when we notice async events like "send" (that send http response or parts of it back to client) but I think it's better to represent how the code actually works. IMO it is working correctly today for this use case.

Example HTTP trace representing a single HTTP request

code:

code
from opentelemetry.instrumentation.asgi import OpenTelemetryMiddleware

from starlette.middleware import Middleware
from starlette.applications import Starlette
from starlette.responses import JSONResponse
from starlette.routing import Route

async def homepage(request):
    return JSONResponse({'hello': 'world'})

routes = [
    Route("/", endpoint=homepage)
]

app = Starlette(debug=True, routes=routes, middleware=[Middleware(OpenTelemetryMiddleware)])
trace image

The two child spans represent ASGI callbacks that are responsible for sending data back to the client. The first send starts the response and sets headers. The second one returns the response body.

Web Sockets

On the other hand, websockets are generally used by applications that need to keep long lived connections open such chat applications, notification channels etc. The ASGI instrumentation treats this the same as HTTP. This results in the entire websockets connection being represented as a single trace. A chat application example with trace:

code
from opentelemetry.instrumentation.asgi import OpenTelemetryMiddleware

from starlette.middleware import Middleware
from starlette.applications import Starlette
from starlette.endpoints import WebSocketEndpoint, HTTPEndpoint
from starlette.responses import HTMLResponse
from starlette.routing import Route, WebSocketRoute


html = """
<!DOCTYPE html>
<html>
    <head>
        <title>Chat</title>
    </head>
    <body>
        <h1>WebSocket Chat</h1>
        <form action="" onsubmit="sendMessage(event)">
            <input type="text" id="messageText" autocomplete="off"/>
            <button>Send</button>
        </form>
        <ul id='messages'>
        </ul>
        <script>
            var ws = new WebSocket("ws://localhost:8000/ws");
            ws.onmessage = function(event) {
                var messages = document.getElementById('messages')
                var message = document.createElement('li')
                var content = document.createTextNode(event.data)
                message.appendChild(content)
                messages.appendChild(message)
            };
            function sendMessage(event) {
                var input = document.getElementById("messageText")
                ws.send(input.value)
                input.value = ''
                event.preventDefault()
            }
        </script>
    </body>
</html>
"""

class Homepage(HTTPEndpoint):
    async def get(self, request):
        return HTMLResponse(html)

class Echo(WebSocketEndpoint):
    encoding = "text"

    async def on_receive(self, websocket, data):
        await websocket.send_text(f"Message text was: {data}")

routes = [
    Route("/", Homepage),
    WebSocketRoute("/ws", Echo)
]

app = Starlette(routes=routes, middleware=[Middleware(OpenTelemetryMiddleware)])
trace image

The parent SERVER span represents the the chat session (ws connection) being establish. Each couple of child receive+send spans represent a single chat message being sent to the server and the response returned by the server. While this can be useful in some cases, I don't think it's representative of the vast majority of websockets use cases. I think the server span should be created to represent the connection/session but each message request/response cycle should be represented as it's own trace. In other words, the above trace should be actually be 9 traces. The parent/server span from the trace should be an independent trace while as each pair of receive+send messages should form 8 more traces representing 8 chat messages (ws messages). Each receive (request) span should become a SERVER span with the send (response) span being it's child with kind set to INTERNAL.

I'm not sure if this can be done easily given the async nature of ASGI but it should be achievable when the a higher level framework like Starlette or FastAPI are used in combination with ASGI. To enable use cases like this, ASGI instrumentation should set the current context/span on the websocket message object (a dict). Instrumentations for higher level frameworks should then instrument concepts from those frameworks (handlers, views, etc) and use the context attached to the websocket messages when available.

@owais owais added the bug Something isn't working label May 24, 2021
@cdvv7788
Copy link
Contributor

cdvv7788 commented Jun 4, 2021

@owais Possibly relevant for this issue: #467
Fastapi uses the asgi instrumentation under the hood, and it is having issues (on the http side).

@github-actions
Copy link

github-actions bot commented Jul 5, 2021

This issue was marked stale due to lack of activity. It will be closed in 30 days.

@chicoxyzzy
Copy link

I have the same issue. Is there any solution to this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triaged
Projects
None yet
Development

No branches or pull requests

4 participants