Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

graphql-ws/Client : Infinite loop when missing credentials #244

Closed
nomocas opened this issue Oct 18, 2021 · 19 comments
Closed

graphql-ws/Client : Infinite loop when missing credentials #244

nomocas opened this issue Oct 18, 2021 · 19 comments
Labels
unrelated Not relevant, related or caused by the lib

Comments

@nomocas
Copy link

nomocas commented Oct 18, 2021

Hello All,

Client graphql-ws v5.5.0
on nodejs v14.17.0

Server : Hasura v2.0.9

Expected Behaviour
I expected graphql-ws/client to show an error when server does not accept connexion because missing credentials.

Actual Behaviour
Instead it loops infinitely (without any error). So if you do not have acces to server's logs, you simply don't understand why such behaviour happen.

To reproduce : simply setup a server (by example hasura >= v2.0.8) that needs credentials, try to connect with graphql-ws/client without providing credentials. Boom : it loops.

By the way, there is nowhere in the documentation a clear explanation about how to setup credentials in the client.

For those who are looking for it (example with hasura) :

    const client = createClient({
       ...
        connectionParams: {
            headers: {
                'x-hasura-admin-secret': '...',
            },
        },
       ...
   })
@enisdenjo
Copy link
Owner

What loops infinitely? Client keeps trying to reconnect? For the client to fail right away, set the retryAttempts client option to zero (0).

The default for the client is to silently try reconnecting 5 times with randomized exponential back-off and then fail. Please check out the retryAttempts and retryWait client options.

@nomocas
Copy link
Author

nomocas commented Oct 18, 2021

Hello, thanks for the answer.

It loops indefinitely and tries to connect, really fast, and show nothings by default. It only stop because I have a timeout handler in my code that throw and break the execution.

When I add all events handlers with console logs (npm/debug statement in fact) : it show this (at a really fast rate, my terminal is instantly filled) :

hpk:graphql-ws closed +7ms
  hpk:graphql-ws connecting +0ms
  hpk:graphql-ws opened +8ms
  hpk:graphql-ws message { type: 'ping', payload: { message: 'keepalive' } } +0ms
  hpk:graphql-ws ping true +0ms
  hpk:graphql-ws pong false +1ms
  hpk:graphql-ws closed +4ms
  hpk:graphql-ws connecting +0ms
  hpk:graphql-ws opened +6ms
  hpk:graphql-ws message { type: 'ping', payload: { message: 'keepalive' } } +0ms
  hpk:graphql-ws ping true +0ms
  hpk:graphql-ws pong false +0ms
  hpk:graphql-ws closed +6ms
  hpk:graphql-ws connecting +0ms
  hpk:graphql-ws opened +6ms
  hpk:graphql-ws message { type: 'ping', payload: { message: 'keepalive' } } +0ms
  hpk:graphql-ws ping true +0ms
  hpk:graphql-ws pong false +2ms
  hpk:graphql-ws closed +8ms
  hpk:graphql-ws connecting +0ms
  hpk:graphql-ws opened +5ms
  hpk:graphql-ws message { type: 'ping', payload: { message: 'keepalive' } } +0ms
  hpk:graphql-ws ping true +0ms
  hpk:graphql-ws pong false +1ms
  hpk:graphql-ws closed +6ms
  hpk:graphql-ws connecting +0ms
  hpk:graphql-ws opened +6ms
  hpk:graphql-ws message { type: 'ping', payload: { message: 'keepalive' } } +0ms
  hpk:graphql-ws ping true +0ms
  hpk:graphql-ws pong false +0ms
  hpk:graphql-ws closed +5ms
  hpk:graphql-ws connecting +0ms
  hpk:graphql-ws opened +6ms
  hpk:graphql-ws message { type: 'ping', payload: { message: 'keepalive' } } +0ms
  hpk:graphql-ws ping true +0ms
  hpk:graphql-ws pong false +0ms
  hpk:graphql-ws closed +6ms
  hpk:graphql-ws connecting +0ms
  hpk:graphql-ws opened +6ms
  hpk:graphql-ws message { type: 'ping', payload: { message: 'keepalive' } } +0ms
  hpk:graphql-ws ping true +0ms
  hpk:graphql-ws pong false +0ms
  hpk:graphql-ws closed +8ms
  hpk:graphql-ws connecting +0ms
  hpk:graphql-ws opened +4ms
  hpk:graphql-ws message { type: 'ping', payload: { message: 'keepalive' } } +0ms
  hpk:graphql-ws ping true +0ms
  hpk:graphql-ws pong false +0ms
  hpk:graphql-ws closed +8ms
  hpk:graphql-ws connecting +0ms
  hpk:graphql-ws opened +8ms
  hpk:graphql-ws message { type: 'ping', payload: { message: 'keepalive' } } +0ms
 ....

and it continues until my timeout breaks the execution.

So it tries much more than 5 times. And it seems that the exponential backoff does not work (there is no delay at all between retries).

@enisdenjo
Copy link
Owner

Ok, can you elaborate a bit more? Whats the close reason? Can you create a small repro? Can you share your client/server implementation?

@nomocas
Copy link
Author

nomocas commented Oct 18, 2021

Here is my client implementation :

import { createClient, ExecutionResult, SubscribePayload, Client } from 'graphql-ws';
import ws from 'ws';
import { v4 } from 'uuid';
import dbg from 'debug';

export { SubscribePayload, ExecutionResult };

const debug = dbg('hpk:graphql-ws');

export function createGQLClient(url: string) {
    debug('createGQLClient(%s)', url);
    const client = createClient({
        url,
        webSocketImpl: ws,
        generateID: () => v4(),
        connectionParams: {
            headers: {
                // this should be removed to have the bug
                'x-hasura-admin-secret': process.env.HASURA_GRAPHQL_ADMIN_SECRET as string,
            },
        },
        on: {
            connected: () => {
                debug('connected');
            },
            connecting: () => {
                debug('connecting');
            },
            opened: () => {
                debug('opened');
            },
            closed: () => {
                debug('closed');
            },
            message: (msg) => {
                debug('message', msg);
            },
            ping: (msg) => {
                debug('ping', msg);
            },
            pong: (msg) => {
                debug('pong', msg);
            },
            error: (err) => {
                debug('error');
                console.error(err);
            },
        },
        onNonLazyError: (errorOrCloseEvent) => {
            debug('NonLazyError');
            console.error(errorOrCloseEvent);
        },
    });

    client.on('error', (err) => {
        debug('client error');
        console.error(err);
    });
    return client;
}

For the server, as told before : I'm using hasura v2.0.9. (so no implementation to provide)

When I provide credentials in connectionParams, there is no bug.

@enisdenjo
Copy link
Owner

Please provide me with the close reason.

import { createClient, ExecutionResult, SubscribePayload, Client } from 'graphql-ws';
import ws from 'ws';
import { v4 } from 'uuid';
import dbg from 'debug';

export { SubscribePayload, ExecutionResult };

const debug = dbg('hpk:graphql-ws');

export function createGQLClient(url: string) {
    debug('createGQLClient(%s)', url);
    const client = createClient({
        url,
        webSocketImpl: ws,
        generateID: () => v4(),
        connectionParams: {
            headers: {
                // this should be removed to have the bug
                'x-hasura-admin-secret': process.env.HASURA_GRAPHQL_ADMIN_SECRET as string,
            },
        },
        on: {
            connected: () => {
                debug('connected');
            },
            connecting: () => {
                debug('connecting');
            },
            opened: () => {
                debug('opened');
            },
-           closed: () => {
-               debug('closed');
-           },
+           closed: (err) => {
+               debug('closed', err.code, err.reason); // 👈
+          },
            message: (msg) => {
                debug('message', msg);
            },
            ping: (msg) => {
                debug('ping', msg);
            },
            pong: (msg) => {
                debug('pong', msg);
            },
            error: (err) => {
                debug('error');
                console.error(err);
            },
        },
        onNonLazyError: (errorOrCloseEvent) => {
            debug('NonLazyError');
            console.error(errorOrCloseEvent);
        },
    });

    client.on('error', (err) => {
        debug('client error');
        console.error(err);
    });
    return client;
}

@nomocas
Copy link
Author

nomocas commented Oct 18, 2021

here is the close event trace (after printing it in the closed event handler) :

hpk:graphql-ws closed CloseEvent {
  [Symbol(kTarget)]: <ref *1> WebSocket {
    _events: [Object: null prototype] {
      error: [Function],
      close: [Array],
      open: [Function],
      message: [Function]
    },
    _eventsCount: 4,
    _maxListeners: undefined,
    _binaryType: 'nodebuffer',
    _closeCode: 4400,
    _closeFrameReceived: true,
    _closeFrameSent: true,
    _closeMessage: <Buffer 49 6e 76 61 6c 69 64 20 6d 65 73 73 61 67 65>,
    _closeTimer: Timeout {
      _idleTimeout: -1,
      _idlePrev: null,
      _idleNext: null,
      _idleStart: 5835,
      _onTimeout: null,
      _timerArgs: undefined,
      _repeat: null,
      _destroyed: true,
      [Symbol(refed)]: true,
      [Symbol(kHasPrimitive)]: false,
      [Symbol(asyncId)]: 95,
      [Symbol(triggerId)]: 73
    },
    _extensions: {},
    _protocol: 'graphql-transport-ws',
    _readyState: 3,
    _receiver: Receiver {
      _writableState: [WritableState],
      _events: [Object: null prototype] {},
      _eventsCount: 0,
      _maxListeners: undefined,
      _binaryType: 'nodebuffer',
      _extensions: {},
      _isServer: false,
      _maxPayload: 104857600,
      _skipUTF8Validation: false,
      _bufferedBytes: 0,
      _buffers: [],
      _compressed: false,
      _payloadLength: 17,
      _mask: undefined,
      _fragmented: 0,
      _masked: false,
      _fin: true,
      _opcode: 8,
      _totalPayloadLength: 0,
      _messageLength: 0,
      _fragments: [],
      _state: 0,
      _loop: false,
      [Symbol(kCapture)]: false,
      [Symbol(websocket)]: [Circular *1]
    },
    _sender: Sender {
      _extensions: {},
      _socket: [Socket],
      _firstFragment: true,
      _compress: false,
      _bufferedBytes: 0,
      _deflating: false,
      _queue: []
    },
    _socket: Socket {
      connecting: false,
      _hadError: false,
      _parent: null,
      _host: 'localhost',
      _readableState: [ReadableState],
      _events: [Object: null prototype],
      _eventsCount: 2,
      _maxListeners: undefined,
      _writableState: [WritableState],
      allowHalfOpen: false,
      _sockname: null,
      _pendingData: null,
      _pendingEncoding: '',
      server: null,
      _server: null,
      parser: null,
      _httpMessage: null,
      timeout: 0,
      write: [Function: writeAfterFIN],
      [Symbol(async_id_symbol)]: 73,
      [Symbol(kHandle)]: null,
      [Symbol(kSetNoDelay)]: true,
      [Symbol(lastWriteQueueSize)]: 0,
      [Symbol(timeout)]: null,
      [Symbol(kBuffer)]: null,
      [Symbol(kBufferCb)]: null,
      [Symbol(kBufferGen)]: null,
      [Symbol(kCapture)]: false,
      [Symbol(kBytesRead)]: 670,
      [Symbol(kBytesWritten)]: 1352,
      [Symbol(RequestTimeout)]: undefined,
      [Symbol(websocket)]: undefined
    },
    _bufferedAmount: 0,
    _isServer: false,
    _redirects: 0,
    _url: 'ws://localhost:8080/v1/graphql',
    _req: null,
    [Symbol(kCapture)]: false
  },
  [Symbol(kType)]: 'close',
  [Symbol(kCode)]: 4400,
  [Symbol(kReason)]: 'Invalid message',
  [Symbol(kWasClean)]: true

So we see a message as a buffer (_closeMessage), and a reason as 'Invalid message' (Symbol(kReason))

@nomocas
Copy link
Author

nomocas commented Oct 18, 2021

Ok : after adding proper console log : it shows this while looping :

closed 1000 {"server_error_msg":"4400: Connection initialization failed: Missing Authorization header in JWT authentication mode"}

It's better. (by the way, could you add somewhere in documentation how to handle properly the error events, as this, it's really not clear how to do it).

But, we still have a infinite loop, and no exponential back-off.

@enisdenjo
Copy link
Owner

Can you first confirm that the error gets reported when setting retryAttempts to zero (0)? You should get the error in both the on.error and client.subscribe(payload, { ..., error: () => {/*here*/}, ... }).

import { createClient, ExecutionResult, SubscribePayload, Client } from 'graphql-ws';
import ws from 'ws';
import { v4 } from 'uuid';
import dbg from 'debug';

export { SubscribePayload, ExecutionResult };

const debug = dbg('hpk:graphql-ws');

export function createGQLClient(url: string) {
    debug('createGQLClient(%s)', url);
    const client = createClient({
        url,
        webSocketImpl: ws,
+       retryAttempts: 0,
        generateID: () => v4(),
        connectionParams: {
            headers: {
                // this should be removed to have the bug
                'x-hasura-admin-secret': process.env.HASURA_GRAPHQL_ADMIN_SECRET as string,
            },
        },
        on: {
            connected: () => {
                debug('connected');
            },
            connecting: () => {
                debug('connecting');
            },
            opened: () => {
                debug('opened');
            },
            closed: (err) => {
                debug('closed', err.code, err.reason); // 👈
           },
            message: (msg) => {
                debug('message', msg);
            },
            ping: (msg) => {
                debug('ping', msg);
            },
            pong: (msg) => {
                debug('pong', msg);
            },
            error: (err) => {
                debug('error');
                console.error(err);
            },
        },
        onNonLazyError: (errorOrCloseEvent) => {
            debug('NonLazyError');
            console.error(errorOrCloseEvent);
        },
    });

    client.on('error', (err) => {
        debug('client error');
        console.error(err);
    });
    return client;
}

@nomocas
Copy link
Author

nomocas commented Oct 18, 2021

Same thing with retryAttempts: 0,

It loops indefinitely... and no error handlers are triggered. None of them.

@enisdenjo
Copy link
Owner

enisdenjo commented Oct 18, 2021

I see here that the server is closing with a code 1000. Code 1000 is reserved for Normal Closure indicating a graceful close which is why the client retries immediately after receiving the 1000. Guess you're getting stuck here:

graphql-ws/src/client.ts

Lines 767 to 770 in 21e44db

// normal closure (possibly all subscriptions have completed)
// if no locks were acquired in the meantime, shouldnt try again
if (isLikeCloseEvent(errOrCloseEvent) && errOrCloseEvent.code === 1000)
return locks > 0;

The actual problem here is the server... It should NOT be using close code 1000 for erroneous closes.

FYI Hasura does not use this library as their server, they instead have their own implementation of GraphQL over WebSocket.

@nomocas
Copy link
Author

nomocas commented Oct 18, 2021

Yes, hasura have its own implementation, but, since hasura v2.0.8, it uses your transport layer when connecting with WS.

As you already know because you are the author of the related thread : hasura/graphql-engine#6264 ;)

@enisdenjo
Copy link
Owner

Yes, they use my GraphQL over WebSocket protocol. However, this does not mean they should use code 1000 for non-erroneous closes. This is something that the protocol does not need to specify since it is a part of the WebSocket standard. See close codes: https://developer.mozilla.org/en-US/docs/Web/API/CloseEvent/code.

@nomocas
Copy link
Author

nomocas commented Oct 18, 2021

When I check the error returned by hasura : it's a 4400. Not a 1000.

closed 1000 {"server_error_msg":"4400: Connection initialization failed: Missing Authorization header in JWT authentication mode"}

@enisdenjo
Copy link
Owner

It is a 1000. The 4400 is a part of the error text message.

@nomocas
Copy link
Author

nomocas commented Oct 18, 2021

Ok so hasura send two codes ?
1000 as main error code and 4400 as reason code ?

What should I do to have this fixed ?

@enisdenjo
Copy link
Owner

There is just one WebSocket close code. The 2nd one is a part of the error text message, it does not do anything, is just a message.

Am working on an issue for Hasura, please bear with me.

@nomocas
Copy link
Author

nomocas commented Oct 18, 2021

I'll do that ;)

@enisdenjo
Copy link
Owner

Haha, beat you to it: hasura/graphql-engine#7696. 😄

@enisdenjo
Copy link
Owner

I am closing this issue in favor of hasura/graphql-engine#7696. As soon as it gets resolved there, it will work as expected here.

@enisdenjo enisdenjo added the unrelated Not relevant, related or caused by the lib label Oct 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
unrelated Not relevant, related or caused by the lib
Projects
None yet
Development

No branches or pull requests

2 participants