Centrifugo v2 #221

Closed
FZambia opened this issue May 16, 2018 · 23 comments

@FZambia
Member

FZambia commented May 16, 2018

Hello dear Centrifugers.

Work on Centrifugo 2 started at the end of 2017 and is now almost done. It will serve the same purpose as Centrifugo v1 but won't be backwards compatible: migrating to it will require adapting both the backend and frontend sides of your application (if you decide to migrate, of course). The changes are not too difficult. I will try to write more information later. For now you can look at the post describing some aspects of v2 and the reasons that led to some decisions. It's not fully up to date at the moment, but the main ideas are the same.

Several highlights of v2:

  • Cleaner and more structured protocol defined in protobuf schema
  • Binary Websocket support (Protobuf). JSON is still there, of course
  • JWT for authentication instead of hand-crafted HMAC sign
  • GRPC client transport (not for browser) (see below)
  • Prometheus integration and automatic export of stats to Graphite
  • Refactored Javascript (ES6), Go and gomobile client libraries.
  • Simplified API auth (got rid of request body signing)
  • GRPC for server API
  • Structured logging
  • Mechanism to merge several Websocket messages into one
  • Better recovery algorithm to fix several recovered flag false positives
  • Goreleaser for automatic releases to Github (previously I had to upload everything manually)
  • Based on new library centrifuge for Go language

Some things were removed from Centrifugo in v2 release:

  • publishing over Redis queue
  • admin websocket endpoint
  • client limited channels
  • websocket prepared message support

Some things you can help with as it's really hard to do everything myself:

All these tasks require that you are already familiar with Centrifugo or want to dive deeper, as you need to understand how things work internally.

Over the next few days I am planning to work on the docs - most of them must be written from scratch, so I don't know how much time it will take. The docs prototype is located here. Centrifugo v2 itself is in the c2 branch.

At the moment I am looking for developers who are using Centrifugo and want to review Centrifugo v2 in its alpha and beta state. If you ever wanted something backwards incompatible to be added to the Centrifugo core, this is the right moment to say so. Please contact me here, over email or on Gitter.

@arrowcircle

Hey! Great news!
Where can I find the protocol changes to make the Ruby lib compatible?

@FZambia
Member Author

FZambia commented May 17, 2018

@arrowcircle hi!

I just wrote a chapter in the new docs about the API. In short, it's just a POST request with a JSON body to the /api endpoint and an optional API key set via the Authorization header. No signing is needed anymore. This commit to the Python cent library adapts the client for use with the new Centrifugo - it can help to understand which changes are needed. Also note that token was renamed to sign, and timestamp was renamed to exp with changed semantics (it's now the connection expiration time as a Unix timestamp in seconds instead of the current time in seconds). So helper functions will change a bit too.

I think most of the things are pretty final, though they can still change a bit after some feedback.
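For library authors, the request shape described above could be sketched in Go roughly as follows. This is only an illustration: the `{"method": ..., "params": ...}` body layout, the `apikey` prefix in the Authorization header, and the names `apiCommand` and `newAPIRequest` are assumptions that do not come from this thread, so check the new docs before relying on them.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// apiCommand mirrors the assumed {"method": ..., "params": ...} body shape.
type apiCommand struct {
	Method string      `json:"method"`
	Params interface{} `json:"params"`
}

// newAPIRequest builds (but does not send) a POST request to the /api endpoint.
// No request body signing is involved.
func newAPIRequest(baseURL, apiKey, method string, params interface{}) (*http.Request, error) {
	body, err := json.Marshal(apiCommand{Method: method, Params: params})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if apiKey != "" {
		// The API key is optional; the "apikey " prefix is an assumption.
		req.Header.Set("Authorization", "apikey "+apiKey)
	}
	return req, nil
}

func main() {
	req, err := newAPIRequest("http://localhost:8000", "my-api-key", "publish", map[string]interface{}{
		"channel": "news",
		"data":    map[string]string{"text": "hello"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String(), req.Header.Get("Authorization"))
}
```

Sending the request is then a plain `http.DefaultClient.Do(req)` call.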

@FZambia
Member Author

FZambia commented Jun 5, 2018

So just to give some info about the v2 status - at the moment I am trying to solve two questions:

  1. Does a GRPC client transport based on bidirectional streaming have benefits over Websocket for Centrifugo use cases? My first measurements showed that Websocket is better in all aspects (server CPU, server memory, traffic) for our use cases. It is possible that the GRPC client transport won't be included in the initial release, and there is a chance that it won't be used at all.

  2. I want to find a better algorithm for message recovery after disconnect, particularly for the case when there were no active messages in the history cache and the client reconnects. For this case Centrifugo can't say exactly after reconnect whether all messages were recovered or not (the recovered flag is false in the subscription response). The idea is to consider all messages recovered if the disconnect lasted no longer than history_lifetime and no more than history_size messages appeared in the cache.

@FZambia
Member Author

FZambia commented Jun 14, 2018

I removed GRPC bidirectional streaming client transport because:

  • GRPC requires more memory on server (4x compared to Websocket)
  • GRPC generates more traffic via interface than Websocket with protobuf (~20-30% more for Centrifuge protocol)
  • GRPC is much more CPU hungry on server side (2x-3x)

It's still possible to put it back in future if we find its advantages in some scenarios. Note that GRPC for server API is still here.

Also improved message recovery - new docs here https://centrifugal.github.io/centrifugo/server/recover/

@masterada

Hello,

Great work. After browsing through the code I have some thoughts (I have never used Centrifugo before, I'm just now checking the project to see if it fits my use case).

I see that the Engine is no longer pluggable.

  • Is there a reason for unexporting the engine methods?
  • Is there a reason for removing the plugin.go? I understand it doesn't make much sense in the library code, but I think it could still be included in the server project. Registering a new plugin and using it from config is much cleaner in my point of view than forking the server project to change 1 line for using a different engine.
  • I also realized that the engine interface consists of 3 parts: PUB/SUB mechanics, channel history and presence information. I think it could be separated into 3 different interfaces with minimal effort (history saving would need to be moved to a dedicated addHistory method, which could be called from the Node.publish method). RedisEngine could still implement all 3 interfaces, but pub/sub, history and presence handling could be swapped out independently.

My use case:

I need to write special presence handling. In v1 I could write a custom engine that has an embedded RedisEngine struct with the presence methods overridden. Now, with the Engine interface methods being unexported, it's no longer possible.

Another note:

It would be really nice to have a mute client in channel feature in the server API, resulting in that client not getting the messages.

My use case:

Free clients join a channel. One of them starts paying. This one client will receive slightly different notifications on that channel. I know it's possible to leave that channel and join another for the paying session, but it would need to be initiated on frontend, and it would be really nice to be able to solve this on backend only. An alternate solution could be to be able to add "except_clients" id list to publish/broadcast messages.

@FZambia
Member Author

FZambia commented Jun 14, 2018

Hello @masterada! Thanks for the great feedback!

Is there a reason for unexporting the engine methods?

Yes, the reason is that I don't know of any other Engine implementations or their requirements, so I decided to approach this with caution. My final goal is to make the engine interface fully exported and pluggable in the Centrifuge lib, but I don't want to export things right now so that I don't break the public API later. So if someone is interested in having the engine exported, we can find a proper way and moment to export it. Also see below.

Is there a reason for removing the plugin.go?

As Centrifugo will now use the Centrifuge lib, it's not that difficult to plug in whatever a developer wants. From my point of view this makes the library much more manageable and easier to maintain. Regarding the Centrifugo server implementation: adding a new plugin using code from plugin.go required the developer to rebuild the binary anyway. So I think there is not much difference in possibilities, but the code is much cleaner now. Also, in version 2 I tend to remove parts that seem hacky to me and not globally useful - this is one of them.

I also realized that the engine interface consists of 3 parts...

I also noticed this and I actually have a secret gist about it. The problem with 3 parts is that it generally looks cleaner and more flexible but is not justified by reality, where we only have 2 main engines with everything done either in memory or in Redis, so this separation can be a bit overkill.

If you look at my gist you will see that PUB/SUB mechanics are combined with channel history in the Broker interface. That's because from a performance and atomicity perspective it's a great win to save a message into history in the publish method of the PUB/SUB broker - in the case of Redis it allows doing this in one RTT to Redis (via a Lua script). I suppose there is some way to separate the engine into parts but still keep this property, but I just had no time and no use case to investigate this further and find a more correct and elegant component design than I already have. Personally I am for this separation - it's just not that simple.

Btw, this topic about correct engine separation is one of the reasons I don't want to export Engine interface right now.

It would be really nice to have a mute client in channel feature in the server API, resulting in that client not getting the messages.
Free clients join a channel. One of them starts paying. This one client will receive slightly different notifications on that channel.

Could you elaborate more about this - why not using 2 different channels for this?

Actually I thought many times about having a server-side Subscribe() method in the Centrifuge library (not in Centrifugo for now, while there are no hooks to communicate with the backend) so the backend could subscribe a client to channels itself. But I have not yet found an elegant way to integrate this into the protocol and existing client libraries. I see that you have figured out Centrifuge/Centrifugo internals pretty well, so maybe you will have some ideas on this.

@FZambia
Member Author

FZambia commented Jun 14, 2018

I'll try to elaborate more on my points above as some of my thoughts were pretty chaotic.

As far as I understand, you are suggesting something like this:

type Engine interface{
    Broker
    HistoryKeeper
    PresenceKeeper
}

Both the Memory and Redis engines will implement all methods of the Engine interface and thus will work. And if someone wants to switch a component, it will be possible to call something like node.SetBroker(BrokerImplementation), and control over PUB/SUB mechanics will be passed to this component.

In Node publish we can call:

node.historyKeeper.addHistory(...)
node.broker.Publish(...)

Instead of

node.engine.Publish(...)

If you look at the Publish method of the Redis engine you will see that it publishes to a channel and saves history in one RTT to Redis. This is a property I want to keep for the Redis engine. The first idea is to make addHistory a no-op in the Redis engine, but this means that the Redis engine can't be used as a history keeper if we swap the PUB/SUB broker for something else. The solution is to make it configurable: a no-op addHistory in one case and an addHistory that saves history in the other. This is not very beautiful.

Regarding muting and except_clients - your case can be solved by subscribing to two different channels, to both even if the client has not started paying: you just don't publish new messages into the paid channel until the right moment. Maybe there is a problem that I just don't see.

Regarding server-side subscribe: it's possible to subscribe on the server side, but the client will not have a callback handler set to process messages coming from the channel. There is also a question about message recovery - I can't imagine how to fit it into this model, so it looks like this must be a task for application code in this case.
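The configurable no-op described in this paragraph can be illustrated with a toy engine. Everything here is hypothetical (`fakeEngine`, `saveOnPublish`, the exported method names); it only shows why the same engine needs two modes once history saving and publishing are called separately from Node.

```go
package main

import "fmt"

// Broker and HistoryKeeper are hypothetical slices of the Engine interface.
type Broker interface {
	Publish(channel string, data []byte) error
}

type HistoryKeeper interface {
	AddHistory(channel string, data []byte) error
}

// Node calls the history keeper first and the broker second.
type Node struct {
	broker  Broker
	history HistoryKeeper
}

func (n *Node) Publish(channel string, data []byte) error {
	if err := n.history.AddHistory(channel, data); err != nil {
		return err
	}
	return n.broker.Publish(channel, data)
}

// fakeEngine stands in for an engine implementing both interfaces.
// With saveOnPublish set it stores history inside Publish (modelling the
// single round trip to Redis), so AddHistory must become a no-op.
type fakeEngine struct {
	saveOnPublish bool
	history       map[string][][]byte
	published     int
}

func newFakeEngine(saveOnPublish bool) *fakeEngine {
	return &fakeEngine{saveOnPublish: saveOnPublish, history: map[string][][]byte{}}
}

func (e *fakeEngine) Publish(channel string, data []byte) error {
	if e.saveOnPublish {
		e.history[channel] = append(e.history[channel], data)
	}
	e.published++
	return nil
}

func (e *fakeEngine) AddHistory(channel string, data []byte) error {
	if e.saveOnPublish {
		return nil // no-op: Publish already saves history
	}
	e.history[channel] = append(e.history[channel], data)
	return nil
}

func main() {
	for _, combined := range []bool{true, false} {
		e := newFakeEngine(combined)
		n := &Node{broker: e, history: e}
		if err := n.Publish("news", []byte("hello")); err != nil {
			panic(err)
		}
		fmt.Println(combined, len(e.history["news"]), e.published)
	}
}
```

In both modes a publish leaves exactly one history entry. If the flag is set wrongly, history is either duplicated or lost, which is the configuration burden the paragraph calls not very beautiful.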

@masterada

Thanks for the detailed explanation. I completely understand your reasons for not wanting to sacrifice performance for a feature that might not even be needed (separating broker from history).

About subscribing clients on server side

Let's assume a js client subscribed to a public:news channel. He is handling "news" type messages. It might not even make sense to subscribe him to a public:groceries channel, because the client would need to handle these new types of messages, so we might as well just instruct the client to subscribe to public:groceries channels by itself.

On the other hand, it might make sense to subscribe the client to a gossips:news channel. It has the same kind of messages, the client already handles them, the only difference is that now the client will get more messages of the same kind. However it would still be confusing, the client subscribed to public:news, and suddenly it starts getting messages from gossips:news. I don't see a good, non-confusing way of implementing this, so let me suggest a different approach.

Message tagging

When publishing a message, there is an optional tag parameter. Each subscription (user-channel combination) has its own tags. When forwarding a message to a user, only forward it if the subscription has a tag matching the message's tag. Configuration could contain default tags (for namespaces) that are automatically added to each new subscription. Tags could be managed on either the client side or the server side (with the option to disable client-side tag management).

So instead of:

  1. client subscribes to public:news
  2. if client has access to gossips, he subscribes to gossips:news as well (with proper authentication)

You could do:

  1. client subscribes to news (and the subscription gets the default public tag)
  2. if client wants to read gossips as well, client tags the subscription with gossips

Or:

  1. client subscribes to news (and the subscription gets the default public tag)
  2. backend tags the subscription with gossips tag (calling something like /tag?user=<USER ID>&channel=news)

It would solve my use case as well:

  1. client connects to channel, gets notifications (and the subscription gets the default free tag)
  2. backend removes the free tag and adds the paying tag

It's not the same as subscribing the client from backend, but I think this could be easier to use than using multiple subscription from the client to get public and access restricted messages of the same type.
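The tag-matching rule proposed above can be sketched in a few lines of Go. All types and names are hypothetical, and delivering untagged publications to everyone is an extra assumption the proposal does not spell out.

```go
package main

import "fmt"

// Publication carries an optional tag, as in the proposal.
type Publication struct {
	Data string
	Tag  string // empty means the publication is untagged
}

// Subscription is a user-channel combination with its own set of tags.
type Subscription struct {
	User string
	Tags map[string]bool
}

// setTags replaces the tag set; the backend specifies the new tags exactly.
func (s *Subscription) setTags(tags ...string) {
	s.Tags = make(map[string]bool, len(tags))
	for _, t := range tags {
		s.Tags[t] = true
	}
}

// shouldDeliver forwards a publication only if the subscription has a
// matching tag. Untagged publications go to everyone (an assumption).
func shouldDeliver(sub *Subscription, pub Publication) bool {
	if pub.Tag == "" {
		return true
	}
	return sub.Tags[pub.Tag]
}

func main() {
	sub := &Subscription{User: "42"}
	sub.setTags("free") // default tag on subscribe

	paid := Publication{Data: "premium tip", Tag: "paying"}
	fmt.Println(shouldDeliver(sub, paid)) // still a free user

	sub.setTags("paying") // backend swaps the free tag for the paying tag
	fmt.Println(shouldDeliver(sub, paid))
}
```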

Regarding your case can be solved subscribing on two different channels...: it does not work if there can be more than 1 client on the same paying channel (it's doable with user restricted channels but a bit more complicated). Also it's problematic for me to get the free messages as well while the user is in paying status (can be solved by filtering on the frontend, but again, more complex).

Of course all this is just a suggestion. If you like it, I can help with the implementation. If you don't, it will still be a great project :D

@FZambia
Member Author

FZambia commented Jun 15, 2018

Possible solutions

Still not sure I understand your difficulties right.

  1. You can have 2 channels - one for free events and one for paid events. As soon as user starts paying it subscribes on paid channel stream and receives both free and paid events from 2 streams. And on client side you have the same publication handler for events from both channel subscriptions.

  2. Another option you mentioned in your first post - resubscribing on paid channel as soon as user starts paying. In this case on backend you publish free events to both channels (free channel and paid channel) and paid events only to paid channel. So you have 2 separate streams - one for free users and one for paid users.

Tags

At the moment Centrifugo assumes identical data for each channel - this is especially important in terms of history/recovery. Though we can still keep the full channel history but do server-side filtering based on tags before sending a message or history data to a client. Also, Centrifugo is designed in a way that all state required to subscribe comes from the client side, so the server can be just an in-memory message proxy. Do I understand correctly that when you mention /tag?user=<USER ID>&channel=news you mean an AJAX request to the backend (similar to what we have with private channels)?

I suppose yes, because if you mean server-side integration of Centrifugo and the backend via hooks, then this idea fails quickly (for example on restart of a node with the Memory engine: we can afford losing message history, but losing tag information is critical). So the only way to keep tags is to always pass them from the client in the subscription request.

This is actually an interesting idea that could theoretically allow some interesting channel configurations.

Adding more frontend-backend integration points (like private channel authorization now) will be hard to maintain in client libraries. So maybe this can be included in the private channel subscription workflow? At the moment we have channels starting with $ - private channels. Every time a client wants to subscribe to a private channel, a request is sent via AJAX (in the case of a browser) to the backend, which provides a sign for this subscription request. We could theoretically inject tags at this stage on the backend as part of the private channel subscription request. Those tags must be included in the signing process so the client can't cheat on tags. In the case of the Centrifuge library, tags fit pretty well and can simply be set by application backend code inside the onSubscribe handler.

The channel remains the same: a Publication will be delivered to all nodes subscribed to the channel, and the server will do extra filtering based on client subscription tags before actually sending the message to the client connection. History for the channel will be kept in full inside the engine, and every time a client asks for history/recovery we can do extra filtering on the Centrifuge/Centrifugo side to only provide history that relates to the client's tags. This is possible at the moment because a client can only ask for history after a subscribe request.

One caveat: this requires resubscription when user subscription tags must be changed (i.e. the user becomes a paid user and has to receive paid events). At this point this looks similar to point 2 from the possible solutions section above. Though at least you can publish new events into one channel (though to me it seems not too bad to publish into 2 different channels).

It's difficult to tell at the moment whether there are other caveats that will be found when we try to add this in code - it's hard to keep everything in one's head. So a proof of concept is required, and this should not affect performance when tags are not used. And of course I need to understand that tags solve a problem that is hard (or not performant) to solve without them. Or maybe find some other applications for this feature.
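To illustrate what including tags in the signing process could mean, here is a sketch that covers the client ID, channel and tags with one HMAC-SHA256, so that changing any tag invalidates the sign. It is not the Centrifugo sign format: the field encoding and the names `signSubscription` and `verifySubscription` are made up for this example.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// signSubscription returns a hex HMAC-SHA256 over client ID, channel and tags.
// Fields are length-prefixed so that different inputs cannot produce the same
// byte stream; tags are sorted so their order does not matter.
func signSubscription(secret []byte, clientID, channel string, tags []string) string {
	sorted := append([]string(nil), tags...)
	sort.Strings(sorted)
	mac := hmac.New(sha256.New, secret)
	for _, field := range append([]string{clientID, channel}, sorted...) {
		fmt.Fprintf(mac, "%d:%s;", len(field), field)
	}
	return hex.EncodeToString(mac.Sum(nil))
}

// verifySubscription recomputes the sign on the server and compares it in
// constant time with the one presented by the client.
func verifySubscription(secret []byte, clientID, channel string, tags []string, sign string) bool {
	expected := signSubscription(secret, clientID, channel, tags)
	return hmac.Equal([]byte(expected), []byte(sign))
}

func main() {
	secret := []byte("secret")
	sign := signSubscription(secret, "client-1", "$news", []string{"free"})

	// The honest request verifies; a client that swaps in another tag does not.
	fmt.Println(verifySubscription(secret, "client-1", "$news", []string{"free"}, sign))
	fmt.Println(verifySubscription(secret, "client-1", "$news", []string{"paying"}, sign))
}
```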

Summary 😀

It's pretty hard to discuss this on Github, because I have a feeling that I still don't understand your use case correctly and am suggesting unviable solutions :) What are your thoughts on my points here? If you feel that I don't understand you correctly, then maybe we could discuss this in chat on Gitter.

@masterada

masterada commented Jun 16, 2018

You are right, I didn't think about the issue of persisting tag information.

I will try to clarify my circumstances:

  1. We are developing a platform, and want to keep the usage of this platform as simple as possible. That means as simple frontend code as possible.
  2. Change between paid and free status is not always initiated by client. It might come from a backend event (eg user runs out of money).

Point 2 means the only viable workflow is the following:

  1. client subscribes to a user restricted operation channel (so we can notify it to join/leave paid/free channels)
  2. user joins free channel
  3. when a backend event needs to trigger a subscription change, it sends a message on the operation channel (either sending the sign here that the client can use to join the paid channel, or instructing the client to request a sign and then join the channel)

(This is very similar to your suggestions. )

In order to keep this simple, anything more than a one-time subscription to 1 or more channels is unacceptable.

I can of course solve this issue by providing my own library that wraps the centrifuge js client and adds the above mentioned functionality, hiding these details from the users of our platform (and by users i mean frontend developers).

So to summarize:

  • If backend side subscription works, but it requires extra effort from frontend developers, it's a no-go for me.
  • If tags work, but it requires extra effort from frontend developers, it's a no-go for me.
  • Temporary unsubscription (aka muting) could work (if there is no client side logic associated with it), but I now see it has the same issue as tags (the need to persist muted state)

With tags the following workflow could be implemented in the centrifuge server and client:

  1. backend sets tag via the server api (it's not an add/delete, backend must specify the new tags exactly)
  2. server notifies client to request a tag change, providing it with a sign to do so (a sign that's based on all newly active tags)
  3. client library updates tags seamlessly
  4. client library saves the new tag sign for the channel, so it can use it for reconnect

Could even hide the tags feature from the client library by sending the client updated subscription data + sign (with tag info that the client doesn't need to know about), which it uses to upgrade its existing subscription (and for later reconnect if needed).

But this complex client side logic might not be worth the feature. It's always a tough call to draw the line between features and simplicity :)

One more thing that popped into my mind while writing this: have you considered using JWT? It's a standardized solution that encapsulates data and its signature, basically used for the same purpose as you use the signs for.

@FZambia
Member Author

FZambia commented Jun 16, 2018

A quick question - from your post I did not understand: is one subscription to a private channel acceptable for you? I.e. 1 subscription to a channel that needs to request the backend for a sign and possibly tags (the request to the backend will be sent every time the user subscribes)?

@masterada

masterada commented Jun 16, 2018

Yes, if it's a one time thing (eg: during site load on traditional websites, or opening a page in a single page application). What's not ok is handling the free/paid status change in the 3rd party code in any way. In other words: if the developer who uses our platform needs to write any code that reacts to the paid/free status change, that's not ok.

I want to completely hide from the platform's users the fact that there is even a free/paid status. I want them to subscribe to one channel and keep processing the messages without caring about whether they are free or paid. If in the backend it's solved by 2 channels I don't care, I just want to hide this implementation detail completely.

@FZambia
Member Author

FZambia commented Jun 16, 2018

Yep, thanks! I considered JWT before, but it seemed hard to support across languages. Actually, Centrifugo was born before JWT gained its popularity. Now it looks like there are tons of libs implementing the RFC spec, so this looks reasonable. It still needs a bit of investigation, as every lib has its own API to generate tokens - hopefully the resulting string is spec compliant and the Go server can verify and decode it regardless of the language that was used to generate it :)

It also seems that using JWT will simplify integration with Centrifugo where we don't have helper libraries, and make us more flexible when we want to add features to Centrifugo-specific data (like tags from this discussion), because at the moment we have to add this to all helper libraries.

Back to tags. Adding more stuff to the protocol, like updating subscription state, seems a very complex solution. It's possible to implement, but you are right that it makes things more difficult and hard to debug. Surely there could be a better way. Some ideas:

  • use the disconnect API command to disconnect the user. In this case the client will automatically reconnect and thus will have a chance to get the actual tags from the backend during the private channel subscription process. The downside is that it will reconnect with a delay, but I think it's possible to add new fields to the disconnect command like reconnect_delay: true, reconnect_after: 0 to control disconnect behaviour.
  • use the unsubscribe API command with a new field that will tell the client that it must unsubscribe and then subscribe again (something like resubscribe: true), so it will get the actual tags from the backend during the private channel subscription process.

Both approaches never guarantee delivery (as Centrifugo is an at most once delivery transport) but should work in practice under normal circumstances. And actually your suggested approach of updating subscription state has the same guarantees.

Does this make sense to you?

@FZambia
Member Author

FZambia commented Jun 16, 2018

BTW this all can be paired with connection check mechanism to ensure valid client state.

Update: no, this is wrong as connection check does not operate with subscriptions.

@FZambia
Member Author

FZambia commented Jun 16, 2018

I investigated JWT a bit - looks like it suits pretty well. Generated token in Python:

jwt.encode({"user": "42", "exp": 121010101010, "tags": ["a", "b"]}, key="secret")

Then decoded in Go:

package main

import (
	"fmt"

	"github.com/dgrijalva/jwt-go"
)

type ConnClaims struct {
	User string   `json:"user"`
	Info string   `json:"info"`
	Tags []string `json:"tags"`
	jwt.StandardClaims
}

func main() {
	s := "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoiNDIiLCJleHAiOjEyMTAxMDEwMTAxMCwidGFncyI6WyJhIiwiYiJdfQ.fUoNhGoYgXwJd9D9K_hloFo0MkwUgQyIrDQJDN0Akp8"

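	// The key function returns the HMAC secret used to verify the signature.
	token, err := jwt.ParseWithClaims(s, &ConnClaims{}, func(token *jwt.Token) (interface{}, error) {
		return []byte("secret"), nil
	})
	if err != nil {
		// token can be nil when parsing fails, so check the error first.
		fmt.Println(err)
		return
	}

	if claims, ok := token.Claims.(*ConnClaims); ok && token.Valid {
		fmt.Printf("%v %v %v\n", claims.User, claims.Tags, claims.StandardClaims.ExpiresAt)
	}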
}

An interesting idea here is adding tags to the user connection itself instead of the subscription. This will allow setting tags on connect and filtering publications based on user tags rather than subscription tags. This is less flexible in general but makes it possible to avoid private subscriptions. The only problem here is updating tags on the fly. This is easy to do during the connection check request. But to change tags immediately after they change on the backend, some sort of signal is required - maybe a new API refresh command that will force active user connections to refresh the token from the backend, thus updating tags. Maybe something else? In this case it looks like it can be paired with the connection check to ensure valid tag state.

@masterada

masterada commented Jun 17, 2018

I read the centrifuge js lib code and had the exact same thought - using refresh to update the tags, and an option to force client refresh from the server API. I don't think it's an issue to have user scoped tags instead of subscription tags - it's still possible to prefix the tag name with the channel name if needed. It might be a good idea though to make guest tags configurable (a static list of tags that apply to guests).

If you are looking at jwt, I suggest you check out go-jose instead. It implements all of jws, jwe and jwt (jwt-go only implements jws + jwt), even though you will probably not need the jwe part. I also found it a bit easier to use. Here is an example of usage (parse+validate).

We use jwt in php, go and nodejs, so far the only difficulty we ran into is that some libraries accept the key in base64 format (eg: php), while others use it as-is (eg: go). It caused us some headache :)

@FZambia
Member Author

FZambia commented Jun 27, 2018

@masterada I've created pull request to https://github.com/centrifugal/centrifuge (#6) with JWT support.

I had time to think more about the tags idea while adding JWT. In general I still like what tags can provide in terms of channel configuration. But our final implementation ideas here are not very robust, unfortunately.

Imagine a situation where tags are set via the user connection token. Then at some moment the tags change. If the user is offline at this moment, he won't get the updated tags and will reconnect with the same token after going online (if the token has not expired). Not asking for a token on reconnect is important in order not to DDoS the application backend with CPU intensive tasks (for example when a Centrifugo node restarts). This means that the user will have old tags until the next token refresh. Maybe we should just provide an option to refresh the token on every client reconnect.

From this perspective, having tag information in the private subscription token is more robust, as the private subscription token is requested every time the client resubscribes. This means that on Centrifugo node restart there will be lots of private subscription requests after every client reconnect. But this is a reasonable compromise that we already had before; people use this, and not everyone actually uses private subscriptions. But to update tags on the fly some sort of signal is required (disconnect/resubscribe maybe), and it looks like subscription token refreshing on expiration is also a good idea. But this requires quite a lot of work - not sure I can spend time on this at the moment. It seems possible to add it at any moment later.

So I am not sure about best way to add this feature yet.

@FZambia
Member Author

FZambia commented Jun 27, 2018

If you are looking at jwt, I suggest you check out go-jose instead. It implements all of jws, jwe and jwt (jwt-go only implements jws + jwt), even though you will probably not need the jwe part. I also found it a bit easier to use. Here is an example of usage (parse+validate).

In the Centrifuge case we have to handle token expiration in a special way to support the refresh workflow. I looked at go-jose and have not found a straightforward way to check that the only problem with a token is that it's expired.

@masterada

I see your point about tags and refresh. Still, tags could be a private channel only feature.

I decided I will use the centrifuge library in a new project, because I will need to change private channel subscriptions very often, and I think the short refresh interval I would need to do this with token style is more of a performance overhead than using backend webhooks from centrifuge server for authentication. I will try to include the minimal code to be able to support tags with the library, and create a pull request. But before that I need to dig in some more :)

@FZambia
Member Author

FZambia commented Jul 4, 2018

@masterada ok, feel free to ask any questions on Gitter and via personal messages if you prefer. As you can see I was able to implement subscription expiration - implementation is not ideal but I think it's pretty sufficient for this moment.

@FZambia
Member Author

FZambia commented Jul 17, 2018

Centrifugo v2.0.0-alpha.2 was just released - this is the first public pre-release, hope someone will give it a try and share feedback.

@FZambia
Member Author

FZambia commented Aug 8, 2018

Centrifugo v2.0.0-beta.1 released

@FZambia
Member Author

FZambia commented Sep 18, 2018

So Centrifugo v2 is released - release notes are here. Thanks a lot to everyone who helped during development: @masterada @mogol @Inpassor @furdarius @wlredeye and others.

There are still lots of things to do in the transition to v2 - updating the remaining libraries, several examples that still use v1, fixing bugs (surely there are some). But the important step has just been made :)

@FZambia FZambia closed this as completed Sep 18, 2018