Network communication design #530

TheJJ · 2016-03-20T00:47:16Z

tl;wr: the server does all calculations, clients only render the state they received. state is transmitted by keyframes at "predicted" end points.

The design is pretty much like one would do it for an FPS.
The basic idea is this: The server sends a packet which updates the "target" state of the future for clients. They interpolate the received movement/update/... functions and try reaching the target state all the time.

server
------

* central trust instance for the game state.
* does all the calculations based on input events.
* create and distribute keyframes for all clients


clients
-------

* receive keyframes
* calculate world state for current time by interpolating keyframes
* display world state and fancy animations etc.
* send actions to server (timestamp is probably a bad idea)
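
A minimal sketch of the "interpolate keyframes" step on the client, assuming a keyframe is just a timestamped 2D position and that interpolation is linear. The `Keyframe` and `Position` types and the `interpolate` function are made up for illustration; nothing here is decided API.

```cpp
#include <cassert>

// Hypothetical keyframe: a position the server predicts for a given game time.
struct Keyframe {
    double time;  // game time in seconds
    double x, y;  // predicted position at that time
};

struct Position {
    double x, y;
};

// Linearly interpolate between two keyframes for the current time.
// Times outside [from.time, to.time] are clamped to the nearest keyframe.
Position interpolate(const Keyframe &from, const Keyframe &to, double now) {
    assert(to.time > from.time);
    double t = (now - from.time) / (to.time - from.time);
    if (t < 0.0) t = 0.0;
    if (t > 1.0) t = 1.0;
    return {from.x + (to.x - from.x) * t, from.y + (to.y - from.y) * t};
}
```

With keyframes at 0s and 2s, asking for the state at 1s gives the midpoint of the two positions.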

This is the low-level part for transmitting the results of the simulation itself.

To understand how the prediction works, see #740.


janisozaur · 2016-05-30T12:40:04Z

In https://github.com/OpenRCT2/OpenRCT2 we have added multiplayer capability to the game, and included a simple lobby, master server, which lists online public games.

At first there was no authentication, but people abused publicly available servers to destroy maps. We then added the ability to password-protect servers and assign users to groups with specified permissions, but that turned out not to be enough, as we could not make those permissions persist. Our plan was to include a centralised authorisation server, but this met substantial resistance, as it was "against the spirit of open source", even though we wanted to publish the code for running your own auth service.

The solution we came up with was to use OpenSSL's public-key cryptography to generate a key on the client and have the client sign a server-generated token. The client then sends the signature to the server, which verifies it; if that step is successful, the client is granted the permissions it was assigned last time.

If you would like to employ similar scheme in your project, you can use OpenRCT2/OpenRCT2#3699 as a reference.

TheJJ · 2016-05-30T12:54:23Z

Thanks for the hints, doesn't sound bad. But this issue is rather about the simple server-client interaction, without all the lobby stuff. I just opened another issue for that: #562. I don't think we'll have a griefing problem, because games are rather short-lived and only spectators should be able to join afterwards. For client authentication, we thought about using GPG, so achievements can be issued to key owners, for example. Match rejoins after a disconnect can also be performed securely that way. But it's not that much of a difference to OpenSSL, except maybe licensing fuckup.

janisozaur · 2016-05-30T13:22:22Z

When I was designing the system, I found out about https://keybase.io/. It looks nice and is something we could probably have used, but it's still in invite-only mode. If you use crypto keys anyway, it's perhaps worth taking a look at.

timo-42 · 2016-09-15T22:56:42Z

I tried to understand the proposed netcode:
flow chart
dia file for editing

sources:
aoe2 netcode: we don't want it (cheaters, performance problems)
our netcode inspiration

janisozaur · 2016-09-16T06:11:28Z

One interesting bit in this diagram is limiting network messages clients receive to only what they can see. This is a very sound approach, but one that could probably make syncing game state much much harder. I'm eager to see how you guys solve that.

Another piece of documentation you may perhaps find useful or a source of inspiration would be https://github.com/OpenTTD/OpenTTD/blob/master/docs/desync.txt

Have you already given any thought to what your network stream would look like? Are you going to use something to encapsulate it, like protobuf?

timo-42 · 2016-09-16T12:25:24Z

protobuf looks like exactly what we need for sending messages. It seems simple enough that all developers can use it without knowledge of the network system. You need something synced to the client? No problem: create a message format and send it.

desync: we should not have problems with desync. The server is authoritative. We could do something like:
hash over all 20 units => send it to the server; if something is wrong, the server sends the full unit information for these 20 units again
therefore there is no "butterfly effect"; we can always get the true state (buildings, techtree, units, world) from the server
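
A rough sketch of how such a group hash could look, assuming a tiny integer-only `UnitState` (the real unit state has far more fields) and FNV-1a as the hash. Both are picked for illustration only, and `needs_resync` is a hypothetical name. The client and server would have to feed the fields in the same order for the hashes to be comparable.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical minimal unit state; integers only, so the hash input is
// identical on every machine.
struct UnitState {
    std::uint32_t id;
    std::int32_t x, y;
    std::int32_t hp;
};

// Mix one 32-bit value into a 64-bit FNV-1a hash, lowest byte first.
inline std::uint64_t fnv1a_mix(std::uint64_t h, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 0x100000001b3ull;  // FNV-1a 64-bit prime
    }
    return h;
}

// Hash a whole group of units (e.g. 20 of them) in order.
std::uint64_t hash_units(const std::vector<UnitState> &units) {
    std::uint64_t h = 0xcbf29ce484222325ull;  // FNV-1a 64-bit offset basis
    for (const UnitState &u : units) {
        h = fnv1a_mix(h, u.id);
        h = fnv1a_mix(h, static_cast<std::uint32_t>(u.x));
        h = fnv1a_mix(h, static_cast<std::uint32_t>(u.y));
        h = fnv1a_mix(h, static_cast<std::uint32_t>(u.hp));
    }
    return h;
}

// Server side: true if the full state of this group has to be sent again.
bool needs_resync(const std::vector<UnitState> &server_units,
                  std::uint64_t client_hash) {
    return hash_units(server_units) != client_hash;
}
```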

sending only relevant information to the client:

server calculates all changes and creates messages
which messages happen outside a player's fog of war? send these
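
As a sketch, the per-player filter could be as simple as the following, assuming every message carries a map position and that there is some visibility predicate per player. `Message`, `visible_messages` and the predicate are placeholders, not existing code.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical message: something happened to a unit at a map position.
struct Message {
    int unit_id;
    int x, y;
};

// Keep only the messages the player is allowed to know about.
// `visible(x, y)` is an assumed predicate: true if the tile is currently
// not covered by this player's fog of war.
std::vector<Message> visible_messages(
        const std::vector<Message> &all,
        const std::function<bool(int, int)> &visible) {
    assert(visible);  // the predicate must be callable
    std::vector<Message> out;
    for (const Message &m : all) {
        if (visible(m.x, m.y)) {
            out.push_back(m);
        }
    }
    return out;
}
```

The server would run this once per player before sending, so each client only ever receives what it could see anyway.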

VelorumS · 2016-09-16T17:44:23Z

Desync is still an issue, because the info we're getting from the server is about the evolution of a unit over the next several seconds. A bad implementation will jitter or create a lot of traffic.

Even if our implementation has no desync bugs, it still needs to handle things like de-spawning a unit as a part of the normal flow.

Also, don't forget in the diagram another prediction loop when units start to move before we even send command to the server.

timo-42 · 2016-09-16T18:22:56Z

You are right, there will be small desyncs until the server sends updated path, hitpoints, etc for the unit.
But the alternative lockstep model has huge disadvantages:

* game speed is locked to the worst player (highest ping or slowest PC)
* the code path must be exactly the same
* floating point must be reproducible across CPUs and architectures

But as a first step, we could easily implement the lockstep mechanism from aoe2: we send the messages back from the server without the predictions.

I don't understand where is the problem with de-spawning units? send a message with unit A is killed/will be killed at 2:45

Is the prediction loop necessary/possible? We can't know exactly when the message arrives at the server and will be processed. We won't send a timestamp with it, because this may be a big loophole for cheating. Therefore, a command will be executed when the server says so, not before. If we do prediction client side, we may have to undo building creation etc. I think that would be a disappointing experience for the player: he wants to build a castle, we give the visual feedback, and 80ms later we remove it from the map because another player builds a tower there.

VelorumS · 2016-09-16T20:16:03Z

> I don't understand where is the problem with de-spawning units? send a message with unit A is killed/will be killed at 2:45

Message from server: "Unit is in the prod queue, will be ready in 10 seconds". Ok, client spawns a unit in 10 seconds. Message from server: "Sorry, actually rax were destroyed and there never was any unit spawned". Client makes the unit vanish.

About prediction: you click - they move. No time to wait for the server roundtrip, because after playing with 20ms ping, the players will be getting nausea on 100ms.

janisozaur · 2016-09-16T20:33:25Z

The best thing you could do about floats is not to use them for game state. It may be hard, but other than following IEEE754 to a t, or using something like libfixmath*, you face differences in implementations. Check out also https://www.reddit.com/r/gamedev/comments/3tx6gh/article_i_wrote_minimizing_the_pain_of_lockstep/

* I know you guys, you will try to roll your own.

VelorumS · 2016-09-16T21:05:26Z

> I know you guys, you will try to roll your own.

There is one already. I was like "wtf, these uint64_t values are making no sense when I'm printing them!".

timo-42 · 2016-09-16T22:03:15Z

new term proposal:
prediction: used when the server sends a path to client, client can interpolate the server predicted path
pre prediction: client gets user action and tries to predict the right action

@ChipmunkV good point. But we could send a message: "building destroyed", "units never existed: 4,5,6" (so no dead body will be shown, as it would be if the "units killed" message were used)

for referencing the ids: every client gets a pool of globally unique ids assigned, so it chooses the ids itself and the server can reference them when units are killed before the queue message arrives at the server
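
One possible way to build such pools, sketched under the assumption that ids are 64 bit and every client has a 32-bit client id: put the client id in the upper half and a local counter in the lower half. `IdPool` is a made-up name and the bit split is arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-client id pool. Ids are globally unique because the
// upper 32 bits hold the client id and the lower 32 bits a local counter.
class IdPool {
public:
    explicit IdPool(std::uint32_t client_id) : client_id_{client_id} {}

    // Hand out the next unused id of this client's pool.
    std::uint64_t next() {
        assert(counter_ != UINT32_MAX);  // pool exhausted
        return (static_cast<std::uint64_t>(client_id_) << 32) | counter_++;
    }

    // Which client's pool does an id come from?
    static std::uint32_t owner(std::uint64_t id) {
        return static_cast<std::uint32_t>(id >> 32);
    }

private:
    std::uint32_t client_id_;
    std::uint32_t counter_ = 0;
};
```

The server can check `IdPool::owner(id)` against the sending client to reject ids taken from somebody else's pool.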

New Algorithm:

* SDL_Events
* create messages and queue them locally and send them to server
* process server messages (prediction)
* process local queue (pre prediction)
* make pre predictions, who attacks who?
* draw screen

maybe client side pre prediction is not needed or it is only useful in some cases. Client side pre prediction would be extremely hard to implement.

now the server must look at the timestamps from clients and decide whether they are plausible. Clients with different pings need different amounts of time to send their messages.
What happens if a client's connection suddenly gets worse? Do we drop their messages? Does he want to cheat? Say messages from server to client take 20ms and from client to server 100ms. If the server accepts old messages from 100ms ago, the client may exploit this time difference.
What happens if a client's ping improves suddenly? Did he cheat before?

If we allow pre prediction, we must replay all frames which happened since then. => this may be highly CPU intensive

My opinion: the server accepts messages as they come in, as if they were triggered now. What clients can do to reduce lag: if we know the ping is 40ms, then we can assume our message (which should be correct) will be processed in 20ms. So we buffer it locally until then and we won't be so far off.
Note: 20ms is the same frame @30fps and the next one @60fps.
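
The arithmetic behind that note, as a small sketch. It assumes the one-way delay is simply half the ping, which real connections don't guarantee; the function names are made up.

```cpp
#include <cassert>
#include <cmath>

// Assumed one-way delay: half of the measured round-trip time.
double one_way_delay_ms(double ping_ms) {
    return ping_ms / 2.0;
}

// Number of whole render frames that pass during `delay_ms` at `fps`.
// 0 means the message is processed within the current frame.
int frames_of_delay(double delay_ms, double fps) {
    assert(fps > 0.0);
    double frame_ms = 1000.0 / fps;
    return static_cast<int>(std::floor(delay_ms / frame_ms));
}
```

With a 40ms ping, `frames_of_delay(one_way_delay_ms(40.0), 30.0)` is 0 (a frame lasts about 33ms) and `frames_of_delay(one_way_delay_ms(40.0), 60.0)` is 1 (a frame lasts about 16.7ms).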

janisozaur · 2016-09-17T07:18:29Z

TheJJ · 2016-09-26T22:51:45Z

@timohaas jup that sounds like a good way.

We should abstract the communication in several ways, the net/ subsystem is just for network communication, then we'll have curve/ for all the curve generation and interpolation, and adapt the game logic to make use of them.

Then we need to create/extend the renderer/ subsystem to display the curves with the appropriate assets.

This is all gonna be very hard and needs lots of restructuring, but we can do it!

mic-e · 2016-10-02T18:31:03Z

Possible protobuf alternative: https://capnproto.org/

janisozaur · 2016-10-02T19:15:26Z

Yes, protobuf is not the only kid on the block. capnproto has nice summarising feature matrix and some decent comparisons on this page: https://capnproto.org/news/2014-06-17-capnproto-flatbuffers-sbe.html

Tomatower · 2016-10-10T20:12:05Z

I would implement a custom protocol based on a deterministic lockstep method.
On top of UDP, as it is non-reliable but stateless and robust against packet loss.

The idea is to transfer "single occurrence, event-like" information in a reliable way (more later) and "status updates" like unit positions etc. in a non-reliable way.

To transfer a piece of information reliably, you transmit it with every packet every frame (maybe more than one event frame per frame), until you receive an ACK from the other end saying that events up to the specific frame number have been transmitted.

Status updates are used to fill up a packet to MTU size (because if a packet is split on the way because it is bigger than the MTU, you again lose precious milliseconds)

So in more detail, an example packet can look like this:

frame_ack: 6{ 
   {
      frame: 1 { Event: Move Unit 5 to X/Y: 10/100; Send Message "FOO" to user 2 }, 
      frame: 5 { Event: Unit 5: attack unit 7; Unit 4: die }
   }, 
   {
      unit 5:{ x: 3 y: 90; HP: 1325; Trajectory: 4/90, 4/91, 5/92}
      unit 7:{ x: 10; y:101; HP 123523; Trajectory: 0}
   }
}
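
A small sketch of the "resend until ACKed" bookkeeping for the event part of such a packet. It assumes events are plain strings and that `frame_ack: N` acknowledges every event frame up to and including N; both are guesses, and `ReliableEventSender` is a made-up name. Filling the rest of the packet with status updates up to the MTU is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// All events that belong to one frame.
struct EventFrame {
    std::uint32_t frame;
    std::vector<std::string> events;
};

// Keeps every event until the other end has acknowledged its frame.
class ReliableEventSender {
public:
    // Remember an event; it will be part of every packet until ACKed.
    void queue(std::uint32_t frame, std::string event) {
        assert(!event.empty());
        pending_[frame].push_back(std::move(event));
    }

    // Event frames to put into the next outgoing packet: everything unACKed.
    std::vector<EventFrame> packet_events() const {
        std::vector<EventFrame> out;
        for (const auto &entry : pending_) {
            out.push_back({entry.first, entry.second});
        }
        return out;
    }

    // The peer confirmed all event frames up to and including `frame_ack`.
    void on_ack(std::uint32_t frame_ack) {
        pending_.erase(pending_.begin(), pending_.upper_bound(frame_ack));
    }

private:
    std::map<std::uint32_t, std::vector<std::string>> pending_;
};
```

A lost packet costs nothing extra here: the same events simply ride along in the next packet until an ACK removes them.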

This means for the remaining concept:

* There has to exist a simulator that can process input events into the actions of units.
* There has to exist a renderer that can render this state.
* There has to exist an error manager that can interpolate between the local position and the network-authorized position (e.g. from the server), so the units don't jump around on the screen and nobody can see possible errors.
* The client can issue actions to the server.
* The server confirms the actions, but the client may already start the movement locally.
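
For the error manager, one simple option (just a sketch, not a decision) is to move the displayed position a fixed fraction of the way toward the authoritative one every frame, so corrections are spread over several frames instead of showing up as a jump. `Vec2`, `correct_toward` and the blend factor are placeholders.

```cpp
#include <cassert>

struct Vec2 {
    double x, y;
};

// Move the displayed position part of the way toward the authoritative one.
// blend = 0 keeps the local position, blend = 1 snaps to the server position.
Vec2 correct_toward(Vec2 displayed, Vec2 authoritative, double blend) {
    assert(blend >= 0.0 && blend <= 1.0);
    return {displayed.x + (authoritative.x - displayed.x) * blend,
            displayed.y + (authoritative.y - displayed.y) * blend};
}
```

Called once per rendered frame with a small blend value, the remaining error shrinks geometrically, so small errors vanish within a few frames.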

Please give me your comments what you think about this concept.

VelorumS · 2016-10-10T20:34:45Z

@Tomatower, you may start writing net/.

I'm making a sketch of what should be between net/ and render/: https://github.com/ChipmunkV/openage/blob/dd6aa96d326e5cd9556357ae14c864709fb688b5/libopenage/curve/entities_conductor.h

(branch VelorumS@dd6aa96)

Tomatower · 2016-10-10T20:55:27Z

Currently I was more thinking about applying the different packages pulled from the server (via a "get events for frame 5"-method), and then interpolating from there with the exact same codebase as it exists on the server.

If one needs to extrapolate etc. it would be better to apply a "what-if" interface:

"I am in Frame 8. Now the Server said: In Frame 5 unit 3 started to move. Where is it now?"

timo-42 · 2016-10-10T21:26:31Z

Tomatower · 2016-10-10T21:55:38Z

The first network client should be a second, low-spec tool that blindly renders what the full-spec server (the current version, where you can move units etc.) sends. This will test the basic communication.

Then we should provide the possibility for the client to create input events to be transmitted to the server, and be mapped to another player than the one currently running.

At this point in time we start implementing the need-to-know principle.

Then we can start designing a lobby, that enables to connect different players together, and start removing the input in the server (maybe keep the rendering without FOW?)

Then we can think about creating fancy stuff like hot-seat switching, AI via Network, ...

So as a checklist:

* Low Spec Viewer
* Input transmission
* Lobby Technology
* Multiple players ( > 1, up to ~16 for the first start)

VelorumS · 2016-10-11T01:03:45Z

The unknown is an integration with the game data. The subsystem should use the right types of curves for the unit/player properties, defined in nyan. And it must be transparent, so if there is a change in the property definition - it's only in the .nyan file and the renderer.

> gamestate without curves(map tiles, messages, player ressources)

Have you considered implementing them as curves?

VelorumS · 2016-10-11T01:15:39Z

> Currently I was more thinking about applying the different packages pulled from the server (via a "get events for frame 5"-method), and then interpolating from there with the exact same codebase as it exists on the server.
> If one needs to extrapolate etc. it would be better to apply a "what-if" interface:
> "I am in Frame 8. Now the Server said: In Frame 5 unit 3 started to move. Where is it now?"

In that networking model, the server sends events, but some of them are already extrapolations. So the server is at frame 5, but is already telling us that the unit will finish being produced on frame 105 (200ms logic frame, so in 20s).

Then server corrects the predictions as it goes.

The nice thing is that the client can throw in its own extrapolations into the same array, and render that as usual. Later it will be overwritten by the corrections from the server anyways.
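
A sketch of that "same array" idea, assuming one scalar value per logic frame and a flag that records where an entry came from. `PredictionTimeline` and its methods are invented for illustration; the rule that a client guess never replaces a server entry is my reading of "overwritten by the corrections from the server".

```cpp
#include <cassert>
#include <map>

enum class Source { ClientGuess, Server };

struct Prediction {
    double value;
    Source source;
};

// Convert a number of logic frames into seconds.
double frames_to_seconds(int frames, double frame_ms) {
    assert(frame_ms > 0.0);
    return frames * frame_ms / 1000.0;
}

// One timeline per property; client and server predictions share it.
class PredictionTimeline {
public:
    // Client extrapolation: stored unless the server already said something.
    void client_guess(int frame, double value) {
        auto it = entries_.find(frame);
        if (it == entries_.end() || it->second.source == Source::ClientGuess) {
            entries_[frame] = {value, Source::ClientGuess};
        }
    }

    // Server prediction or correction: always wins.
    void server_update(int frame, double value) {
        entries_[frame] = {value, Source::Server};
    }

    // Entry for a frame, or nullptr if nothing is known about it.
    const Prediction *at(int frame) const {
        auto it = entries_.find(frame);
        return it == entries_.end() ? nullptr : &it->second;
    }

private:
    std::map<int, Prediction> entries_;
};
```

`frames_to_seconds(105 - 5, 200.0)` gives the 20 seconds from the example above. The renderer only ever reads the timeline and doesn't need to care who wrote an entry.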

TheJJ · 2016-10-17T22:39:44Z

For transmitting nyan objects over the network, we can either submit all the applied patches and track all changes on all peers that way, or we can "bake" objects and just send those (e.g. the combined variant of all attributes of that unit type). Sending patches in a guaranteed way will probably reduce coding overhead for the prediction subsystem though.

The nyan system is synced by just sending the time of a patch application. The database has to be identical at the start, and patches are applied by event-curves ("apply patch Loom at time 2332s"). The client nyan database is then used for its local prediction; the server uses it for authoritative simulation anyway.

Tomatower · 2017-01-28T23:33:16Z

After having a more in-depth look into the features, I have found SCTP as a possible network wrapper. It has good support in Linux, and there are libraries available for Windows.

It supports especially multiple streams and off-band messages.

Tomatower · 2017-02-09T13:21:39Z

I have an idea about a possible Curve API and the code in the backend, and I welcome feedback.

The Idea

Without curves, the data to be accessed looks roughly like this (a rough sketch, meant only as an example for the API idea):

#include <array>
#include <unordered_map>

struct single_unit { // Object store
	std::array<float, 2> position; // MultidimensionalContinuous
	float hp; // SimpleContinuous
	int ammo; // Discrete
};

struct gameContainer { // Object store
	std::unordered_map<int, single_unit> units; // Identifiers
};

struct event; // event payload, defined elsewhere
void trigger(const event &ev); // Event triggering

Curve API

The derived API shall be like that:

class SingleUnit : public curve::Object {
public:
	curve::Array<2, float> position;
	curve::Continuous<float> hp;
	curve::Discrete<int> ammo;
};

class GameContainer : public curve::Object {
public:
	curve::unordered_map<int, SingleUnit> units;
};

namespace curve {
class Event {
public:
	int event_type;
	std::vector<int32_t> data;
};
} // namespace curve

curve::event_iterator events(time start, time end);

extra parsers for the event iterator stuff can be implemented.

The API Interface Idea

The core definitions will look like:

class event_iterator {
public:
	// Standard iterator stuff
	bool valid() const; // true while there are still elements in the queue
};

Reading:
curve:: objects can be initialized with a certain timestamp and a reference to their parent object. Then one can access their values.

Data is stored within the curve:: objects themselves; each one basically does its own history management.

data functions on non-list types:

* add(time, value): insert a new value in between or at the end (depends on the timestamp)
* replace(time, value): insert a new value, and remove everything after
* remove(time): remove a keyframe. time has to match exactly

data getters on single-dimensional values:

* get(time): get the value at this point in time
* get(): if object was bound to a time at construction

data getters on multi-dimensional values (Array and unordered map):

* get(time, i)
* get(i): if object was bound to a time at construction

data functions on the unordered map:

* add(time, std::pair<int, value>): add a new key/value pair. Will fail if the key already exists. A key has to be unique globally and over the whole time of the game.
* remove(time, key)
* iterate(time): gives an iterator over the list at the given time
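
To make the add/replace/remove/get semantics above a bit more concrete, here is a minimal sketch of a continuous curve (needs C++17 for `std::optional`). It assumes linear interpolation between keyframes, clamping before the first and after the last keyframe, and that replace also drops a keyframe at exactly the given time. `ContinuousCurve` is a stand-in name and not the proposed `curve::Continuous`; binding to a time at construction is left out.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <optional>

template <typename T>
class ContinuousCurve {
public:
    // Insert a keyframe in between or at the end, depending on the time.
    void add(double time, T value) { keyframes_[time] = value; }

    // Insert a keyframe and drop every keyframe at or after that time.
    void replace(double time, T value) {
        keyframes_.erase(keyframes_.lower_bound(time), keyframes_.end());
        keyframes_[time] = value;
    }

    // Remove a keyframe; the time has to match exactly.
    bool remove(double time) { return keyframes_.erase(time) > 0; }

    // Value at `time`, linearly interpolated between the two neighbours.
    // Empty result if the curve has no keyframes at all.
    std::optional<T> get(double time) const {
        if (keyframes_.empty()) return std::nullopt;
        auto next = keyframes_.lower_bound(time);  // first keyframe >= time
        if (next == keyframes_.begin()) return next->second;  // before start
        auto prev = std::prev(next);
        if (next == keyframes_.end()) return prev->second;  // after the end
        assert(next->first > prev->first);  // map keys are strictly ordered
        double t = (time - prev->first) / (next->first - prev->first);
        return static_cast<T>(prev->second + (next->second - prev->second) * t);
    }

private:
    std::map<double, T> keyframes_;  // time -> value, ordered by time
};
```

A discrete curve would look the same, except that get() returns the value of the previous keyframe instead of interpolating.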

TheJJ · 2017-02-19T17:18:06Z

We escalated this even more, so the transmission over the network is still curves via keyframes, internally we create the predictions with #740.

simonsan · 2020-04-12T00:24:59Z

Protobuf alternative
https://google.github.io/flatbuffers/

Why not use Protocol Buffers, or .. ?

Protocol Buffers is indeed relatively similar to FlatBuffers, with the primary difference being that FlatBuffers does not need a parsing/ unpacking step to a secondary representation before you can access data, often coupled with per-object memory allocation. The code is an order of magnitude bigger, too. Protocol Buffers has neither optional text import/export nor schema language features like unions.

Nice! This is a simple web front end for the FlatBuffers Compiler (flatc 1.10.0):
https://flatbuffers.ar.je/

heinezen · 2020-04-12T19:24:33Z

@simonsan AoE also uses flat buffers in their multiplayer protocol I think.

duanqn · 2020-04-30T23:48:06Z

Is it possible to take NAT into consideration when we choose protocols?
In China, IPv4 addresses are so limited compared to the large user base that we have NAT everywhere. A lot of games can't be played without a dedicated server because they don't work with NATs.

Also looking into the future it might be useful to make our game compatible with IPv6.

heinezen · 2020-05-01T02:26:09Z

@duanqn I would support that and we would have to think about that anyway since other mechanisms such as DSLite make direct IP connections impossible. Committing to IPv6 would also be a good idea.

TheJJ · 2020-05-01T14:36:59Z

For now we assume that the dedicated server is reachable somehow (directly (v4/v6), VPN, portforward, ...), so we don't plan for any nat-traversal mechanisms. These can be added later as an extension when we deem them useful.

duanqn · 2020-05-01T20:48:59Z

@TheJJ I agree that NAT is not a top concern for now, but it's probably beneficial to keep it in mind. My concern is that if we choose some protocols, then NAT traversal may become impossible. For example I think any TCP-based protocols cannot work with NAT.
I think direct IP multiplayer game is essential and is a more practical goal, especially in our development phase.

heinezen · 2020-05-02T12:56:57Z

@duanqn Do you know which type of NAT we would be dealing with in Chinese networks? I have little knowledge of how the networks over there work :)

For most applications, STUN/ICE/TURN and NAT64 should work for the UDP side, and also for TCP if I remember correctly. Some of these methods would also require a relay server to be reachable from China which could be tricky.

duanqn · 2020-05-02T17:25:07Z

@heinezen Good question... I don't know actually. I think the network environment varies from region to region, but in general there is heavy presence of NAT and firewalls.
I don't know a lot about NAT traversal. I just googled those methods. I think TURN would definitely work. Getting a relay server with public IP is not hard (just get a machine from any cloud service provider), but it is also expensive. The cost is similar to having a dedicated server.
It would be better if we can use STUN. If my understanding is correct the STUN server is not involved in actual data transfer. However since I never heard about it I assume it's not working. I found some implementations on GitHub maybe I can ask my friends to test them.

VelorumS · 2020-05-02T18:37:19Z

@duanqn as a client use the Trickle ICE WebRTC sample for testing. It's just a local web page that can communicate with STUN and TURN servers. Can test with some public servers.

As a STUN/TURN server the resiprocate-turn-server works fine. It's in the Ubuntu repo.

My current job is a VoIP server, routers, recorders.

duanqn · 2020-05-07T19:18:29Z

I asked one of my friends to test with https://github.com/jselbie/stunserver
It looks like the NAT mapping and firewalls are independent of remote IP:port pairs. So there is a good chance we can use STUN. But I haven't found a test application that actually does NAT hole-punching.

TheJJ added the in progress, to-discuss and area: network labels Mar 20, 2016

coffenbacher mentioned this issue May 3, 2016

Game stats tracking #325

Open

TheJJ removed the in progress label Sep 28, 2016

TheJJ mentioned this issue Oct 13, 2016

Disconnect-resume mechanism for multiplayer games #638

Open

VelorumS mentioned this issue Oct 16, 2016

History and interpolation #642

Closed

7 tasks

TheJJ mentioned this issue Oct 22, 2016

savefile Format proposal #649

Closed

TheJJ mentioned this issue Nov 2, 2016

Added my collection of memory mapping for the Genie-Engine #647

Closed

VelorumS mentioned this issue Jan 29, 2017

Google Summer of Code 2017 #724

Closed

4 tasks

TheJJ added this to the Architecture restructuring milestone May 13, 2017

Vtec234 mentioned this issue May 22, 2017

Input replay #584

Closed

7 tasks

simonsan mentioned this issue Jul 13, 2019

Google Summer of Code 2020 #1141

Closed

6 tasks
