Network communication design #530

Open
TheJJ opened this issue Mar 20, 2016 · 37 comments
Labels
area: network Has interaction with the network (Internet) to-discuss Idea or suggestion that needs some discussion before implementation

Comments

@TheJJ
Member

TheJJ commented Mar 20, 2016

tl;wr: the server does all calculations; clients only render the state they receive. State is transmitted as keyframes at "predicted" end points.

The design is pretty much like one would do it for an FPS.
The basic idea is this: The server sends a packet which updates the "target" state of the future for clients. They interpolate the received movement/update/... functions and try reaching the target state all the time.

server
------

* central trust instance for the game state.
* does all the calculations based on input events.
* creates and distributes keyframes for all clients


clients
-------

* receive keyframes
* calculate world state for current time by interpolating keyframes
* display world state and fancy animations etc.
* send actions to server (timestamp is probably a bad idea)
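
As a rough illustration of the client-side step "calculate world state for current time by interpolating keyframes", here is a minimal sketch. The `Keyframe` struct, the use of `double` for time and value, and the choice of linear interpolation are assumptions made for the example, not part of the proposal.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical keyframe: a value the server says will hold at `time`.
struct Keyframe {
    double time;   // game time in seconds (assumed unit)
    double value;  // e.g. one coordinate of a unit position
};

// Linearly interpolate the value at time `t`.
// Assumes `frames` is non-empty and sorted by ascending time.
double value_at(const std::vector<Keyframe> &frames, double t) {
    // First keyframe whose time is strictly greater than t.
    auto next = std::upper_bound(
        frames.begin(), frames.end(), t,
        [](double time, const Keyframe &k) { return time < k.time; });
    if (next == frames.begin()) return frames.front().value;  // before the first keyframe
    if (next == frames.end()) return frames.back().value;     // past the last keyframe: hold
    auto prev = std::prev(next);
    double alpha = (t - prev->time) / (next->time - prev->time);
    return prev->value + alpha * (next->value - prev->value);
}
```

When the server later sends a corrected keyframe, the client would only have to replace entries in `frames`; the rendering code keeps calling `value_at` with the current time.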

This is the low-level part for transmitting the results of the simulation itself.

To understand how the prediction works, see #740.

@TheJJ TheJJ added in progress to-discuss Idea or suggestion that needs some discussion before implementation area: network Has interaction with the network (Internet) labels Mar 20, 2016
@janisozaur
Contributor

In https://github.com/OpenRCT2/OpenRCT2 we have added multiplayer capability to the game, and included a simple lobby, master server, which lists online public games.

At first there was no authentication, but people abused publicly available servers to destroy maps. We then added the ability to password-protect servers and assign users to groups with specified permissions, but it turned out not to be enough, as we could not make those permissions persist. Our plan was to include a centralised authorisation server, but this met some substantial resistance, as it was "against the spirit of open source", even when we wanted to publish the code for running your own auth service.

The solution we came up with was to utilise OpenSSL's public-key cryptography: the client generates a key and signs a server-generated token, then sends the signature to the server. The server verifies it and, if that step is successful, the client is granted the permissions he was assigned last time.

If you would like to employ a similar scheme in your project, you can use OpenRCT2/OpenRCT2#3699 as a reference.

@TheJJ
Member Author

TheJJ commented May 30, 2016

Thanks for the hints, doesn't sound bad. But this issue is rather about the simple server-client interaction, without all the lobby stuff. I just opened another issue for that: #562. I don't think we'll have a griefing problem, because games are rather short-lived and only spectators should be able to join afterwards. For client authentication, we thought about using GPG, so achievements can be issued to key owners for example. Match-rejoins after a disconnect can also be performed securely that way. But it's not that much of a difference to OpenSSL, except maybe licensing fuckup.

@janisozaur
Contributor

When I was designing the system, I found out about https://keybase.io/. It looks nice and is something we could probably have used, but it is still in invite-only mode. If you use crypto keys anyway, it is perhaps worth taking a look at.

@timo-42

timo-42 commented Sep 15, 2016

I tried to understand the proposed netcode:
flow chart
dia file for editing

sources:
AoE2 netcode: we don't want it (cheaters, performance problems)
our netcode inspiration

@janisozaur
Contributor

janisozaur commented Sep 16, 2016

One interesting bit in this diagram is limiting network messages clients receive to only what they can see. This is a very sound approach, but one that could probably make syncing game state much much harder. I'm eager to see how you guys solve that.

Another piece of documentation you may perhaps find useful or a source of inspiration would be https://github.com/OpenTTD/OpenTTD/blob/master/docs/desync.txt

Have you already given any thought to what your network stream would look like? Are you going to use something to encapsulate it, like protobuf?

@timo-42

timo-42 commented Sep 16, 2016

protobuf looks like exactly what we need for sending messages. It seems simple enough that all developers can use it without knowledge of the network system. You need something synced to the client? No problem: create a message format and send it.

desync: we should not have problems with desync. The Server is authoritative. We could do something like:
hash over all 20 units => send it to the server; if something is wrong, the server sends the full unit information for these 20 units again
therefore there is no "butterfly effect", we can always get the true state(building, techtree, units, world) from the server
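
The hash check described above could look roughly like the following sketch. The `UnitState` fields and the choice of 64-bit FNV-1a are assumptions for illustration; any hash that both sides compute identically would do.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical minimal unit state; the real state would carry more fields.
struct UnitState {
    std::uint32_t id;
    std::int32_t x, y;
    std::int32_t hp;
};

// Mix one 32-bit word into an FNV-1a hash, byte by byte, so the result
// does not depend on the endianness of the machine.
inline std::uint64_t fnv1a_u32(std::uint64_t h, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 1099511628211ull;  // 64-bit FNV prime
    }
    return h;
}

// Hash a batch of units (e.g. 20 of them) in a fixed order.
std::uint64_t hash_units(const std::vector<UnitState> &units) {
    std::uint64_t h = 14695981039346656037ull;  // 64-bit FNV offset basis
    for (const UnitState &u : units) {
        h = fnv1a_u32(h, u.id);
        h = fnv1a_u32(h, static_cast<std::uint32_t>(u.x));
        h = fnv1a_u32(h, static_cast<std::uint32_t>(u.y));
        h = fnv1a_u32(h, static_cast<std::uint32_t>(u.hp));
    }
    return h;
}

// Server side: true if the full state of this batch should be resent.
bool needs_resync(std::uint64_t client_hash, const std::vector<UnitState> &server_units) {
    return client_hash != hash_units(server_units);
}
```

Both sides have to hash the units in the same order and for the same point in game time, otherwise the check reports a mismatch even when nothing is wrong.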

sending only relevant information to the client:

  1. server calculates all changes and creates messages
  2. which messages concern things outside a player's fog of war? send these
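
A minimal sketch of the second step, assuming that every message carries the tile it concerns and that the server keeps a set of currently visible tiles per player. `Message`, `Tile` and `visible_messages` are hypothetical names.

```cpp
#include <set>
#include <utility>
#include <vector>

using Tile = std::pair<int, int>;  // (x, y) tile coordinate, assumed representation

// Hypothetical state-change message produced by the simulation.
struct Message {
    int unit_id;
    Tile where;  // tile the change happens on
};

// Keep only the messages a player is allowed to see.
std::vector<Message> visible_messages(const std::vector<Message> &all,
                                      const std::set<Tile> &visible_tiles) {
    std::vector<Message> out;
    for (const Message &m : all) {
        if (visible_tiles.count(m.where) > 0) {
            out.push_back(m);
        }
    }
    return out;
}
```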

@VelorumS
Contributor

VelorumS commented Sep 16, 2016

Desync is still an issue, because the info we're getting from the server is about the evolution of a unit over the next several seconds. A bad implementation will jitter or create a lot of traffic.

Even if our implementation has no desync bugs, it still needs to handle things like de-spawning a unit as a part of the normal flow.

Also, don't forget another prediction loop in the diagram: units start to move before we even send the command to the server.

@timo-42

timo-42 commented Sep 16, 2016

You are right, there will be small desyncs until the server sends an updated path, hitpoints, etc. for the unit.
But the alternative lockstep model has huge disadvantages:

  • gamespeed is locked to one player (highest ping or slowest pc)
  • the code path must be exactly the same
  • floating point must be reproducible across cpus and architectures

But as a first step, we could easily implement the lockstep mechanism from AoE2: we send the messages back from the server without the predictions.

I don't understand where is the problem with de-spawning units? send a message with unit A is killed/will be killed at 2:45

Is the prediction loop necessary/possible? We can't know exactly when the message arrives at the server and will be processed. We won't send a timestamp with it, because this may be a big loophole for cheating. Therefore, a command will be executed when the server says so, not before. If we do prediction client-side, we may have to undo building creation etc. I think that would be a disappointing experience for the player: he wants to build a castle, we give the visual feedback, and 80ms later we remove it from the map because another player builds a tower there.

@VelorumS
Contributor

I don't understand where is the problem with de-spawning units? send a message with unit A is killed/will be killed at 2:45

Message from server: "Unit is in the prod queue, will be ready in 10 seconds". Ok, client spawns a unit in 10 seconds. Message from server: "Sorry, actually rax were destroyed and there never was any unit spawned". Client makes the unit vanish.

About prediction: you click - they move. No time to wait for the server roundtrip, because after playing with 20ms ping, the players will be getting nausea on 100ms.

@janisozaur
Contributor

The best thing you could do about floats is not to use them for game state. It may be hard, but other than following IEEE754 to a t, or using something like libfixmath*, you face differences in implementations. Check out also https://www.reddit.com/r/gamedev/comments/3tx6gh/article_i_wrote_minimizing_the_pain_of_lockstep/

* I know you guys, you will try to roll your own.

@VelorumS
Contributor

  • I know you guys, you will try to roll your own.

There is one already. I was like "wtf, these uint64_t values are making no sense when I'm printing them!".

@timo-42

timo-42 commented Sep 16, 2016

new term proposal:
prediction: used when the server sends a path to client, client can interpolate the server predicted path
pre prediction: client gets user action and tries to predict the right action

@ChipmunkV good point. But we could send a message: "building destroyed", "units never existed: 4,5,6" (so no body will be shown, as would happen if the units-killed message were used)

For referencing the ids: every client gets a pool of globally unique ids assigned, so it chooses them itself and the server can reference them when they are killed before the queue message arrives at the server.
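
One way to read the id pool idea is to give each client a disjoint, contiguous id range. This is only a sketch under that assumption; `IdPool`, the fixed `pool_size` and the range layout are hypothetical.

```cpp
#include <cstdint>

// Hands out ids from the range [client_index * pool_size, (client_index + 1) * pool_size).
// Ranges of different clients never overlap, so ids stay globally unique
// without a server round trip.
class IdPool {
public:
    IdPool(std::uint32_t client_index, std::uint32_t pool_size)
        : next_{client_index * pool_size},
          end_{(client_index + 1) * pool_size} {}

    // Writes a fresh id to `out`; returns false once the pool is exhausted.
    bool try_allocate(std::uint32_t &out) {
        if (next_ == end_) {
            return false;
        }
        out = next_++;
        return true;
    }

private:
    std::uint32_t next_;
    std::uint32_t end_;
};
```

An exhausted pool would need a refill request to the server; that part is left out here.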

New Algorithm:

  1. SDL_Events
  2. create messages and queue them locally and send them to server
  3. process server messages (prediction)
  4. process local queue (pre prediction)
  5. make pre predictions, who attacks who?
  6. draw screen

Maybe client-side pre prediction is not needed, or it is only useful in some cases. Client-side pre prediction would be extremely hard to implement.

Now the server must look at the timestamps from clients and decide whether they are plausible. Clients with different pings need different amounts of time to send their messages.
What happens if a client's ping suddenly gets worse? Do we drop their messages? Does he want to cheat? Say messages take 20ms from server to client and 100ms from client to server. If the server accepts old messages from 100ms ago, a client may exploit this time difference.
What happens if a client's ping improves suddenly? Did he cheat before?

If we allow pre prediction, we must replay all frames which happened since then. => this may be highly CPU intensive

My opinion: the server accepts messages as they come in, as if they were triggered now. What clients can do to reduce lag: if we know the ping is 40ms, then we can assume our message (which should be correct) will be processed in 20ms. So we buffer it until then and we won't be so far off.
Note: 20ms is the same frame @30fps and the next one @60fps.
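
The arithmetic behind that note, as a small sketch. It assumes a symmetric route (one-way delay is half the ping) and counts whole render frames only; both helper names are made up for the example.

```cpp
// One-way delay estimated as half the round-trip ping (assumes a symmetric route).
constexpr int one_way_delay_ms(int ping_ms) {
    return ping_ms / 2;
}

// Number of whole render frames that fit into `delay_ms` at the given frame rate.
constexpr int frames_ahead(int delay_ms, int fps) {
    return delay_ms * fps / 1000;
}
```

With a 40ms ping the estimated delay is 20ms, which is less than one frame at 30fps (about 33ms per frame) and a bit more than one frame at 60fps (about 17ms per frame).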

@janisozaur
Contributor

https://xkcd.com/654/

@TheJJ
Member Author

TheJJ commented Sep 26, 2016

@timohaas yup, that sounds like a good way.

We should abstract the communication in several ways, the net/ subsystem is just for network communication, then we'll have curve/ for all the curve generation and interpolation, and adapt the game logic to make use of them.

Then we need to create/extend the renderer/ subsystem to display the curves with the appropriate assets.

This is all gonna be very hard and needs lots of restructuring, but we can do it!

@TheJJ TheJJ removed the in progress label Sep 28, 2016
@mic-e
Member

mic-e commented Oct 2, 2016

Possible protobuf alternative: https://capnproto.org/

@janisozaur
Contributor

Yes, protobuf is not the only kid on the block. capnproto has nice summarising feature matrix and some decent comparisons on this page: https://capnproto.org/news/2014-06-17-capnproto-flatbuffers-sbe.html

@Tomatower
Contributor

Tomatower commented Oct 10, 2016

I would implement a custom protocol based on a deterministic lockstep method,
on top of UDP, as it is non-reliable but stateless and robust against packet loss.

The idea is to transfer "single occurrence, event-like" information in a reliable way (more later) and "status updates" like unit positions etc. in a non-reliable way.

To transfer a piece of information reliably, you transmit it with every packet every frame (maybe more than one event frame per frame) until you receive an ACK from the other end confirming that events up to the specific frame number have been transmitted.

Status updates are used to fill up a packet to MTU size (because if a packet is split on the way because it is bigger than the MTU, you again lose precious milliseconds).

In more detail, an example packet could look like this:

frame_ack: 6 {
   {
      frame: 1 { Event: Move Unit 5 to X/Y: 10/100; Send Message "FOO" to user 2 },
      frame: 5 { Event: Unit 5: attack unit 7; Unit 4: die }
   },
   {
      unit 5: { x: 3; y: 90; HP: 1325; Trajectory: 4/90, 4/91, 5/92 }
      unit 7: { x: 10; y: 101; HP: 123523; Trajectory: 0 }
   }
}
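
The resend-until-ACK scheme and the "fill up with status updates" rule could be sketched as follows. Events and status updates are plain strings here and their length stands in for the serialized size; `Packet`, `ReliableSender` and the byte `budget` parameter are assumptions for the example, not a proposed wire format.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Packet {
    std::uint32_t frame_ack;  // highest frame whose events we received from the peer
    std::map<std::uint32_t, std::vector<std::string>> events;  // unacked events by frame
    std::vector<std::string> status;  // best-effort status updates
};

class ReliableSender {
public:
    // Remember an event until the peer acknowledges its frame.
    void queue_event(std::uint32_t frame, std::string event) {
        pending_[frame].push_back(std::move(event));
    }

    // The peer confirmed all events up to and including `frame`.
    void on_ack(std::uint32_t frame) {
        pending_.erase(pending_.begin(), pending_.upper_bound(frame));
    }

    // Every packet repeats all unacked events; status updates are appended
    // in order as long as they still fit into the byte budget.
    Packet build(std::uint32_t frame_ack, const std::vector<std::string> &status,
                 std::size_t budget) const {
        Packet p{frame_ack, pending_, {}};
        std::size_t used = 0;
        for (const auto &entry : pending_) {
            for (const std::string &e : entry.second) {
                used += e.size();
            }
        }
        for (const std::string &s : status) {
            if (used + s.size() > budget) {
                break;
            }
            used += s.size();
            p.status.push_back(s);
        }
        return p;
    }

private:
    std::map<std::uint32_t, std::vector<std::string>> pending_;
};
```

A lost packet costs nothing extra: the next packet carries the same events again, and only status updates from the lost packet are gone, which is acceptable because newer ones replace them.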

This means for the remaining concept:

  • There has to exist a simulator that can turn input events into the actions of units.
  • There has to exist a renderer that can render this state.
  • There has to exist an error manager that can interpolate between the local position and the network-authorized position (e.g. from the server), so the units don't jump around on the screen and nobody can see possible errors.
  • The client can issue actions to the server.
  • The server confirms the actions, but the client may already start the movement locally.
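
For the error manager, one simple option is to move the locally shown position a fixed fraction of the way toward the authoritative one on every update. This is a sketch of that idea only; `Vec2`, `correct_toward` and the `rate` parameter are hypothetical.

```cpp
struct Vec2 {
    double x;
    double y;
};

// Move `local` a fraction of the way toward `authoritative`.
// `rate` is expected in [0, 1]: 0 ignores the server, 1 snaps immediately.
Vec2 correct_toward(Vec2 local, Vec2 authoritative, double rate) {
    return {local.x + (authoritative.x - local.x) * rate,
            local.y + (authoritative.y - local.y) * rate};
}
```

Calling this once per rendered frame with a small `rate` makes the remaining error shrink geometrically, so a unit glides to the server position instead of jumping.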

Please give me your comments what you think about this concept.

@VelorumS
Contributor

VelorumS commented Oct 10, 2016

@Tomatower, you may start writing net/.

I'm making a sketch of what should be between net/ and render/: https://github.com/ChipmunkV/openage/blob/dd6aa96d326e5cd9556357ae14c864709fb688b5/libopenage/curve/entities_conductor.h

(branch VelorumS@dd6aa96)

@Tomatower
Contributor

Currently I was thinking more about applying the different packets pulled from the server (via a "get events for frame 5" method), and then interpolating from there with the exact same codebase as exists on the server.

If one needs to extrapolate etc. it would be better to apply a "what-if" interface:

"I am in Frame 8. Now the Server said: In Frame 5 unit 3 started to move. Where is it now?"

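The "what-if" query from the example (frame 8, movement started in frame 5) could be answered by a function like the one below. It assumes straight-line movement at a constant velocity per logic frame; `MoveStart` and `position_at` are made-up names.

```cpp
struct Vec2 {
    double x;
    double y;
};

// What the server told us: in `frame`, the unit started moving from `origin`.
struct MoveStart {
    int frame;
    Vec2 origin;
    Vec2 velocity_per_frame;  // assumed constant, in tiles per logic frame
};

// Where is the unit in `frame`, given when and how it started to move?
Vec2 position_at(const MoveStart &m, int frame) {
    int elapsed = frame > m.frame ? frame - m.frame : 0;  // no movement before the start
    return {m.origin.x + m.velocity_per_frame.x * elapsed,
            m.origin.y + m.velocity_per_frame.y * elapsed};
}
```

Because the answer depends only on the event and the queried frame, the same function can run on the server and on every client.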
@timo-42

timo-42 commented Oct 10, 2016

netcode Roadmap proposal:

  • client simple lobby gui (connect to ip)
  • server has simple lobby(waits until a player sends start command)
  • server can send gamestate without curves (map tiles, messages, player resources)
  • client can receive gamestate without curves (map tiles, messages, player resources)

at this point we have a running connection, dumb client

  • server can send unit curves
  • client can receive unit curves

at this point we can play on the server and the clients can render it

  • server can receive player commands
  • client can send player commands

at this point we have a running multiplayer game

  • client can predict unit curves(before sending command to server)
  • sending only stuff to client what they need
  • create a separate executable with only server stuff, no sdl libs, ...

@Tomatower we should not use UDP. It would make the network stack more complex. We don't need the speed down to the last ms.

@Tomatower
Contributor

The first network client should be a second, low-spec tool that can blindly render what the full-spec server (the current version, in which you can move units etc.) simulates. This will test the basic communication.

Then we should make it possible for the client to create input events that are transmitted to the server and mapped to a player other than the one currently running.

At this point in time we start implementing the need-to-know principle.

Then we can start designing a lobby that connects different players together, and start removing the input in the server (maybe keep the rendering without FOW?)

Then we can think about creating fancy stuff like hot-seat switching, AI via Network, ...

So as Checklist:

  • Low Spec Viewer
  • Input transmission
  • Lobby Technology
  • Multiple players ( > 1, up to ~16 for the first start)

@VelorumS
Contributor

The unknown is an integration with the game data. The subsystem should use the right types of curves for the unit/player properties, defined in nyan. And it must be transparent, so if there is a change in the property definition - it's only in the .nyan file and the renderer.

gamestate without curves (map tiles, messages, player resources)

Have you considered implementing them as curves?

@VelorumS
Contributor

Currently I was thinking more about applying the different packets pulled from the server (via a "get events for frame 5" method), and then interpolating from there with the exact same codebase as exists on the server.
If one needs to extrapolate etc. it would be better to apply a "what-if" interface:
"I am in Frame 8. Now the Server said: In Frame 5 unit 3 started to move. Where is it now?"

In that networking model, the server sends events, but some of them are already extrapolations. So the server is at frame 5, but is already announcing that the unit will finish being produced at frame 105 (with 200ms logic frames, that is in 20s).

Then server corrects the predictions as it goes.

The nice thing is that the client can throw in its own extrapolations into the same array, and render that as usual. Later it will be overwritten by the corrections from the server anyways.

@TheJJ
Member Author

TheJJ commented Oct 17, 2016

For transmitting nyan objects over the network, we can either submit all the applied patches and track all changes on all peers that way, or we can "bake" objects and just send those (e.g. the combined variant of all attributes of that unit type). Sending patches in a guaranteed way will probably reduce coding overhead for the prediction subsystem though.

The nyan system is synced by just sending the time of a patch application. The database has to be identical at the start, and patches are applied by event-curves ("apply patch Loom at time 2332s"). The client nyan database is then used for its local prediction; the server uses it for the authoritative simulation anyway.

@Tomatower
Contributor

Tomatower commented Jan 28, 2017

After having a more in-depth look into the features, I have found SCTP as a possible network wrapper. It has good support in Linux, and there are libraries available for Windows.

In particular, it supports multiple streams and out-of-band messages.

@Tomatower
Contributor

Tomatower commented Feb 9, 2017

I have an Idea about a possible Curve-API and Code in the Backend, and I welcome feedback

The Idea

Without curves, the data to be accessed looks roughly like this (a rough example, meant only to illustrate the API idea):

struct single_unit { //Object Store
	vector<2, float> position; //MultidimensionalContinuous
	float hp; //SimpleContinuous
	int ammo; //Discrete
};

struct gameContainer { //Object Store
	std::unordered_map<int, single_unit> units; //Identifiers
};

void event(event); //Event triggering

Curve API

The derived API shall be like that:

class SingleUnit : public curve::Object {
	curve::Array<2, float> position;
	curve::Continuous<float> hp;
	curve::Discrete<int> ammo;
};

class GameContainer : public curve::Object {
	curve::unordered_map<int, SingleUnit> units;
};

class curve::Event {
	int event_type;
	vector<int32_t> data;
};

curve::event_iterator events(time start, time end);

extra parsers for the event iterator stuff can be implemented.

The API Interface Idea

The core definitions will look like:

event_iterator {
	//Standard iterator stuff
	bool valid(); //if there are still elements in the queue
};

Reading:
curve:: objects can be initialized with a certain timestamp and a referencing mother object. Then one can access their values.

Data is stored within the curve:: objects themselves; each basically does its own history management.

data functions on non-list types.

  1. add(time, value) insert a new value in between or at the end (depends on the timestamp)
  2. replace (time, value) insert a new value, and remove everything after
  3. remove(time) remove a keyframe. time has to match exactly

data getters on single-dimensional values

  1. get(time) get the value at this point in time
  2. get() - if object was bound to a time at construction

data getters on multi-dimensional values (Array and unordered map)

  1. get(time, i)
  2. get(i) - if object was bound to a time at construction

data functions on the unordered map:

  1. add(time, std::pair<int, value>) add a new key/value-pair. will fail if key already exists. A key has to be globally and over the whole time of the game unique.
  2. remove(time, key)
  3. iterate(time) - gives an iterator over the list at the given time
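
To make the keyframe functions more concrete, here is a minimal sketch of a discrete (step-wise) curve with `add`, `replace`, `remove` and `get(time)`. It is not the proposed implementation: `double` as the time type, `std::map` as storage and the `fallback` argument are assumptions for the example.

```cpp
#include <iterator>
#include <map>

template <typename T>
class Discrete {
public:
    // Insert a keyframe in between or at the end; an existing one at `time` is overwritten.
    void add(double time, const T &value) {
        frames_[time] = value;
    }

    // Insert a keyframe and remove everything after it.
    void replace(double time, const T &value) {
        frames_.erase(frames_.upper_bound(time), frames_.end());
        frames_[time] = value;
    }

    // Remove the keyframe at exactly `time`; returns false if there is none.
    bool remove(double time) {
        return frames_.erase(time) > 0;
    }

    // Value of the last keyframe at or before `time`, or `fallback` if there is none.
    T get(double time, const T &fallback = T{}) const {
        auto it = frames_.upper_bound(time);
        if (it == frames_.begin()) {
            return fallback;
        }
        return std::prev(it)->second;
    }

private:
    std::map<double, T> frames_;  // keyframes ordered by time
};
```

`replace` is what a correction from the server would use: everything predicted after the corrected point in time is dropped in one call.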

@TheJJ
Member Author

TheJJ commented Feb 19, 2017

We escalated this even more, so the transmission over the network is still curves via keyframes, internally we create the predictions with #740.

@TheJJ TheJJ added this to the Architecture restructuring milestone May 13, 2017
@Vtec234 Vtec234 mentioned this issue May 22, 2017
7 tasks
@simonsan
Contributor

simonsan commented Apr 12, 2020

Protobuf alternative
https://google.github.io/flatbuffers/

Why not use Protocol Buffers, or .. ?

Protocol Buffers is indeed relatively similar to FlatBuffers, with the primary difference being that FlatBuffers does not need a parsing/ unpacking step to a secondary representation before you can access data, often coupled with per-object memory allocation. The code is an order of magnitude bigger, too. Protocol Buffers has neither optional text import/export nor schema language features like unions.

Nice! This is a simple web front end for the FlatBuffers Compiler (flatc 1.10.0):
https://flatbuffers.ar.je/

@heinezen
Member

@simonsan AoE also uses flat buffers in their multiplayer protocol I think.

@duanqn
Contributor

duanqn commented Apr 30, 2020

Is it possible to take NAT into consideration when we choose protocols?
In China, IPv4 addresses are so limited compared to the large user base that we have NAT everywhere. A lot of games can't be played without a dedicated server because they don't work with NATs.

Also looking into the future it might be useful to make our game compatible with IPv6.

@heinezen
Member

heinezen commented May 1, 2020

@duanqn I would support that and we would have to think about that anyway since other mechanisms such as DSLite make direct IP connections impossible. Committing to IPv6 would also be a good idea.

@TheJJ
Member Author

TheJJ commented May 1, 2020

For now we assume that the dedicated server is reachable somehow (directly (v4/v6), VPN, portforward, ...), so we don't plan for any nat-traversal mechanisms. These can be added later as an extension when we deem them useful.

@duanqn
Contributor

duanqn commented May 1, 2020

@TheJJ I agree that NAT is not a top concern for now, but it's probably beneficial to keep it in mind. My concern is that if we choose some protocols, then NAT traversal may become impossible. For example I think any TCP-based protocols cannot work with NAT.
I think direct IP multiplayer game is essential and is a more practical goal, especially in our development phase.

@heinezen
Member

heinezen commented May 2, 2020

@duanqn Do you know which type of NAT we would be dealing with in Chinese networks? I have little knowledge of how the networks over there work :)

For most applications, STUN/ICE/TURN and NAT64 should work for the UDP side, and also for TCP if I remember correctly. Some of these methods would also require a relay server to be reachable from China which could be tricky.

@duanqn
Contributor

duanqn commented May 2, 2020

@heinezen Good question... I don't know actually. I think the network environment varies from region to region, but in general there is heavy presence of NAT and firewalls.
I don't know a lot about NAT traversal. I just googled those methods. I think TURN would definitely work. Getting a relay server with public IP is not hard (just get a machine from any cloud service provider), but it is also expensive. The cost is similar to having a dedicated server.
It would be better if we can use STUN. If my understanding is correct the STUN server is not involved in actual data transfer. However since I never heard about it I assume it's not working. I found some implementations on GitHub maybe I can ask my friends to test them.

@VelorumS
Contributor

VelorumS commented May 2, 2020

@duanqn as a client use the Trickle ICE WebRTC sample for testing. It's just a local web page that can communicate with STUN and TURN servers. Can test with some public servers.

As a STUN/TURN server the resiprocate-turn-server works fine. It's in the Ubuntu repo.

My current job is a VoIP server, routers, recorders.

@duanqn
Contributor

duanqn commented May 7, 2020

I asked one of my friends to test with https://github.com/jselbie/stunserver
It looks like the NAT mapping and firewalls are independent of remote IP:port pairs. So there is a good chance we can use STUN. But I haven't found a test application that actually does NAT hole-punching.
