Is Docker Swarm a viable way to achieve HA for orchestrator? #1177
Thank you for this question. I'm not very familiar with docker swarm. Does a mounted network share guarantee persistence of the sqlite database file? Does docker swarm maintain copies of that file? If so, how are these copies synchronized?
I'm not that familiar with SQLite, but I assumed that only a single instance of orchestrator should ever write to a single SQLite database. I therefore deployed orchestrator as a single-instance service. The network (NFS) share is mounted on the host, and Docker Swarm does not in any way interfere with this setup; all reads and writes to the SQLite database file go through that share.

This setup does assume that the storage is HA, but granted this is already in place, Docker Swarm should ensure that orchestrator always runs somewhere. My question is probably more a question of reusing existing infrastructure (Docker Swarm), as I'd really like to avoid having to create separate VMs just for running orchestrator.
That was my question. Right, so if storage is guaranteed to be HA, and you can guarantee a single instance of orchestrator writing to the database, this could work.
I wonder about docker swarm cross-DC, and how it handles DC network partitioning. What if the orchestrator node gets network isolated: if/how does docker swarm know to start up a new node in another DC; once the original DC is back online, how does docker swarm know to take it offline; and will there be a time period where two orchestrator instances run concurrently?
Assuming orchestrator runs on node A in a 3-node swarm with nodes A, B, C. If A gets isolated from B and C, then B and C have quorum. However, orchestrator will still run on A. New tasks cannot be scheduled on A, but existing tasks will keep running. So we still have orchestrator running on A. However, B and C will also discover that orchestrator is not running, and will schedule orchestrator on the remaining "healthy" nodes of the swarm.

So, this is probably where it breaks. If the swarm broke (let's say due to network issues), then the storage likely also broke. In other words, you could have a split-brain situation at the storage level. So, the issue here is not so much that multiple instances of orchestrator run; that would also be the case with orchestrator running in raft mode. The issue is that you could have a split-brain situation at the storage level, and that could result in havoc when the network normalises again.

Although not what I was originally hoping for, does this make sense?
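The quorum arithmetic in the scenario above can be sketched as follows. `has_quorum` is a hypothetical helper for illustration, not part of Docker Swarm or orchestrator:

```python
def has_quorum(reachable_managers: int, total_managers: int) -> bool:
    """A partition side may schedule new tasks only if it still sees a
    strict majority of the swarm's manager nodes (the Raft majority rule)."""
    return reachable_managers > total_managers // 2

# 3-node swarm {A, B, C}, with A isolated from B and C:
print(has_quorum(1, 3))  # A's side: no quorum; existing tasks keep running,
                         # but nothing new can be scheduled there
print(has_quorum(2, 3))  # B+C's side: quorum; orchestrator gets rescheduled
```

Note that A's side losing quorum only prevents *new* scheduling; it does not stop the already-running orchestrator task, which is exactly why two instances can coexist during the partition.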
Right. Change of plans. I never considered the case where two orchestrator instances end up writing to the same backend database. SQLite actually protects multiple processes writing to the same backend database by acquiring a file lock. Now, it remains to clarify whether the shared mount supports file-level locks, and whether SQLite is able to use that lock. If not, things will break. If yes, then the next question is the matter of storage split brain. If storage cannot handle split brain and cannot correctly override changes once the network split is alleviated, things will break.
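SQLite's write locking can be observed with two connections to the same file. This is a minimal sketch on a local filesystem; whether the same lock semantics hold over the NFS mount is exactly the open question above:

```python
import os
import sqlite3
import tempfile

db = os.path.join(tempfile.mkdtemp(), "orchestrator.db")

# First "orchestrator instance": open the database and take the write lock.
writer = sqlite3.connect(db, isolation_level=None)  # autocommit; explicit BEGIN below
writer.execute("CREATE TABLE t (v INTEGER)")
writer.execute("BEGIN IMMEDIATE")           # acquires SQLite's write lock
writer.execute("INSERT INTO t VALUES (1)")

# Second "orchestrator instance": try to take the same write lock.
other = sqlite3.connect(db, timeout=0.1, isolation_level=None)
blocked = False
try:
    other.execute("BEGIN IMMEDIATE")
except sqlite3.OperationalError:            # "database is locked"
    blocked = True

writer.execute("COMMIT")
print("second writer blocked while first held the lock:", blocked)
```

On a local filesystem the second `BEGIN IMMEDIATE` fails while the first transaction is open. Over NFS, this protection depends on the mount honouring POSIX file locks, which is what would need to be verified.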
OK, so true HA for orchestrator would be hard to get up and running with Docker Swarm if split-brain handling in the underlying storage is uncertain. I also considered running orchestrator in raft mode in Docker Swarm, but with that comes a few new challenges:
So, I believe the only way to run orchestrator in a Docker Swarm right now is to run orchestrator as a single instance and have an underlying storage which is somehow capable of resolving split-brain situations.
Agreed on the limitations of preset hostnames/IPs. Possibly I will address that.
I was trying to set up Orchestrator in Docker Swarm in raft mode and thought it worked well with a setup where, in my case, 3 orchestrator services were deployed as separate services (orchestrator01-03) instead of one service with 3 replicas. This way initial service discovery is predictable. It would be amazing if Orchestrator tried to re-resolve FQDNs if it loses connection with other raft nodes.
Related: #253 |
So, in order for orchestrator to be able to run on top of Docker Swarm, I assume some combination of raft and dynamic discovery of orchestrator peers would have to be supported. Docker Swarm allows any task to discover the IPs of its replicas using a DNS lookup on `tasks.<service-name>`.

If the IP address of the task which did the DNS query should be filtered out of the list of IP addresses in the DNS answer, one could possibly do this by gathering the IP addresses associated with network interfaces local to the task which did the DNS query (a task can be a member of multiple networks). From there, any IP address local to the task could be excluded from the list of IP addresses in the DNS answer.

Tasks may come and go as a service is scaled up and down, but I guess this holds true for any raft topology. That is, a raft topology will continuously have to agree on the peers available, and which of the peers to elect as a leader. So, the list of IP addresses for peers would not be limited to static IPs, but would also support lookup of hostnames that could return multiple IP addresses. The list of IP addresses which the provided hostname resolves to could change as the service is scaled up and down.

I'm not too familiar with Kubernetes, but if Kubernetes supports a similar type of service discovery, this may open up for HA-enabling orchestrator on Kubernetes as well?
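The filtering idea above can be sketched as follows. `tasks.<service>` is Docker Swarm's built-in DNS name returning the IPs of all tasks of a service; `discover_peers`, `local_addresses` and `filter_self` are hypothetical helper names, not orchestrator functions:

```python
import socket

def local_addresses() -> set:
    """Collect IP addresses bound to this task's own hostname.
    (A task can be a member of several networks.)"""
    try:
        return set(socket.gethostbyname_ex(socket.gethostname())[2])
    except socket.gaierror:
        return set()

def filter_self(peer_ips, own_ips):
    """Drop our own addresses from the DNS answer, leaving only raft peers."""
    return sorted(ip for ip in peer_ips if ip not in own_ips)

def discover_peers(service_name: str):
    """Resolve tasks.<service> (Docker Swarm DNS) and exclude ourselves."""
    ips = socket.gethostbyname_ex("tasks." + service_name)[2]
    return filter_self(ips, local_addresses())

# With a swarm service named "orchestrator", each replica would call
# discover_peers("orchestrator"). The pure filtering step looks like:
print(filter_self(["10.0.1.2", "10.0.1.3", "10.0.1.4"], {"10.0.1.3"}))
```

The caveat, as discussed above, is that this answer changes whenever the service is scaled, so the raft layer would need to re-resolve and reconcile membership continuously.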
@shlomi-noach I can see that some work has been done in hashicorp/raft to move away from IP addresses and use server IDs to manage the peers in a raft topology. I am not familiar with all the details of this, but maybe that is a step in the right direction. Assuming server IDs would be unique to each instance of orchestrator (e.g. generated upon scheduling/startup of a task replica), I'm not sure how "gone" service replicas in Docker Swarm (gone due to rescheduling, scaling, moving, dead hosts, etc.) would be cleaned up, though. Would they just linger as "gone" peers in the list of orchestrator peers?

Before I go on with this: is all this anything you would consider implementing in orchestrator? That is, HA-enabling orchestrator by letting it run on top of Docker Swarm, Kubernetes or some other container orchestration tool?
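One way the "gone peer" cleanup could look is a TTL on server IDs that stop announcing themselves. This is a purely illustrative sketch with a hypothetical `PeerRegistry`; it is not the orchestrator or hashicorp/raft API, and real membership changes would still have to go through raft itself:

```python
import time

class PeerRegistry:
    """Track raft peers by server ID; expire IDs not seen for `ttl` seconds."""

    def __init__(self, ttl: float = 60.0):
        self.ttl = ttl
        self.last_seen = {}   # server_id -> (address, timestamp)

    def observe(self, server_id: str, address: str, now: float = None):
        """Record a peer announcement (e.g. on each heartbeat)."""
        ts = now if now is not None else time.time()
        self.last_seen[server_id] = (address, ts)

    def live_peers(self, now: float = None):
        """Peers seen within the TTL; stale 'gone' replicas drop out."""
        now = now if now is not None else time.time()
        return {sid: addr for sid, (addr, ts) in self.last_seen.items()
                if now - ts <= self.ttl}

reg = PeerRegistry(ttl=60)
reg.observe("a1", "10.0.1.2:10008", now=0)
reg.observe("b2", "10.0.1.3:10008", now=0)
reg.observe("a1", "10.0.1.2:10008", now=100)  # a1 keeps announcing itself
print(reg.live_peers(now=100))                # b2 has gone quiet and is dropped
```

Under this scheme a rescheduled replica simply shows up with a new server ID, and the old ID ages out instead of lingering forever.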
@sbrattla correct. That's why I linked #253, which discusses the use of IDs, and also mentions hashicorp/raft#236.
I don't know yet. I'm not familiar with Docker swarm.
The general answer is: "yes, I want to solve that", but that depends on my priorities. So I can make no promises, unfortunately.
Thanks @shlomi-noach. I'm trying to follow up on #253, but running each orchestrator node behind its own service in Docker Swarm unfortunately seems to break because...
Anyway, I'm grateful for your efforts, and very much hoping that enough people are interested in this feature for you to want to allocate some of your time to it :-)
I've just recently found out about orchestrator, and I have been testing it out trying to get familiar with it.
There are multiple ways to run orchestrator, and raft appears to be the simplest way to achieve HA. However, right now I'm running orchestrator as a (single task/instance) service in a Docker Swarm. Is there anything which speaks against this way of achieving HA for orchestrator?
In essence, the HA part is delegated to Docker Swarm, and orchestrator runs with the default SQLite backend. The database directory is bind mounted from a host directory (which in turn is a mounted network share), so data will persist.
Ignoring any performance aspects of this setup (underlying storage for database is a network share), is there anything about this setup which would/could break orchestrator?