
📚 [Documentation]: Doc related to load balancing #366

Open
huard opened this issue Aug 18, 2023 · 3 comments
Labels
documentation Improvements or additions to documentation

Comments


huard commented Aug 18, 2023

Description

The pavics-sdi documentation includes a section on load balancing that is probably obsolete (see below).

An updated version should probably be included in the docs here instead, since I suggest removing the corresponding content from pavics-sdi.

==============
Load balancing
==============

Here we'll cover the case where pavics-sdi is installed on more than one machine and you want to balance the load across these machines. This is done with `NGINX`_ and requires modifying :file:`docker-compose.yml` and creating a configuration file for the `NGINX`_ server.

Modifying the :file:`docker-compose.yml`
========================================

To enable load balancing, we need a proxy to redirect requests to machines according to their usage. This is done by mapping proxy ports (5XXXX) to the *service* ports, such as those of flyingpigeon (8093) and malleefowl (8091).

.. code-block:: yaml
   :caption: :file:`docker-compose.yml`

   proxy:
     image: nginx
     ports:
       - "58094:8094"
       - "58093:8093"
       - "58091:8091"
     volumes:
       - ./config/proxy/conf.d:/etc/nginx/conf.d
       - ./config/proxy/nginx.conf:/etc/nginx/nginx.conf
     restart: always
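The host ports above appear to follow a simple convention: prepend a ``5``, i.e. host port = 50000 + service port. As a purely illustrative sketch (the helper name below is ours, not part of the stack), the ``ports:`` entries can be generated mechanically:

```python
# Generate compose-style port mappings following the 5XXXX convention
# used above: the proxy's host port is 50000 + the service's internal port.
SERVICE_PORTS = {"flyingpigeon": 8093, "malleefowl": 8091}

def port_mapping(service_port: int) -> str:
    """Return a '"host:container"' entry for docker-compose's ports list."""
    return f'"{50000 + service_port}:{service_port}"'

for name, port in SERVICE_PORTS.items():
    # e.g. flyingpigeon -> "58093:8093"
    print(f'{name}: - {port_mapping(port)}')
```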



Modifying the Nginx configuration
---------------------------------

In the :file:`config/proxy` directory, there should be a file named :file:`nginx.conf`. This file can be edited, for example, to set the number of *worker_processes*. The :file:`conf.d` directory contains an additional configuration file for each *load-balanced* service, for example :file:`flyingpigeon.conf`, which would look like:

.. code-block:: nginx
   :caption: :file:`config/proxy/conf.d/flyingpigeon.conf`

   upstream flyingpigeon {
       hash $http_machineid;
       server <server1 url>:8093;
       server <server2 url>:8093;
       server <server3 url>:8093;
   }
   server {
       listen 8093;
       location / {
           proxy_pass http://flyingpigeon;
       }
   }

This tells the proxy, listening on port 8093, to redirect requests to server 1, 2, or 3 according to the ``machineid`` argument passed in the request header. That is, requests with the same ``machineid`` are always sent to the same server. This matters because output files are not automatically visible to all servers: if, for example, process A downloads a file from a remote server and process B subsets that file, both have to run on the same machine, otherwise process B won't find the downloaded file.
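As an illustration of this pinning behaviour, here is a small Python sketch. It is not nginx's actual hash implementation, just the same idea: a deterministic hash of the header value selects the backend, so one ``machineid`` always maps to one server (the server names are hypothetical).

```python
import hashlib

# Hypothetical backend pool mirroring the three upstream servers above.
SERVERS = ["server1:8093", "server2:8093", "server3:8093"]

def pick_backend(machineid: str) -> str:
    """Deterministically map a MachineID header value to one backend,
    analogous in spirit to nginx's `hash $http_machineid` directive
    (nginx uses its own hash function; md5 here is only illustrative)."""
    digest = int(hashlib.md5(machineid.encode()).hexdigest(), 16)
    return SERVERS[digest % len(SERVERS)]

# The same machineid always lands on the same backend, so chained
# processes that share a machineid also share that machine's files.
assert pick_backend("client-A") == pick_backend("client-A")
```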

.. note::

   * Server configuration is static
   * It is not possible to assign port numbers to environment variables (e.g., ``$PORT_NUMBER``)
   * When you change a configuration file and restart NGINX to pick up the new configuration, it implements a *graceful restart*. Both the old and new copies of NGINX run side-by-side for a short period of time. The old processes don’t accept any new connections and terminate once all their existing connections terminate.

.. _NGINX: https://nginx.org

tlvu commented Aug 18, 2023

Interesting, another way to scale across organizations! The different servers can be from different organizations.

@mishaschwartz

> Interesting, another way to scale across organizations! The different servers can be from different organizations.

We would have to think hard about the authn/z implications of this, though. We should think about how this will interact with the proposal here: DACCS-Climate/DACCS-executive-committee#8

@fmigneault

Each ``location`` block of the distinct servers could have its respective ``auth_request`` sending a sub-request toward the relevant Magpie instance, which would perform a login. If the login is successful (returns 200), ``auth_request_set`` can then be used to set a header (e.g., the Magpie cookie) on the subsequent request toward the service.
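That pattern might look roughly like the following. This is only a sketch: the ``/magpie-verify`` path and the Magpie host are hypothetical placeholders, and the cookie handling would need to match Magpie's actual API.

```nginx
location /flyingpigeon/ {
    # Sub-request to this organization's Magpie; a 2xx response lets
    # the original request through, 401/403 rejects it.
    auth_request /magpie-verify;
    # Capture a header from the auth sub-request's response (e.g. the
    # Magpie cookie) and forward it to the service.
    auth_request_set $magpie_cookie $upstream_http_set_cookie;
    proxy_set_header Cookie $magpie_cookie;
    proxy_pass http://flyingpigeon;
}

location = /magpie-verify {
    internal;
    proxy_pass http://magpie.example.org/verify;  # hypothetical endpoint
    proxy_pass_request_body off;
    proxy_set_header Content-Length "";
}
```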

Something like this: https://serverfault.com/a/950019

Docs: http://nginx.org/en/docs/http/ngx_http_auth_request_module.html#auth_request_set
