Decentralized AI in Healthcare (DecAIHealth) is a project under the umbrella of Information Driven Healthcare at AI Sweden. The project includes the two partners Region Halland/Halmstad University (RH/HU) and Västra Götalandsregionen/Sahlgrenska University Hospital (VGR/SU), and is coordinated by AI Sweden.
The overarching purpose of the project is to evaluate the possibilities for jointly training and exchanging machine learning models between Swedish healthcare regions/hospitals. For this purpose, methods for decentralized training of joint machine learning models between the two healthcare regions will be used, for example federated learning.
The project includes three main phases (initially two phases, but now extended to also include an initial Phase 0):
- Phase 0 establishes the project's technical feasibility through a "proof-of-concept" showing that the network communication is working.
- Phase 1 verifies that decentralized machine learning models can be jointly trained based on publicly available healthcare datasets, such as the tabular dataset MIMIC-IV and the image dataset SIIM-ISIC.
- Phase 2 is initiated by a mutual agreement on a clinical dataset and beneficial machine learning models, followed by decentralized training and validation of those models based on (both regions' own) clinical healthcare data.
The project will last until the end of 2022, and a tentative time plan can be found below. Note, however, that this time plan might be subject to change (by mutual agreement between all partners in the project). In addition, the time plan will be updated to reflect the progress of the project (by indicating completed and remaining tasks).
| Id. | Date | Description | Completed | Required |
|---|---|---|---|---|
| 1 | 2022-04-22 | SU: "Dummy" server exposed externally through a fixed IP address and network port. | ✓ | ✓ |
| 2 | 2022-04-22 | Phase 0 completed: RH verifies that an arbitrary client is able to communicate with the server at SU. | ✓ | ✓ |
| 3 | 2022-04-22 | Flower framework installed on machines with minimal requirements (according to Hardware Requirements) at both RH and SU. Installation verified by a jointly trained model according to Simple Example. | ✓ | ✓ |
| 4 | 2022-05-06 | Flower framework installed on machines with requested requirements (according to Hardware Requirements) at both RH and SU. Installation verified by a jointly trained model according to MNIST Test Guide. | ✓ | ✓ |
| 5 | 2022-05-13 | Decentralized model jointly trained on a public imagery dataset (e.g., SIIM-ISIC). Model trained and validated according to ISIC Test Guide. | ✓ | ✓ |
| 6 | 2022-06-03 | Decentralized model jointly trained on a public tabular dataset (e.g., MIMIC-IV). Model trained and validated according to MIMIC Test Guide. | ✗ | ✗ |
| 7 | 2022-06-10 | Phase 1 completed: test report, based on validation of jointly trained decentralized models, added to this repository. | ✓ | ✓ |
| 8 | 2022-06-30 | HU: An initial draft of an application for ethical review. | ✗ | ✓ |
The principles of federated learning (as it is known today) were initially proposed by a research team at Google [1]. Federated learning is essentially a machine learning technique for training algorithms across multiple decentralized devices (or clients) without exchanging any data samples between the clients. In contrast to traditional centralized machine learning techniques (where datasets are uploaded to a server and trained centrally), and to classical decentralized approaches (which often assume that local datasets are distributed among the clients), federated learning instead promotes the idea of training models locally (on local datasets) and only exchanging and distributing the parameters (or weights) of the locally trained models.
In practice, federated learning is a client-server approach consisting of a federated server and two (or more) federated clients. Each federated client trains a local model for one (or a few) epochs, and each locally trained model's parameters (or weights) are then sent to the federated server. Next, the federated server aggregates a joint global model (through a particular aggregation function). Subsequently, the aggregated global model is sent back to each federated client, whereby the training continues locally. The training cycle of federated learning (also referred to as a federated round) is conceptually illustrated in Figure 1. This training cycle is repeated until the global model has converged.
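To make the aggregation step concrete, the sketch below shows a minimal version of federated averaging (FedAvg) [1] in plain NumPy, where each client's parameters are weighted by its share of the total training data. The function name and the toy data are illustrative assumptions, not code from this repository:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Aggregate client models by a data-size-weighted average (FedAvg).

    client_weights: one list of ndarrays (one per layer) per client.
    client_sizes:   number of local training samples per client.
    """
    total = sum(client_sizes)
    aggregated = []
    for layer in range(len(client_weights[0])):
        # Weight each client's layer parameters by its share of the data.
        layer_sum = sum(
            (n / total) * weights[layer]
            for weights, n in zip(client_weights, client_sizes)
        )
        aggregated.append(layer_sum)
    return aggregated

# Two toy "clients" holding a single-layer model each.
global_model = fedavg(
    [[np.array([1.0, 2.0])], [np.array([3.0, 4.0])]],
    client_sizes=[100, 300],
)
print(global_model)  # [array([2.5, 3.5])]
```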
The Flower framework is a user-friendly framework designed for implementing and training machine learning models in federated settings [2]. Flower is an open-source framework developed as a collaboration between the academic partners CaMLSys Lab, University of Oxford, University of Cambridge, and University College London, as well as the industrial partner Adap. The framework has been developed according to fundamental key characteristics required of a federated framework, such as scalability, usability, and independence of operating systems and hardware platforms. However, the Flower framework is more than just a federated framework, as it can be regarded as "...a unified approach to federated learning, analytics, and evaluation."
The Flower framework has been designed as an open-source, extendable, and device-agnostic framework. Furthermore, the framework has been designed to be suitable even for devices running lightweight federated learning workloads, such as Raspberry Pi or NVIDIA Jetson, with minimal or no special configuration. For this project, we have, however, identified the following hardware requirements for federated clients (at least one federated client each is required at both RH and SU):
- Minimal: a machine (physical or virtual) running Ubuntu 20.04 LTS, with at least the following specification: 4 CPU cores, 16 GB of RAM, and 100 GB of storage.
- Requested: in addition to the minimal requirements, a GPU with compute capability 6.0+ and CUDA Toolkit 11.3.
Though minimal requirements are listed above, it is recommended to proceed directly with a system installation according to the requested requirements.
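Once PyTorch has been installed (see the installation instructions below), a quick way to check whether a client machine meets the GPU requirement is a snippet along these lines (an illustrative check, not part of this repository):

```python
import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    print(f"GPU: {torch.cuda.get_device_name(0)}, compute capability {major}.{minor}")
    # The requested requirements call for compute capability 6.0 or higher.
    assert (major, minor) >= (6, 0), "GPU below requested compute capability"
else:
    print("No CUDA-capable GPU detected; only the minimal requirements are met.")
```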
Besides the machines and requirements for the federated clients, an additional machine (physical or virtual) is required to act as the federated server (only one federated server is needed, at either RH or SU, and we strongly recommend that the server is installed at SU). The minimal hardware requirements are sufficient for the federated server.
An essential prerequisite for the Flower framework is a basic installation of Python (version 3.6 or higher). The instructions below further assume an installation of the `pip3` package installer for Python 3.x. To install the latest stable version of Flower (i.e., the latest stable release found on PyPI):
```
pip3 install flwr
```
...or, to install the latest unstable release (nightly build):
```
pip3 install flwr-nightly
```
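To verify that the package can be imported, a one-line sanity check such as the following can be used (the printed version string depends on the release installed):

```
python3 -c "import flwr; print(flwr.__version__)"
```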
The Flower framework is also agnostic to which machine learning framework is used in the background (e.g., PyTorch or TensorFlow). For this project, we will, however, use the PyTorch framework for training the models locally. To install PyTorch with GPU compute support for CUDA 11.3 (note that the commands below pull preview/nightly builds, as indicated by the `--pre` flag and the nightly index URL):
```
pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cu113
```
...or, to install with CPU compute support only (also a nightly build):
```
pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
```
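To quickly verify the PyTorch installation (and whether the GPU is visible to PyTorch), the following one-liner can be used (illustrative, not part of this repository):

```
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```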
First of all, clone this repository, including its submodules:
```
git clone --recursive https://github.com/aidotse/decaihealth.git
```
From within the downloaded folder (`decaihealth`), the script files `client.py` and `server.py` can be used to verify the installation of both the Flower framework and the PyTorch machine learning framework. The script file `client.py` will download the MNIST dataset (only downloaded once and stored in a local `data` folder) and, subsequently, train a simple neural network to classify the handwritten digits found in the MNIST dataset. This client script can be used to train a neural network either locally (with the `--locally` argument) or in federated settings (without the `--locally` argument). To verify that the PyTorch machine learning framework has been installed correctly, use the client script and train a neural network locally for 15 epochs (specified by the `--epochs` argument):

```
python3 client.py --locally --load_mnist all --epochs 15
```
Running the example above will (eventually) result in a model trained to an accuracy of about 90%. Notice that the example above also uses the `--load_mnist` argument. This argument can be used to load only the even MNIST digits (`--load_mnist even`) or only the odd MNIST digits (`--load_mnist odd`).
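For reference, such a split could be implemented along the lines of the sketch below (a minimal, illustrative example using `torchvision`; the function name is an assumption, and this is not necessarily how `client.py` implements it):

```python
import torch
from torchvision import datasets, transforms

def load_mnist_subset(parity="all"):
    """Load MNIST, optionally keeping only even or odd digit classes."""
    dataset = datasets.MNIST(
        "data", train=True, download=True, transform=transforms.ToTensor()
    )
    if parity == "all":
        return dataset
    remainder = 0 if parity == "even" else 1
    # Keep only samples whose label has the requested parity.
    indices = [i for i, label in enumerate(dataset.targets) if label % 2 == remainder]
    return torch.utils.data.Subset(dataset, indices)

even_digits = load_mnist_subset("even")
print(len(even_digits))  # roughly half of the 60,000 training samples
```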
If the example above is re-run, training a neural network locally on only even or only odd digits, the model's accuracy will never reach above 50%. However, if two clients are trained in federated settings (one client trained with even digits and another client trained with odd digits), the joint global model can reach an accuracy of about 90% (i.e., an accuracy comparable to that of a model trained with all the digits).
To train two clients in federated settings, and to verify that the Flower framework has been installed correctly, use both the client and the server scripts and train a joint global model for 15 rounds (notice that federated rounds, specified by the `--rounds` argument, are here used instead of epochs):

- Start the federated server (SU-side):

  ```
  python3 server.py --rounds 15 --host <server-dns>
  ```

- Start the first federated client, which will train with even digits only (SU-side):

  ```
  python3 client.py --load_mnist even --host <server-dns>
  ```

- Start the second federated client, which will train with odd digits only (RH-side):

  ```
  python3 client.py --load_mnist odd --host <server-dns>
  ```
Notice that the server expects two federated clients to be connected (i.e., the server will not aggregate a global model until it has received local models from both clients). Also, the example above uses a `--host <server-dns>` argument. The actual DNS name of the federated server has been sent out by e-mail and will not be exposed in this repository (for obvious security reasons)!
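For readers curious about what a Flower client roughly looks like, the sketch below shows the general shape of a PyTorch-based client. It is a minimal, illustrative example, not the repository's actual `client.py`: the class name and hyperparameters are assumptions, and the exact Flower client API may differ slightly between versions.

```python
import flwr as fl
import numpy as np
import torch
import torch.nn as nn

class MnistClient(fl.client.NumPyClient):
    """Minimal Flower client: train locally, exchange only model weights."""

    def __init__(self, model, train_loader, test_loader):
        self.model = model
        self.train_loader = train_loader
        self.test_loader = test_loader
        self.loss_fn = nn.CrossEntropyLoss()

    def get_parameters(self, config=None):
        # Serialize the local model's weights as a list of NumPy arrays.
        return [val.cpu().numpy() for val in self.model.state_dict().values()]

    def set_parameters(self, parameters):
        # Load the aggregated global weights into the local model.
        keys = self.model.state_dict().keys()
        state = {k: torch.tensor(np.asarray(v)) for k, v in zip(keys, parameters)}
        self.model.load_state_dict(state)

    def fit(self, parameters, config):
        self.set_parameters(parameters)
        optimizer = torch.optim.SGD(self.model.parameters(), lr=0.01)
        self.model.train()
        for images, labels in self.train_loader:  # one local epoch per round
            optimizer.zero_grad()
            self.loss_fn(self.model(images), labels).backward()
            optimizer.step()
        return self.get_parameters(), len(self.train_loader.dataset), {}

    def evaluate(self, parameters, config):
        self.set_parameters(parameters)
        self.model.eval()
        correct, loss = 0, 0.0
        with torch.no_grad():
            for images, labels in self.test_loader:
                outputs = self.model(images)
                loss += self.loss_fn(outputs, labels).item()
                correct += (outputs.argmax(dim=1) == labels).sum().item()
        n = len(self.test_loader.dataset)
        return loss / len(self.test_loader), n, {"accuracy": correct / n}

# Connecting to the federated server would then look something like:
# fl.client.start_numpy_client(server_address="<server-dns>:8080",
#                              client=MnistClient(model, train_loader, test_loader))
```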
See guide in file: `mnist_test_guide.md`
See guide in file: `isic_test_guide.md`
It is not essential to develop and test a model for an open tabular dataset, as a model for tabular data already exists at RH.
[1] B. McMahan, et al. Communication-efficient learning of deep networks from decentralized data. In Artificial intelligence and statistics, pp. 1273–1282, PMLR, 2017.
[2] D. J. Beutel, et al. Flower: A friendly federated learning research framework, 2021.