add transmitter, client_choose, aggregation interface

1. add transmitter, client_choose, aggregation interface to Lib.
2. add example of how to use new added interface.

Signed-off-by: Jie Pu <pujie2@huawei.com>
Signed-off-by: XinYao1994 <xyao@cs.hku.hk>
kubeedge · Sep 3, 2021 · cabdfe0
1 parent 6f14617
Showing 13 changed files with 714 additions and 5 deletions.
diff --git a/build/crd-samples/sedna/federatedlearningjob_yolo_v1alpha1.yaml b/build/crd-samples/sedna/federatedlearningjob_yolo_v1alpha1.yaml
@@ -0,0 +1,54 @@
+apiVersion: sedna.io/v1alpha1
+kind: FederatedLearningJob
+metadata:
+  name: yolo-v5
+spec:
+  pretrainedModel: # optional
+    name: "yolo-v5-pretrained-model"
+  transmitter: # optional
+    ws: { } # optional, used by default
+    s3: # optional, but at least one transmitter is required
+      aggDataPath: "s3://sedna/fl/aggregation_data"
+      credentialName: mysecret
+  aggregationWorker:
+    model:
+      name: "yolo-v5-model"
+    template:
+      spec:
+        nodeName: "sedna-control-plane"
+        containers:
+          - image: kubeedge/sedna-fl-aggregation:mistnetyolo
+            name: agg-worker
+            imagePullPolicy: IfNotPresent
+            env: # user defined environments
+              - name: "cut_layer"
+                value: "4"
+              - name: "epsilon"
+                value: "100"
+              - name: "aggregation_algorithm"
+                value: "mistnet"
+              - name: "batch_size"
+            resources: # user defined resources
+              limits:
+                memory: 8Gi
+  trainingWorkers:
+    - dataset:
+        name: "coco-dataset"
+      template:
+        spec:
+          nodeName: "edge-node"
+          containers:
+            - image: kubeedge/sedna-fl-train:mistnetyolo
+              name: train-worker
+              imagePullPolicy: IfNotPresent
+              args: [ "-i", "1" ]
+              env: # user defined environments
+                - name: "batch_size"
+                  value: "32"
+                - name: "learning_rate"
+                  value: "0.001"
+                - name: "epochs"
+                  value: "1"
+              resources: # user defined resources
+                limits:
+                  memory: 2Gi
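
Review note: Kubernetes delivers every `env` value in the sample above to the container as a string, so a worker has to convert `cut_layer`, `epsilon`, and `batch_size` itself. How the Sedna workers actually read these values is not part of this diff; the sketch below is only an illustration, and the function name `read_hyperparameters` and its defaults are assumptions.

```python
import os


def read_hyperparameters(env=None):
    """Parse worker hyperparameters from environment variables.

    Hypothetical helper, not part of Sedna. Every value arrives as a
    string, so each one is converted explicitly. Defaults are illustrative.
    """
    env = os.environ if env is None else env
    return {
        "cut_layer": int(env.get("cut_layer", "4")),
        "epsilon": float(env.get("epsilon", "100")),
        "aggregation_algorithm": env.get("aggregation_algorithm", "mistnet"),
        # An unset or empty value falls back to the default.
        "batch_size": int(env.get("batch_size") or "32"),
    }


if __name__ == "__main__":
    print(read_hyperparameters())
```

Passing a plain dict instead of `os.environ` keeps the parsing testable outside a container.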
diff --git a/examples/federated_learning/yolov5_coco128_mistnet/README.md b/examples/federated_learning/yolov5_coco128_mistnet/README.md
@@ -0,0 +1,238 @@
+# Collaboratively Train Yolo-v5 Using MistNet on COCO128 Dataset
+
+This case introduces how to train a federated learning job with an aggregation algorithm named MistNet in a Yolo-v5
+object detection scenario on the COCO128 dataset. Data is scattered in different places (such as edge nodes, cameras,
+and others) and cannot be aggregated at the server due to data privacy and bandwidth constraints. As a result, we
+cannot use all the data for training. In some cases, edge nodes have limited computing resources or no training
+capability at all, so the edge cannot produce updated weights from a training process. Therefore, traditional
+algorithms (e.g., federated averaging), which aggregate the updated weights trained by different edge clients, cannot
+work in this scenario. MistNet is proposed to address this issue.
+
+MistNet partitions a DNN model into two parts: a lightweight feature extractor at the edge side that generates
+meaningful features from the raw data, and a classifier containing most of the model layers in the cloud that is
+iteratively trained for specific tasks. MistNet achieves acceptable model utility while greatly reducing privacy
+leakage from the released intermediate features.
+
+## Object Detection Experiment
+
+> Assume that there are two edge nodes and a cloud node. Data on the edge nodes cannot be migrated to the cloud due to privacy issues.
+> Based on this scenario, we will demonstrate the Yolo-v5 example.
+
+### Prepare Nodes
+
+```
+CLOUD_NODE="cloud-node-name"
+EDGE1_NODE="edge1-node-name"
+EDGE2_NODE="edge2-node-name"
+```
+
+### Install Sedna
+
+Follow the [Sedna installation document](/docs/setup/install.md) to install Sedna.
+
+### Prepare Dataset
+
+Download the [dataset](https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip) and partition the data:
+
+```
+wget https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip
+unzip coco128.zip -d data
+rm coco128.zip
+python partition.py ./data 2
+```
+
+move ```./data/1``` to `/data` of ```EDGE1_NODE```.
+
+```
+mkdir -p /data
+cd /data
+mv ./data/1 ./
+```
+
+move ```./data/2``` to `/data` of ```EDGE2_NODE```.
+
+```
+mkdir -p /data
+cd /data
+mv ./data/2 ./
+```
+
+### Prepare Images
+
+This example uses these images:
+
+1. aggregation worker: ```kubeedge/sedna-example-federated-learning-mistnet:v0.3.0```
+2. train worker: ```kubeedge/sedna-example-federated-learning-mistnet-client:v0.3.0```
+
+These images are generated by the script [build_image.sh](/examples/build_image.sh).
+
+### Create Federated Learning Job
+
+#### Create Dataset
+
+create dataset for `$EDGE1_NODE`
+
+```
+kubectl create -f - <<EOF
+apiVersion: sedna.io/v1alpha1
+kind: Dataset
+metadata:
+  name: "coco-dataset"
+spec:
+  url: "/data/test.txt"
+  format: "txt"
+  nodeName: $EDGE1_NODE
+EOF
+```
+
+create dataset for `$EDGE2_NODE`
+
+```
+kubectl create -f - <<EOF
+apiVersion: sedna.io/v1alpha1
+kind: Dataset
+metadata:
+  name: "coco-dataset"
+spec:
+  url: "/data/test.txt"
+  format: "txt"
+  nodeName: $EDGE2_NODE
+EOF
+```
+
+#### Create Model
+
+create the directory `/model` in the host of `$EDGE1_NODE`
+
+```
+mkdir /model
+```
+
+create the directory `/model` in the host of `$EDGE2_NODE`
+
+```
+mkdir /model
+```
+
+```
+TODO: put pretrained model on nodes.
+```
+
+create model
+
+```
+kubectl create -f - <<EOF
+apiVersion: sedna.io/v1alpha1
+kind: Model
+metadata:
+  name: "yolo-v5-model"
+spec:
+  url: "/model/yolo.pb"
+  format: "pb"
+EOF
+```
+
+#### Start Federated Learning Job
+
+```
+kubectl create -f - <<EOF
+apiVersion: sedna.io/v1alpha1
+kind: FederatedLearningJob
+metadata:
+  name: mistnet-on-mnist-dataset
+spec:
+  stopCondition:
+    operator: "or" # and
+    conditions:
+      - operator: ">"
+        threshold: 100
+        metric: rounds
+      - operator: ">"
+        threshold: 0.95
+        metric: targetAccuracy
+      - operator: "<"
+        threshold: 0.03
+        metric: deltaLoss
+  aggregationTrigger:
+    condition:
+      operator: ">"
+      threshold: 5
+      metric: num_of_ready_clients
+  aggregationWorker:
+    model:
+      name: "yolo-v5-model"
+    template:
+      spec:
+        nodeName: $CLOUD_NODE
+        containers:
+          - image: kubeedge/sedna-example-federated-learning-mistnet-on-mnist-dataset-aggregation:v0.4.0
+            name:  agg-worker
+            imagePullPolicy: IfNotPresent
+            env: # user defined environments
+              - name: "cut_layer"
+                value: "4"
+              - name: "epsilon"
+                value: "100"
+              - name: "aggregation_algorithm"
+                value: "mistnet"
+              - name: "batch_size"
+                value: "10"
+            resources:  # user defined resources
+              limits:
+                memory: 2Gi
+  trainingWorkers:
+    - dataset:
+        name: "edge1-surface-defect-detection-dataset"
+      template:
+        spec:
+          nodeName: $EDGE1_NODE
+          containers:
+            - image: kubeedge/sedna-example-federated-learning-mistnet-on-mnist-dataset-train:v0.4.0
+              name:  train-worker
+              imagePullPolicy: IfNotPresent
+              env:  # user defined environments
+                - name: "batch_size"
+                  value: "32"
+                - name: "learning_rate"
+                  value: "0.001"
+                - name: "epochs"
+                  value: "2"
+              resources:  # user defined resources
+                limits:
+                  memory: 2Gi
+    - dataset:
+        name: "edge2-surface-defect-detection-dataset"
+      template:
+        spec:
+          nodeName: $EDGE2_NODE
+          containers:
+            - image: kubeedge/sedna-example-federated-learning-mistnet-on-mnist-dataset-train:v0.4.0
+              name:  train-worker
+              imagePullPolicy: IfNotPresent
+              env:  # user defined environments
+                - name: "batch_size"
+                  value: "32"
+                - name: "learning_rate"
+                  value: "0.001"
+                - name: "epochs"
+                  value: "2"
+              resources:  # user defined resources
+                limits:
+                  memory: 2Gi
+EOF
+```
+
+```
+TODO: show the benefit of MistNet, for example by comparing the results of FedAvg and MistNet.
+
+```
+
+### Check Federated Learning Status
+
+```
+kubectl get federatedlearningjob mistnet-on-mnist-dataset
+```
+
+### Check Federated Learning Train Result
+
+After the job completes, you will find the generated model in the directory `/model` on `$EDGE1_NODE` and `$EDGE2_NODE`.
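
Review note: the README describes MistNet as splitting a model into an edge-side feature extractor and a cloud-side classifier, with only intermediate features leaving the edge. The toy sketch below illustrates that split with plain Python lists. It is not the Sedna or MistNet implementation: the helper names are hypothetical, and treating `epsilon` as the inverse scale of Laplace noise added to the released features is an assumption suggested by the `cut_layer` and `epsilon` variables in the job spec.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential samples is Laplace-distributed.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def split_model(layers, cut_layer):
    """Split a list of layer functions into (edge part, cloud part)."""
    return layers[:cut_layer], layers[cut_layer:]


def run_layers(layers, x):
    for layer in layers:
        x = layer(x)
    return x


def edge_extract(edge_layers, x, epsilon, rng):
    """Run the edge-side layers and perturb the features before release."""
    features = run_layers(edge_layers, x)
    scale = 1.0 / epsilon  # assumption: larger epsilon means less noise
    return [f + laplace_noise(scale, rng) for f in features]


# Toy "layers" acting on a list of floats.
layers = [
    lambda v: [2.0 * a for a in v],
    lambda v: [a + 1.0 for a in v],
    lambda v: [max(a, 0.0) for a in v],
    lambda v: [sum(v)],
]

edge_layers, cloud_layers = split_model(layers, cut_layer=2)
rng = random.Random(0)
# Only the noisy features leave the edge; the raw input stays local.
noisy_features = edge_extract(edge_layers, [1.0, -3.0], epsilon=100, rng=rng)
prediction = run_layers(cloud_layers, noisy_features)
```

In this sketch the cloud part never sees the raw input `[1.0, -3.0]`, only the perturbed output of the first two layers.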
diff --git a/examples/federated_learning/yolov5_coco128_mistnet/aggregate.py b/examples/federated_learning/yolov5_coco128_mistnet/aggregate.py
@@ -0,0 +1,35 @@
+# Copyright 2021 The KubeEdge Authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from interface import mistnet, s3_transmitter, simple_chooser
+from interface import Dataset, Estimator
+from sedna.service.server import AggregationServer
+
+
+def run_server():
+    data = Dataset()
+    estimator = Estimator()
+
+    server = AggregationServer(
+        data=data,
+        estimator=estimator,
+        aggregation=mistnet,
+        transmitter=s3_transmitter,
+        chooser=simple_chooser)
+
+    server.start()
+
+
+if __name__ == '__main__':
+    run_server()
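
Review note: for contrast with the `aggregation=mistnet` argument above, the README mentions federated averaging as the traditional approach that aggregates client weights. A minimal sketch of that weighted average is shown below. It is a generic illustration only, not the `interface` module imported by `aggregate.py` (which is not shown here), and the function name `federated_average` is hypothetical.

```python
def federated_average(client_updates):
    """Weighted average of client weights.

    client_updates: list of (weights, num_samples) pairs, where weights is a
    list of floats with the same length for every client.
    """
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    size = len(client_updates[0][0])
    averaged = [0.0] * size
    for weights, n in client_updates:
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its sample count.
            averaged[i] += w * n / total
    return averaged


updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30)]
result = federated_average(updates)
```

This kind of aggregation needs every client to train and upload weights, which is the requirement the README says MistNet is meant to avoid.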