add transmitter, client_choose, aggregation interface
1. add transmitter, client_choose, aggregation interface to Lib.
2. add example of how to use new added interface.

Signed-off-by: Jie Pu <pujie2@huawei.com>
Signed-off-by: XinYao1994 <xyao@cs.hku.hk>
jaypume committed Sep 3, 2021
1 parent 6f14617 commit cabdfe0
Showing 13 changed files with 714 additions and 5 deletions.
54 changes: 54 additions & 0 deletions build/crd-samples/sedna/federatedlearningjob_yolo_v1alpha1.yaml
@@ -0,0 +1,54 @@
apiVersion: sedna.io/v1alpha1
kind: FederatedLearningJob
metadata:
  name: yolo-v5
spec:
  pretrainedModel: # option
    name: "yolo-v5-pretrained-model"
  transmitter: # option
    ws: { } # option, by default
    s3: # option, but at least one
      aggDataPath: "s3://sedna/fl/aggregation_data"
      credentialName: mysecret
  aggregationWorker:
    model:
      name: "yolo-v5-model"
    template:
      spec:
        nodeName: "sedna-control-plane"
        containers:
          - image: kubeedge/sedna-fl-aggregation:mistnetyolo
            name: agg-worker
            imagePullPolicy: IfNotPresent
            env: # user defined environments
              - name: "cut_layer"
                value: "4"
              - name: "epsilon"
                value: "100"
              - name: "aggregation_algorithm"
                value: "mistnet"
              - name: "batch_size"
            resources: # user defined resources
              limits:
                memory: 8Gi
  trainingWorkers:
    - dataset:
        name: "coco-dataset"
      template:
        spec:
          nodeName: "edge-node"
          containers:
            - image: kubeedge/sedna-fl-train:mistnetyolo
              name: train-worker
              imagePullPolicy: IfNotPresent
              args: [ "-i", "1" ]
              env: # user defined environments
                - name: "batch_size"
                  value: "32"
                - name: "learning_rate"
                  value: "0.001"
                - name: "epochs"
                  value: "1"
              resources: # user defined resources
                limits:
                  memory: 2Gi
238 changes: 238 additions & 0 deletions examples/federated_learning/yolov5_coco128_mistnet/README.md
@@ -0,0 +1,238 @@
# Collaboratively Train Yolo-v5 Using MistNet on COCO128 Dataset

This example shows how to train a federated learning job with an aggregation algorithm named MistNet in a COCO128
object detection scenario. Data is scattered across different places (such as edge nodes, cameras, and others) and
cannot be aggregated at the server because of data privacy and bandwidth constraints. As a result, we cannot use all
the data for training. In some cases, edge nodes have limited computing resources or no training capability at all, so
the edge cannot produce updated weights from the training process. Therefore, traditional algorithms (e.g., federated
averaging), which aggregate the updated weights trained by different edge clients, cannot work in this scenario.
MistNet is proposed to address this issue.

MistNet partitions a DNN model into two parts: a lightweight feature extractor on the edge side that generates
meaningful features from the raw data, and a classifier containing most of the model layers in the cloud that is
iteratively trained for specific tasks. MistNet achieves acceptable model utility while greatly reducing privacy
leakage from the released intermediate features.
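
The split described above can be illustrated with a minimal sketch. It is not the MistNet or Sedna implementation: the toy layers, the function names, and the use of Laplace noise scaled by `1 / epsilon` are all assumptions made for illustration. Only the ideas of a `cut_layer` and an `epsilon` privacy parameter come from this example's configuration.

```python
import math
import random


def make_layers():
    """Return a toy 'model' as an ordered list of layer functions (hypothetical)."""
    return [
        lambda x: [2.0 * v for v in x],      # layer 0: scale
        lambda x: [max(0.0, v) for v in x],  # layer 1: ReLU
        lambda x: [v + 1.0 for v in x],      # layer 2: bias
        lambda x: [sum(x)],                  # layer 3: head
    ]


def split_model(layers, cut_layer):
    """Split the layers into an edge-side extractor and a cloud-side classifier."""
    return layers[:cut_layer], layers[cut_layer:]


def run(layers, x):
    """Apply the layers in order."""
    for layer in layers:
        x = layer(x)
    return x


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) by inverse-CDF sampling (assumed noise model)."""
    u = rng.random() - 0.5
    tail = max(1e-12, 1.0 - 2.0 * abs(u))  # guard against log(0)
    return -scale * math.copysign(1.0, u) * math.log(tail)


def edge_forward(extractor, x, epsilon, rng):
    """Edge side: extract features, then perturb them before release."""
    features = run(extractor, x)
    scale = 1.0 / epsilon  # smaller epsilon means more noise
    return [f + laplace_noise(scale, rng) for f in features]


def cloud_forward(classifier, features):
    """Cloud side: run the remaining layers on the released features."""
    return run(classifier, features)


if __name__ == "__main__":
    rng = random.Random(0)
    extractor, classifier = split_model(make_layers(), cut_layer=2)
    released = edge_forward(extractor, [1.0, -2.0], epsilon=100.0, rng=rng)
    print(cloud_forward(classifier, released))
```

In this sketch only the perturbed features leave the edge; the raw input never does, which is the property the paragraph above attributes to MistNet.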

## Object Detection Experiment

> Assume that there are two edge nodes and a cloud node. Data on the edge nodes cannot be migrated to the cloud due to privacy issues.
> Based on this scenario, we will demonstrate the YOLOv5 object detection example.

### Prepare Nodes

```
CLOUD_NODE="cloud-node-name"
EDGE1_NODE="edge1-node-name"
EDGE2_NODE="edge2-node-name"
```

### Install Sedna

Follow the [Sedna installation document](/docs/setup/install.md) to install Sedna.

### Prepare Dataset

Download the [dataset](https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip) and partition it:

```
wget https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip
unzip coco128.zip -d data
rm coco128.zip
python partition.py ./data 2
```
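
The contents of `partition.py` are not shown here. As a rough sketch of what a two-way split could look like, the following assumes a round-robin assignment of file names to parts numbered from 1 (matching the `./data/1` and `./data/2` directories used below). The function name and the round-robin strategy are hypothetical and may differ from the actual script.

```python
def partition(items, num_parts):
    """Split items into num_parts lists, round-robin, keyed 1..num_parts."""
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    parts = {i: [] for i in range(1, num_parts + 1)}
    # Sort first so the split is deterministic regardless of listing order.
    for index, item in enumerate(sorted(items)):
        parts[index % num_parts + 1].append(item)
    return parts


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(partition(["c.jpg", "a.jpg", "b.jpg"], 2))
```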

Move `./data/1` to `/data` on `$EDGE1_NODE`:

```
mkdir -p /data
mv ./data/1 /data/
```

Move `./data/2` to `/data` on `$EDGE2_NODE`:

```
mkdir -p /data
mv ./data/2 /data/
```

### Prepare Images

This example uses these images:

1. aggregation worker: ```kubeedge/sedna-example-federated-learning-mistnet:v0.3.0```
2. train worker: ```kubeedge/sedna-example-federated-learning-mistnet-client:v0.3.0```

These images are generated by the script [build_image.sh](/examples/build_image.sh).

### Create Federated Learning Job

#### Create Dataset

Create the dataset for `$EDGE1_NODE`:

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: "coco-dataset-1"
spec:
  url: "/data/test.txt"
  format: "txt"
  nodeName: $EDGE1_NODE
EOF
```

Create the dataset for `$EDGE2_NODE`:

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: "coco-dataset-2"
spec:
  url: "/data/test.txt"
  format: "txt"
  nodeName: $EDGE2_NODE
EOF
```

#### Create Model

Create the directory `/model` on the host of `$EDGE1_NODE`:

```
mkdir -p /model
```

Create the directory `/model` on the host of `$EDGE2_NODE`:

```
mkdir -p /model
```

```
TODO: put pretrained model on nodes.
```

Create the model:

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: "yolo-v5-model"
spec:
  url: "/model/yolo.pb"
  format: "pb"
EOF
```

#### Start Federated Learning Job

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: FederatedLearningJob
metadata:
  name: mistnet-on-mnist-dataset
spec:
  stopCondition:
    operator: "or" # and
    conditions:
      - operator: ">"
        threshold: 100
        metric: rounds
      - operator: ">"
        threshold: 0.95
        metric: targetAccuracy
      - operator: "<"
        threshold: 0.03
        metric: deltaLoss
  aggregationTrigger:
    condition:
      operator: ">"
      threshold: 5
      metric: num_of_ready_clients
  aggregationWorker:
    model:
      name: "yolo-v5-model"
    template:
      spec:
        nodeName: $CLOUD_NODE
        containers:
          - image: kubeedge/sedna-example-federated-learning-mistnet-on-mnist-dataset-aggregation:v0.4.0
            name: agg-worker
            imagePullPolicy: IfNotPresent
            env: # user defined environments
              - name: "cut_layer"
                value: "4"
              - name: "epsilon"
                value: "100"
              - name: "aggregation_algorithm"
                value: "mistnet"
              - name: "batch_size"
                value: "10"
            resources: # user defined resources
              limits:
                memory: 2Gi
  trainingWorkers:
    - dataset:
        name: "coco-dataset-1"
      template:
        spec:
          nodeName: $EDGE1_NODE
          containers:
            - image: kubeedge/sedna-example-federated-learning-mistnet-on-mnist-dataset-train:v0.4.0
              name: train-worker
              imagePullPolicy: IfNotPresent
              env: # user defined environments
                - name: "batch_size"
                  value: "32"
                - name: "learning_rate"
                  value: "0.001"
                - name: "epochs"
                  value: "2"
              resources: # user defined resources
                limits:
                  memory: 2Gi
    - dataset:
        name: "coco-dataset-2"
      template:
        spec:
          nodeName: $EDGE2_NODE
          containers:
            - image: kubeedge/sedna-example-federated-learning-mistnet-on-mnist-dataset-train:v0.4.0
              name: train-worker
              imagePullPolicy: IfNotPresent
              env: # user defined environments
                - name: "batch_size"
                  value: "32"
                - name: "learning_rate"
                  value: "0.001"
                - name: "epochs"
                  value: "2"
              resources: # user defined resources
                limits:
                  memory: 2Gi
EOF
```
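
To make the `stopCondition` semantics in the manifest above concrete, here is a small sketch of how a top-level `"or"` / `"and"` operator could combine the per-metric comparisons. This is an illustration under stated assumptions, not Sedna's controller logic: the function `should_stop`, the treatment of missing metrics as "not satisfied", and the sample metric values are all hypothetical.

```python
import operator

# Comparison operators as they appear in the manifest, plus two obvious extras.
COMPARATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}

# Mirrors the stopCondition block of the manifest.
STOP_CONDITION = {
    "operator": "or",
    "conditions": [
        {"operator": ">", "threshold": 100, "metric": "rounds"},
        {"operator": ">", "threshold": 0.95, "metric": "targetAccuracy"},
        {"operator": "<", "threshold": 0.03, "metric": "deltaLoss"},
    ],
}


def should_stop(stop_condition, metrics):
    """Return True if the combined stop condition holds for the given metrics."""
    results = []
    for cond in stop_condition["conditions"]:
        value = metrics.get(cond["metric"])
        if value is None:
            # Assumption: a metric that has not been reported yet is not satisfied.
            results.append(False)
            continue
        compare = COMPARATORS[cond["operator"]]
        results.append(compare(value, cond["threshold"]))
    combine = any if stop_condition.get("operator", "or") == "or" else all
    return bool(results) and combine(results)


if __name__ == "__main__":
    # Hypothetical metric values, for illustration only.
    metrics = {"rounds": 10, "targetAccuracy": 0.96, "deltaLoss": 0.5}
    print(should_stop(STOP_CONDITION, metrics))
```

With `"or"`, training would stop as soon as any one condition holds; switching the top-level operator to `"and"` would require all three to hold at once.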

```
TODO: show the benefit of MistNet, for example, by comparing the results of FedAvg and MistNet.
```

### Check Federated Learning Status

```
kubectl get federatedlearningjob mistnet-on-mnist-dataset
```

### Check Federated Learning Train Result

After the job completes, you will find the generated model in the directory `/model` on `$EDGE1_NODE` and `$EDGE2_NODE`.
35 changes: 35 additions & 0 deletions examples/federated_learning/yolov5_coco128_mistnet/aggregate.py
@@ -0,0 +1,35 @@
# Copyright 2021 The KubeEdge Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from interface import mistnet, s3_transmitter, simple_chooser
from interface import Dataset, Estimator
from sedna.service.server import AggregationServer


def run_server():
    data = Dataset()
    estimator = Estimator()

    server = AggregationServer(
        data=data,
        estimator=estimator,
        aggregation=mistnet,
        transmitter=s3_transmitter,
        chooser=simple_chooser)

    server.start()


if __name__ == '__main__':
    run_server()