Interactive API Experiment - Pytorch Re-ID on Market #156

Merged: 50 commits, Aug 31, 2021
Changes from 6 commits
Commits
926c217
add re-id experiment with pytorch
katerina-merkulova Aug 18, 2021
9132dc1
add re-id experiment with pytorch
katerina-merkulova Aug 18, 2021
41c77af
Create global variables for CA files/paths (#150)
dmitryagapov Aug 18, 2021
83589b3
fix lint
katerina-merkulova Aug 18, 2021
beaa8a9
fix lint
katerina-merkulova Aug 18, 2021
38bac04
@alexey-gruzdev has signed the CLA from Pull Request #156
github-actions[bot] Aug 19, 2021
471c082
fixes after review
katerina-merkulova Aug 19, 2021
6c4879e
Full traceback call stack on broad exception handling. (#157)
aleksandr-mokrov Aug 19, 2021
e3c2d77
fix interface
katerina-merkulova Aug 19, 2021
77fac58
fix shard decriptor logic
katerina-merkulova Aug 19, 2021
8a97dab
minor fixes
katerina-merkulova Aug 19, 2021
db031df
fix getitem logic
katerina-merkulova Aug 20, 2021
8ccf985
fix lint
katerina-merkulova Aug 20, 2021
8ddbaf1
rm cell outputs
katerina-merkulova Aug 22, 2021
cd4947d
Configure envoy health check period form director (#153)
aleksandr-mokrov Aug 23, 2021
9f52f3a
add shard descriptor's copy
katerina-merkulova Aug 23, 2021
7474ef0
minor changes
katerina-merkulova Aug 23, 2021
594770e
add downloading
katerina-merkulova Aug 23, 2021
82dbdad
add shard_config key to start_envoy
katerina-merkulova Aug 23, 2021
7840401
fix lint
katerina-merkulova Aug 23, 2021
2140515
fix lint
katerina-merkulova Aug 23, 2021
7ca4f9e
add check if dataset folder exists
katerina-merkulova Aug 23, 2021
847d4dc
improve check if dataset folder exists
katerina-merkulova Aug 23, 2021
4f48131
Tests for interactive API (#151)
itrushkin Aug 24, 2021
44f42cc
Unbalanced dataset splits (#125)
itrushkin Aug 25, 2021
0a695d5
Use epochs instead batch_num. Log current epoch number (#95)
maradionov Aug 26, 2021
57c1068
add re-id experiment with pytorch
katerina-merkulova Aug 18, 2021
186ae44
add re-id experiment with pytorch
katerina-merkulova Aug 18, 2021
344b1e7
fix lint
katerina-merkulova Aug 18, 2021
94c648c
fix lint
katerina-merkulova Aug 18, 2021
f2b5a03
fixes after review
katerina-merkulova Aug 19, 2021
e8c1d07
fix interface
katerina-merkulova Aug 19, 2021
6230ebc
fix shard decriptor logic
katerina-merkulova Aug 19, 2021
615dd06
minor fixes
katerina-merkulova Aug 19, 2021
48155f5
fix getitem logic
katerina-merkulova Aug 20, 2021
bfcf440
fix lint
katerina-merkulova Aug 20, 2021
379ba80
rm cell outputs
katerina-merkulova Aug 22, 2021
3bace4e
add shard descriptor's copy
katerina-merkulova Aug 23, 2021
5bc7305
minor changes
katerina-merkulova Aug 23, 2021
d04cbe9
add downloading
katerina-merkulova Aug 23, 2021
045ff68
add shard_config key to start_envoy
katerina-merkulova Aug 23, 2021
af390d9
fix lint
katerina-merkulova Aug 23, 2021
b2224f7
fix lint
katerina-merkulova Aug 23, 2021
8d50b74
add check if dataset folder exists
katerina-merkulova Aug 23, 2021
555ba6f
improve check if dataset folder exists
katerina-merkulova Aug 23, 2021
177cad0
Merge branch 'develop' of https://github.com/katerina-merkulova/openf…
katerina-merkulova Aug 26, 2021
3b48802
fix requirements
katerina-merkulova Aug 30, 2021
1fc763e
remove repeating downloading
katerina-merkulova Aug 31, 2021
f080fba
fix downloading logic
katerina-merkulova Aug 31, 2021
d75e1a9
fix lint
katerina-merkulova Aug 31, 2021
@@ -0,0 +1,4 @@
settings:
  listen_ip: localhost
  sample_shape: ['64', '128', '3']
  target_shape: ['1501']
@@ -0,0 +1,4 @@
#!/bin/bash
set -e

fx director start --disable-tls -c director_config.yaml
@@ -0,0 +1,4 @@
#!/bin/bash
set -e
FQDN=$1
fx director start -c director_config.yaml -rc cert/root_ca.crt -pk cert/"${FQDN}".key -oc cert/"${FQDN}".crt
@@ -0,0 +1,148 @@
# Copyright (C) 2020-2021 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

"""Market shard descriptor."""

import re
from logging import getLogger
from pathlib import Path

import numpy as np
from PIL import Image

from openfl.interface.interactive_api.shard_descriptor import ShardDescriptor


logger = getLogger(__name__)

# Download the dataset beforehand and put it in the project folder.
# URL: https://www.kaggle.com/pengcw1/market-1501

# Search for the 'Market' folder, starting from the parent directory of the project.
DATAPATH = list(Path.cwd().parents[2].rglob('**/Market'))[0]


class MarketShardDescriptor(ShardDescriptor):
    """
    Market1501 Shard descriptor class.

    Reference:
    Zheng et al. Scalable Person Re-identification: A Benchmark. ICCV 2015.
    URL: http://www.liangzheng.org/Project/project_reid.html

    Dataset statistics:
    identities: 1501 (+1 for background)
    images: 12936 (train) + 3368 (query) + 15913 (gallery)
    """

    def __init__(self, rank_worldsize: str = '1,1') -> None:
        """Initialize MarketShardDescriptor."""
        super().__init__()

        # Settings for sharding the dataset
        self.rank, self.worldsize = tuple(int(num) for num in rank_worldsize.split(','))

        self.pattern = re.compile(r'([-\d]+)_c(\d)')
        self.dataset_dir = Path(DATAPATH)
        self.train_dir = self.dataset_dir / 'bounding_box_train'
        self.query_dir = self.dataset_dir / 'query'
        self.gallery_dir = self.dataset_dir / 'bounding_box_test'
        self.imgs_path = list(self.train_dir.glob('*.jpg'))[self.rank - 1::self.worldsize]

        self._check_before_run()

        self.train, self.num_train_pids, self.num_train_imgs = self._process_dir(
            self.train_dir, relabel=True
        )
        self.query, self.num_query_pids, self.num_query_imgs = self._process_dir(
            self.query_dir, relabel=False
        )
        self.gallery, self.num_gallery_pids, self.num_gallery_imgs = self._process_dir(
            self.gallery_dir, relabel=False
        )

        num_total_pids = self.num_train_pids + self.num_query_pids
        num_total_imgs = self.num_train_imgs + self.num_query_imgs + self.num_gallery_imgs

        logger.info(
            '=> Market1501 loaded\n'
            'Dataset statistics:\n'
            ' ------------------------------\n'
            ' subset | # ids | # images\n'
            ' ------------------------------\n'
            f' train | {self.num_train_pids} | {self.num_train_imgs}\n'
            f' query | {self.num_query_pids} | {self.num_query_imgs}\n'
            f' gallery | {self.num_gallery_pids} | {self.num_gallery_imgs}\n'
            '------------------------------\n'
            f'total | {num_total_pids} | {num_total_imgs}\n'
            ' ------------------------------'
        )

    def __len__(self):
        """Length of shard."""
        return len(self.imgs_path)

    def __getitem__(self, index: int):
        """Return an item by the index."""
        img_path = self.imgs_path[index]
        pid, _ = map(int, self.pattern.search(img_path.name).groups())

        img = Image.open(img_path)
        img = np.asarray(img)
        return img, pid

    @property
    def sample_shape(self):
        """Return the sample shape info."""
        return ['64', '128', '3']

    @property
    def target_shape(self):
        """Return the target shape info."""
        return ['1501']

    @property
    def dataset_description(self) -> str:
        """Return the dataset description."""
        return (f'Market dataset, shard number {self.rank} '
                f'out of {self.worldsize}')

    def _check_before_run(self):
        """Check if all files are available before going deeper."""
        if not self.dataset_dir.exists():
            raise RuntimeError(f'{self.dataset_dir} is not available')
        if not self.train_dir.exists():
            raise RuntimeError(f'{self.train_dir} is not available')
        if not self.query_dir.exists():
            raise RuntimeError(f'{self.query_dir} is not available')
        if not self.gallery_dir.exists():
            raise RuntimeError(f'{self.gallery_dir} is not available')

    def _process_dir(self, dir_path, relabel=False, label_start=0):
        """Get data from directory."""
        img_paths = list(dir_path.glob('*.jpg'))[self.rank - 1::self.worldsize]

        pid_container = set()
        for img_path in img_paths:
            pid, _ = map(int, self.pattern.search(img_path.name).groups())
            if pid == -1:
                continue  # junk images are just ignored
            pid_container.add(pid)
        pid2label = {pid: label for label, pid in enumerate(pid_container)}

        dataset = []
        for img_path in img_paths:
            pid, camid = map(int, self.pattern.search(img_path.name).groups())
            if pid == -1:
                continue  # junk images are just ignored
            if label_start == 0:
                assert 0 <= pid <= 1501  # pid == 0 means background
                assert 1 <= camid <= 6
            camid -= 1  # index starts from 0
            if relabel:
                pid = pid2label[pid] + label_start
            dataset.append((img_path, pid, camid))

        num_pids = len(pid_container)
        num_imgs = len(dataset)
        return dataset, num_pids, num_imgs
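To illustrate the file name parsing and relabeling done in `_process_dir` above, here is a minimal sketch that applies the same regular expression to a few file names. The sample names are made up to follow the Market-1501 naming scheme and are an assumption, not files from this PR; `parse_name` and `relabel` are hypothetical helpers that do not exist in the diff.

```python
import re

# Same pattern as in MarketShardDescriptor: person id, then camera id.
PATTERN = re.compile(r'([-\d]+)_c(\d)')


def parse_name(name):
    """Return (pid, camid) parsed from an image file name (hypothetical helper)."""
    pid, camid = map(int, PATTERN.search(name).groups())
    return pid, camid


def relabel(pids):
    """Map raw person ids to contiguous labels, skipping junk ids (-1)."""
    unique = sorted({pid for pid in pids if pid != -1})
    return {pid: label for label, pid in enumerate(unique)}


# Made-up file names that follow the Market-1501 naming scheme (assumption).
names = [
    '0002_c1s1_000451_03.jpg',
    '0007_c3s2_000151_01.jpg',
    '-1_c2s1_000301_02.jpg',  # junk image, pid == -1
]
parsed = [parse_name(name) for name in names]
labels = relabel(pid for pid, _ in parsed)
print(parsed)
print(labels)
```

The sketch sorts the ids before enumerating them so that the mapping is reproducible. The code in the diff enumerates the set directly, so its label order depends on set iteration order.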
@@ -0,0 +1,3 @@
template: market_shard_descriptor.MarketShardDescriptor
params:
  rank_worldsize: 1,2
@@ -0,0 +1,3 @@
template: market_shard_descriptor.MarketShardDescriptor
params:
  rank_worldsize: 2,2
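The two shard configs differ only in `rank_worldsize`. The following minimal sketch shows how such a string could turn into a strided slice, mirroring the `[self.rank - 1::self.worldsize]` expression in the shard descriptor. `shard` is a hypothetical helper, and the ten-item list merely stands in for a list of image paths.

```python
def shard(items, rank_worldsize):
    """Return the items owned by the shard given as 'rank,worldsize' (hypothetical helper)."""
    rank, worldsize = (int(num) for num in rank_worldsize.split(','))
    # rank is 1-based: shard 1 starts at index 0 and takes every worldsize-th item.
    return items[rank - 1::worldsize]


items = list(range(10))  # stand-in for a list of image paths (assumption)
first = shard(items, '1,2')
second = shard(items, '2,2')
print(first)   # [0, 2, 4, 6, 8]
print(second)  # [1, 3, 5, 7, 9]
```

The two shards are disjoint and together cover every item, but only if each envoy lists the files in the same order. `Path.glob` does not document an ordering, so sorting the paths before slicing would be one way to make the split reproducible across machines.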
@@ -0,0 +1,4 @@
#!/bin/bash
set -e

fx envoy start -n env_one --disable-tls -dh localhost -dp 50051
@@ -0,0 +1,6 @@
#!/bin/bash
set -e
ENVOY_NAME=$1
DIRECTOR_FQDN=$2

fx envoy start -n "$ENVOY_NAME" --shard-config-path shard_config.yaml -d "$DIRECTOR_FQDN":50051 -rc cert/root_ca.crt -pk cert/"$ENVOY_NAME".key -oc cert/"$ENVOY_NAME".crt