This repo contains the Amazon Mechanical Turk (AMT) workflow scripts for the paper:
MSeg: A Composite Dataset for Multi-domain Semantic Segmentation (CVPR 2020, Official Repo) [PDF]
John Lambert*,
Zhuang Liu*,
Ozan Sener,
James Hays,
Vladlen Koltun
Presented at CVPR 2020. Link to MSeg Video (3min)
This repo is the fourth of 4 repos that introduce our work. It provides utilities to perform large-scale Mechanical Turk re-labeling.
- `mseg-api`: utilities to download the MSeg dataset, prepare the data on disk in a unified taxonomy, and perform on-the-fly mapping to a unified taxonomy during training.
- `mseg-semantic`: provides HRNet-W48 pre-trained models and training code (sufficient to train a winning entry on the WildDash benchmark).
One additional repo will be introduced in August 2020:
- `mseg-panoptic`: provides Panoptic-FPN and Mask R-CNN training, based on Detectron2.
Install the `mseg` module from `mseg-api`.

- `mseg_mturk` can be installed as a Python package using `pip install -e /path_to_root_directory_of_the_repo/`.

Make sure that you can run `import mseg_mturk` in Python, and you are good to go!
This repository contains the following items:
- `mseg_mturk`: Python module with HIT publishing + evaluation scripts
- `hit_html`: auto-populated HTML to render the HIT UI page
- `image_elements`: auto-populated HTML element for each mask
- `instruction_files`: auto-populated instruction HTML pages for workers
- `template_html`: template HTML code that is used to auto-populate HIT specifications
- `tests`: unit tests
- Total time spent relabeling: 1.34 years of uninterrupted work.
Most time-intensive tasks:
- 106 days (~3.5 months) for COCO "person",
- 87 days (~3 months) for IDD "rider",
- 20 days for COCO "table",
- 19 days for COCO "chair",
- 19 days for COCO "counter".
- ...
We design a careful workflow to ensure a high quality bar for annotations submitted by Mechanical Turk workers.
Our re-labeling workflow proceeds in 9 main stages:
(1) Hand-classify sentinels for each task, and create a BatchResult class with a SentinelHIT specification.
(2) Run `mseg_mturk/publish_tasks.py` to generate the HIT html, HIT csv, and instructions html. Sentinels are embedded into the 100-image HIT csv.
(3) Submit the HIT on Amazon Mechanical Turk (AMT).
(4) Analyze the accuracy of each submitted HIT using `mseg_mturk/eval_result.py`. For each HIT, check each of its 100 images to see whether it is a sentinel; if it is, check its correctness. Compute the mean sentinel accuracy per HIT, and set the status in WorkerHITResult for each HIT to 'Approved' or 'Rejected' based on a 100% accuracy cutoff.
(5) Enter the WorkerHITResult decisions into an 'analyzed' version of the csv. Upload the analyzed csv to MTurk, and re-assign rejected jobs.
(6) Analyze multinomial worker agreement. For those HITs that were approved, make a list of assigned labels per URL. Also record the number of approved observations per image.
(7) Take the mode of the approved labels, and consider this the relabeled category.
(8) Manually review batch quality.
(9) Record the relabeled list for each (dataset, original_classname) tuple.
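The sentinel-based approval and mode-voting logic at the core of this workflow can be sketched as follows. This is a minimal illustration, not the repo's actual implementation: the real code lives in `mseg_mturk/eval_result.py` and uses the BatchResult / WorkerHITResult classes, whereas here a HIT is modeled simply as a dict mapping image URL to the worker's assigned label, and `SENTINEL_ANSWERS` is a hypothetical hand-classified ground-truth table.

```python
from collections import Counter, defaultdict

# Hypothetical hand-classified sentinel ground truth (stage 1).
SENTINEL_ANSWERS = {
    "img_003.jpg": "person",
    "img_047.jpg": "rider",
}

def hit_status(hit_labels: dict) -> str:
    """Approve a HIT only if every embedded sentinel is answered
    correctly, i.e. the 100% accuracy cutoff (stage 4)."""
    results = [
        hit_labels[url] == answer
        for url, answer in SENTINEL_ANSWERS.items()
        if url in hit_labels
    ]
    accuracy = sum(results) / len(results)
    return "Approved" if accuracy == 1.0 else "Rejected"

def relabel_by_mode(approved_hits: list) -> dict:
    """Collect all approved labels per URL, then take the mode as the
    relabeled category (stages 6-7)."""
    votes = defaultdict(list)
    for hit in approved_hits:
        for url, label in hit.items():
            votes[url].append(label)
    return {
        url: Counter(labels).most_common(1)[0][0]
        for url, labels in votes.items()
    }
```

For example, a HIT that misses even one sentinel is rejected outright, and only approved HITs contribute votes to the final relabeling:

```python
hit_a = {"img_003.jpg": "person", "img_047.jpg": "rider"}
hit_b = {"img_003.jpg": "person", "img_047.jpg": "bicyclist"}
hit_status(hit_a)  # 'Approved'
hit_status(hit_b)  # 'Rejected' -- one wrong sentinel fails the HIT
```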
Via Google Drive, we provide access to the class exemplar images we provided to MTurk annotators in their instructions: animals, rug vs. carpet, cabinet, nightstand, desk, chest-of-drawers, wardrobe, curtain vs. shower curtain, mountain vs. hill vs. snow, fence vs. guardrail, and all other shattered classes.
If you find this code useful for your research, please cite:
@InProceedings{MSeg_2020_CVPR,
author = {Lambert, John and Liu, Zhuang and Sener, Ozan and Hays, James and Koltun, Vladlen},
title = {{MSeg}: A Composite Dataset for Multi-domain Semantic Segmentation},
booktitle = {Computer Vision and Pattern Recognition (CVPR)},
year = {2020}
}
Many thanks to Qifeng Chen for his base AMT workflow, which he shared with us. We are also grateful to the Amazon Mechanical Turk workers who completed 1.34 years of uninterrupted annotation to make MSeg happen!