SPARE3D: A Dataset for SPAtial REasoning on Three-View Line Drawings

Wenyu Han*, Siyuan Xiang*, Chenhui Liu, Ruoyu Wang, Chen Feng

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020

New York University Tandon School of Engineering

Abstract

Spatial reasoning is an important component of human intelligence. We can imagine the shapes of 3D objects and reason about their spatial relations by merely looking at their three-view line drawings in 2D, with different levels of competence. Can deep networks be trained to perform spatial reasoning tasks? How can we measure their "spatial intelligence"? To answer these questions, we present the SPARE3D dataset. Based on cognitive science and psychometrics, SPARE3D contains three types of 2D-3D reasoning tasks on view consistency, camera pose, and shape generation, with increasing difficulty. We then design a method to automatically generate a large number of challenging questions with ground truth answers for each task. They are used to provide supervision for training our baseline models using state-of-the-art architectures like ResNet. Our experiments show that although convolutional networks have achieved superhuman performance in many visual learning tasks, their spatial reasoning performance on SPARE3D tasks is either lower than average human performance or even close to random guesses. We hope SPARE3D can stimulate new problem formulations and network designs for spatial reasoning to empower intelligent robots to operate effectively in the 3D world via 2D sensors.

Dataset

You can download the dataset via our Google Drive link. The Google Drive folder contains three files:

  1. Task_data.zip contains the data for training the baseline models;
  2. CSG_model_step.zip contains 11149 CSG models;
  3. Total_view_data contains view drawings of all ABC and CSG models from the 11 poses we define in the paper.

Changes after CVPR'20

In our follow-up work led by Siyuan Xiang, Anbang Yang, and Yanfei Xue after CVPR'20, we found outliers in our previous dataset. We removed the outliers and modified the dataset and the paper accordingly (changes highlighted in blue), although the main conclusions are unchanged. We added the same number of questions as were removed due to outliers, so the total number of questions in the dataset remains the same as in the original paper.

The changes we made to the dataset are detailed as follows:

  • For the 3-view to Isometric task, we removed:

    • 6.2% of questions, whose line drawings contain extremely thick or thin lines
    • 6.8% of questions, which have the same line drawing in different candidate answers
    • 1% of questions, which are generated from the same CAD models
  • For the Isometric to Pose task, we removed:

    • 25% of questions, which are generated from symmetric CAD models and therefore cause pose ambiguity
    • 2.5% of questions, which contain blank line drawings
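
Two of the filters listed above (blank line drawings and identical drawings among candidate answers) could be sketched as follows. This is only an illustrative sketch, not the cleaning code we used: it assumes each drawing has already been rasterized to a grayscale image stored as a nested list of 0-255 intensities with a white (255) background, and all function names are hypothetical.

```python
import hashlib

WHITE = 255  # assumed background intensity of a rasterized line drawing


def is_blank(image):
    """Return True if every pixel equals the assumed white background."""
    return all(pixel == WHITE for row in image for pixel in row)


def image_digest(image):
    """Hash pixel values and row lengths so identical drawings compare equal."""
    h = hashlib.sha256()
    for row in image:
        h.update(len(row).to_bytes(4, "big"))
        h.update(bytes(row))
    return h.hexdigest()


def has_duplicate_candidates(candidates):
    """Return True if two candidate answers are pixel-for-pixel identical."""
    digests = [image_digest(img) for img in candidates]
    return len(set(digests)) < len(digests)


def keep_question(drawings, candidates):
    """Keep a question only if nothing is blank and all candidates differ."""
    if any(is_blank(img) for img in drawings + candidates):
        return False
    return not has_duplicate_candidates(candidates)
```

Exact hashing only catches pixel-identical duplicates; drawings that differ by anti-aliasing or line width would need a tolerance-based comparison instead.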

Please feel free to report bugs or other problems to us by raising new issues in this project's GitHub repository.

Code (GitHub) & Dependencies

You can find all baseline models in the Code folder. All the baseline models were written for Python 3.7.4 and PyTorch 1.3.0 with a CUDA-enabled GPU. The data generation code is in the Data_generate_script folder. The code depends on the following Python packages: Bagnet, Pythonocc, cairosvg, and cv2.
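
Before running anything, it may help to check that the packages above can be imported. The snippet below is a small sketch for that purpose. The import names in `ASSUMED_MODULES` are assumptions (the list above gives package names, not module names), so adjust them to match your installation.

```python
import importlib.util

# Package name -> import name. The import names are assumptions, not taken
# from this repository; change them if your installation differs.
ASSUMED_MODULES = {
    "PyTorch": "torch",
    "Bagnet": "bagnets",
    "Pythonocc": "OCC",
    "cairosvg": "cairosvg",
    "cv2": "cv2",
}


def check_dependencies(modules=ASSUMED_MODULES):
    """Return {package name: True/False} depending on whether it is importable."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in modules.items()
    }


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

This only checks that a module can be located; it does not verify versions or that CUDA is available.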

The code is copyrighted by the authors. Permission to copy and use 
 this software for noncommercial use is hereby granted provided: (1)
 this notice is retained in all copies, (2) the publication describing
 the method (indicated below) is clearly cited, and (3) the
 distribution from which the code was obtained is clearly cited. For
 all other uses, please contact the authors.
 
 The software code is provided "as is" with ABSOLUTELY NO WARRANTY
 expressed or implied. Use at your own risk.

This code provides an implementation of the method described in the
following publication: 

Wenyu Han, Siyuan Xiang, Chenhui Liu, Ruoyu Wang, and Chen Feng, 
"SPARE3D: A Dataset for SPAtial REasoning on Three-View Line Drawings," 
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June, 2020.

Data generation

You can directly download the dataset we generated for each task through the Google Drive link. You can also generate more data using the code we provide in Data_generate_script. Commands to create the data:

python P2I.py -pathread "a folder of STEP files" -pathwrite "an output folder"
python Three2I.py -pathread "a folder of STEP files" -pathwrite "an output folder"
python I2P.py -pathread "a folder of STEP files" -pathwrite "an output folder"
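
If you want to run all three generation scripts in one go, a small wrapper along these lines may be convenient. It is a sketch under a few assumptions: it is run from inside Data_generate_script, each task writes to its own subfolder of a common output root (a layout chosen here for illustration), and the function names are hypothetical. Only the script names and the `-pathread` / `-pathwrite` flags come from the commands above.

```python
import subprocess
import sys

# Script names as listed in the commands above.
SCRIPTS = ["P2I.py", "Three2I.py", "I2P.py"]


def build_commands(step_dir, out_root):
    """Build one command per script, each with its own output subfolder."""
    commands = []
    for script in SCRIPTS:
        task_name = script[:-3]  # strip ".py"
        out_dir = f"{out_root}/{task_name}"  # assumed layout, not required by the scripts
        commands.append(
            [sys.executable, script, "-pathread", step_dir, "-pathwrite", out_dir]
        )
    return commands


def run_all(step_dir, out_root):
    """Run the scripts one after another, stopping at the first failure."""
    for cmd in build_commands(step_dir, out_root):
        subprocess.run(cmd, check=True)
```

Whether the scripts create missing output folders themselves is not covered here, so you may need to create them first.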

These commands generate data in SVG format. We also provide a simple script to convert SVG files to PNG format if needed. (Note: this script deletes the SVG files after converting them. If you need the original SVG files, make a copy before running it.)

python svg2png.py -f "a folder of SVG files"
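
Since the conversion script deletes the SVG files, you may want to back them up first. The helper below is one possible way to do that. It is a sketch, not part of the repository: the function name is hypothetical, and it assumes the SVG files sit directly inside the folder (no subfolders).

```python
import shutil
from pathlib import Path


def backup_svgs(svg_dir, backup_dir):
    """Copy every *.svg directly inside svg_dir to backup_dir; return the copied names."""
    svg_dir, backup_dir = Path(svg_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for svg in sorted(svg_dir.glob("*.svg")):
        shutil.copy2(svg, backup_dir / svg.name)  # copy2 also preserves timestamps
        copied.append(svg.name)
    return copied
```

For example, calling `backup_svgs("my_svgs", "my_svgs_backup")` before running svg2png.py on `my_svgs` keeps a copy of the originals (both folder names are placeholders).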

Train

You can train our baseline models with a command like the following:

python I2P.py --Training_dataroot "path to training dataset" --Validating_dataroot "path to validating dataset" --outf "folder to output log"

You can use similar commands to train the other baseline models in the Code folder.

To cite our paper:

@inproceedings{SPARE3D_CVPR_2020,
  title={SPARE3D: A Dataset for SPAtial REasoning on Three-View Line Drawings},
  author={Han, Wenyu and Xiang, Siyuan and Liu, Chenhui and Wang, Ruoyu and Feng, Chen},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={14690--14699},
  year={2020}
}

Acknowledgment

Wenyu Han and Siyuan Xiang contributed equally to the coding, data preprocessing/generation, paper writing, and experiments in this project. Chenhui Liu contributed to the crowd-sourcing website and human performance data collection. Ruoyu Wang contributed to the experiments and paper writing. Chen Feng proposed the idea, initiated the project, and contributed to the coding and paper writing.

The research is supported by the NSF CPS program under CMMI-1932187. Siyuan Xiang gratefully thanks the IDC Foundation for its scholarship. The authors gratefully thank our human test participants, as well as Zhaorong Wang, Zhiding Yu, Srikumar Ramalingam, and the anonymous reviewers for their helpful comments.
