Official implementation of MobiCom 2023 paper "QfaR: Location-Guided Scanning of Visual Codes from Long Distances"

QfaR: Location-Guided Scanning of Visual Codes from Long Distances

[Paper] [Website]

Sizhuo Ma¹, Jian Wang¹, Wenzheng Chen³, Suman Banerjee², Mohit Gupta², Shree Nayar¹

¹Snap Inc., ²University of Wisconsin-Madison, ³University of Toronto

Overview

This is the official implementation of QfaR, a location-guided QR code scanner. The pipeline works as follows. When the user wants to scan a code, the app takes a picture of the current scene. The captured image is sent to a QR code detector, which crops out the part of the image that contains the code. Simultaneously, the user's GPS location is sent to a database, which returns a list of QR codes in the vicinity. The scanned code image is then matched against this list using the intensity-based matching described below. In this repo, we generate random QR codes as the candidate codes.
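The location lookup step can be illustrated with a minimal sketch. It is not taken from this repository: the database layout (`code_db`), the function names, the coordinates, and the 100 m default radius are all assumptions made for illustration. It filters a small in-memory table of registered codes by great-circle distance to the user.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def codes_in_vicinity(user_lat, user_lon, code_db, radius_m=100.0):
    """Return the IDs of all codes registered within radius_m of the user."""
    return [
        code_id
        for code_id, (lat, lon) in code_db.items()
        if haversine_m(user_lat, user_lon, lat, lon) <= radius_m
    ]


# Hypothetical database: code ID -> (latitude, longitude) of the printed code.
code_db = {
    "cafe_menu": (43.0731, -89.4012),
    "bus_stop": (43.0735, -89.4010),
    "museum": (43.0950, -89.3500),
}

# Only codes near the user survive; this is the pruned candidate list.
candidates = codes_in_vicinity(43.0732, -89.4011, code_db)
```

A real deployment would likely query a spatial index rather than scan every entry, but the pruning idea is the same: the smaller the candidate list, the easier the matching step becomes.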

[Figure: QfaR workflow]

Code Detection

We first use a YOLO network to detect a bounding box around the QR code in the scene. We then use a key point detection network to predict a heat map of potential key points (3 corners). The fourth corner is predicted by assuming the four key points form a parallelogram (weak-perspective projection). A homography is computed to transform the parallelogram into a square (rectification), so that codes captured at different viewing angles can be matched directly. Both networks are trained on simulated data generated with physics-based imaging models.
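The parallelogram assumption and the rectification can be sketched in a few lines of plain Python. This is an illustrative sketch, not the repository's aligner: the function names and the corner ordering (top-left, top-right, bottom-left) are assumptions. Because a parallelogram maps to a square by an affine transform, which is a special case of a homography, the sketch solves a 2x2 linear system instead of estimating a full 3x3 homography.

```python
def fourth_corner(tl, tr, bl):
    """Predict the bottom-right corner, assuming the four corners form a parallelogram."""
    return (tr[0] + bl[0] - tl[0], tr[1] + bl[1] - tl[1])


def rectify_point(p, tl, tr, bl, side=1.0):
    """Map image point p into a side x side square spanned by the three detected corners.

    Solves p - tl = u * (tr - tl) + v * (bl - tl) for (u, v) and scales by side.
    """
    ax, ay = tr[0] - tl[0], tr[1] - tl[1]  # top edge of the parallelogram
    bx, by = bl[0] - tl[0], bl[1] - tl[1]  # left edge of the parallelogram
    det = ax * by - ay * bx
    if abs(det) < 1e-12:
        raise ValueError("corners are collinear; cannot rectify")
    dx, dy = p[0] - tl[0], p[1] - tl[1]
    u = (dx * by - dy * bx) / det
    v = (ax * dy - ay * dx) / det
    return (u * side, v * side)


# Hypothetical detected key points, in pixel coordinates.
tl, tr, bl = (10.0, 20.0), (50.0, 30.0), (5.0, 60.0)
br = fourth_corner(tl, tr, bl)  # (45.0, 70.0)

# The predicted fourth corner lands on the far corner of the rectified square.
rectified_br = rectify_point(br, tl, tr, bl, side=100.0)
```

To rectify a whole image, the same mapping would be applied in the inverse direction for every output pixel, which is what a library warp routine does internally.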

Intensity-Based Matching

How do we find the correct code in the pruned list? Conventional decoders threshold the captured, degraded code image to obtain a binary code, which contains many error bits and therefore cannot be decoded. Although the captured code image is heavily degraded, it still contains visual features such as blobs of black/white bits. Therefore, we propose to treat the captured codes as "images" and match them based on their intensities. Specifically, we find the candidate code $D_m$ in the pruned list $S$ with the shortest L2 distance to the scanned code $I_m$ (template matching):

$$D_m = \arg\min_{D \in S} \, d_{L_2}(I_m, D)$$

Please refer to the paper for more insights and the reasoning behind this design.
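The matching rule can be illustrated with a toy example. This is a minimal sketch using plain Python lists, not the repository's decoder: the function names and the tiny 2x2 "codes" are assumptions made for illustration. The scanned image is blurry enough that thresholding at 0.5 flips one bit, yet the nearest candidate in L2 distance is still the correct one.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two flattened images of equal size."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_code(scanned, candidates):
    """Return (index, distance) of the candidate closest to the scanned image."""
    distances = [l2_distance(scanned, c) for c in candidates]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]


# Toy candidate list: flattened 2x2 codes, 0 = black, 1 = white.
candidates = [
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

# Hypothetical degraded capture of candidate 0 (low contrast, blurred).
scanned = [0.45, 0.55, 0.48, 0.40]

# Thresholding yields [0, 1, 0, 0], which has one wrong bit relative to candidate 0.
thresholded = [1 if v > 0.5 else 0 for v in scanned]

# Intensity-based matching still selects candidate 0.
best_index, best_distance = match_code(scanned, candidates)
```

In practice the rectified code image and the candidates would be arrays of the same resolution, and the distance would be computed with a vectorized library rather than Python loops.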

Quick Start

Clone the repository

git clone https://github.com/snap-research/qfar.git
cd qfar

Create conda environment

conda env create -f environment.yml

Note that this environment contains only CPU-only PyTorch libraries. Please modify environment.yml accordingly if you need GPU inference.

Tested on:

  • python=3.8
  • pytorch=2.0.0
  • opencv=4.8

Download pretrained aligner/detector models

Download the pretrained models from https://www.dropbox.com/scl/fo/3jfd836ax7evte48d1tl6/h?rlkey=q2g4by9zxddzgfgq9rasiyhnu&dl=1 and place the checkpoints under ./aligner and ./detector.

Run QfaR

conda activate qfar
python example_pipeline.py

This runs the pipeline described above (detector, aligner, decoder) on the test images in the data/ folder. Please take a look at the code for detailed usage.

Output is written to results.txt. Below is an example:

IMG ID: 2
Matched code: 0
Matched ratio: 0.448256
Time for processing image 2: 0.015902
  • IMG ID: ID of the test image
  • Matched code: Index of the matched code. In the test examples, we always assume the ground truth code has an index of 0.
  • Matched ratio: Confidence value of this match

Contact

Sizhuo Ma (sizhuoma@gmail.com)

Reference

@inproceedings{ma2023qfar,
  author    = {Ma, Sizhuo and Wang, Jian and Chen, Wenzheng and Banerjee, Suman and Gupta, Mohit and Nayar, Shree},
  title     = {QfaR: Location-Guided Scanning of Visual Codes from Long Distances},
  year      = {2023},
  booktitle = {Proceedings of the 29th Annual International Conference on Mobile Computing and Networking},
  articleno = {4},
  numpages  = {14},
  series    = {MobiCom '23}
}
