sample 4 correspondences
sample 8 correspondences
sample 20 correspondences
all correspondences
sample 4 correspondences
sample 8 correspondences (outliers removed)
sample 20 correspondences (outliers removed)
outliers before sampling 20 correspondences (matches 6 and 15)
all correspondences
| | k = 4 | k = 8 | k = 20 |
|---|---|---|---|
| DLT | 138.817 | 1.502 | 0.288 |
| Normalized DLT | 138.817 | 1.437 | 0.280 |

| | k = 4 | k = 8 | k = 20 |
|---|---|---|---|
| DLT | 344.971 | 12.681 | 5.713 |
| Normalized DLT | 344.971 | 29.617 | 3.914 |
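For reference, here is a minimal sketch of the DLT estimation behind the numbers above, assuming `src` and `dst` are k×2 NumPy arrays of corresponding points (the function names are illustrative, not necessarily those used in 1.py):

```python
import numpy as np

def dlt_homography(src, dst):
    """Estimate H (dst ~ H @ src) from k >= 4 point pairs via DLT."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two rows of the 2k x 9 system A h = 0.
        A.append([-x, -y, -1,  0,  0,  0, u * x, u * y, u])
        A.append([ 0,  0,  0, -x, -y, -1, v * x, v * y, v])
    # h is the right singular vector of A with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 3)

def reprojection_error(H, src, dst):
    """Mean Euclidean distance between the projected src points and dst."""
    ones = np.ones((len(src), 1))
    proj = np.hstack([src, ones]) @ H.T
    proj = proj[:, :2] / proj[:, 2:3]  # dehomogenize
    return np.linalg.norm(proj - dst, axis=1).mean()
```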
RANSAC: an algorithm that fits a model to the inliers while ignoring outliers
d is the threshold used to decide whether a point fits the model well enough to count as an inlier
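A minimal sketch of such a RANSAC loop, reusing the hypothetical `dlt_homography` helper from the sketch above; the iteration count is illustrative, and the final refit on all inliers is one common variant, not necessarily what 1.py does:

```python
import numpy as np

def ransac_homography(src, dst, d=1.0, iters=2000, sample=4):
    """Repeatedly fit H to a random minimal sample and keep the
    hypothesis with the most points whose reprojection error < d."""
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = np.random.choice(len(src), sample, replace=False)
        H = dlt_homography(src[idx], dst[idx])
        # Per-point reprojection error under this hypothesis.
        ones = np.ones((len(src), 1))
        proj = np.hstack([src, ones]) @ H.T
        err = np.linalg.norm(proj[:, :2] / proj[:, 2:3] - dst, axis=1)
        inliers = err < d
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers of the best hypothesis.
    return dlt_homography(src[best_inliers], dst[best_inliers]), best_inliers
```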
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| error | 0.522 | 0.543 | 0.623 | 0.447 | 0.627 | 0.520 | 0.668 | 0.416 | 0.784 | 0.541 | 0.569 |

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| error | 0.220 | 0.164 | 0.438 | 0.382 | 0.492 | 0.342 | 0.320 | 0.571 | 0.314 | 0.252 | 0.350 |

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| error | 0.939 | 1.148 | 2.247 | 0.930 | 2.023 | 1.566 | 1.494 | 0.688 | 2.263 | 1.948 | 1.525 |

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| error | 0.869 | 0.886 | 2.163 | 1.444 | 1.817 | 1.218 | 0.907 | 1.418 | 1.244 | 2.353 | 1.432 |
LoFTR: Detector-Free Local Feature Matching with Transformers (CVPR 2021)
LoFTR is a detector-free model: it removes the feature detection phase and directly produces dense feature matches.
- Use a CNN with an FPN to extract multi-level features from both images, yielding coarse-level and fine-level features.
- The coarse-level features are passed through the LoFTR module to extract position- and context-dependent local features. The LoFTR module interleaves several self-attention and cross-attention layers.
- After obtaining the two transformed feature maps from the LoFTR module, coarse matching is performed to establish matches between them.
- The fine-level features and the matches from the previous step are passed through a coarse-to-fine module, which also uses a transformer, to refine the matches (see the sketch after this list).
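As a sketch of how such matches can be obtained in practice, the kornia library ships a pretrained LoFTR; the snippet below assumes kornia and PyTorch are installed and may differ from how 1.py actually loads the model:

```python
import cv2
import torch
import kornia.feature as KF

# LoFTR expects grayscale tensors in [0, 1] with shape (B, 1, H, W).
def load_gray(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return torch.from_numpy(img)[None, None].float() / 255.0

matcher = KF.LoFTR(pretrained="outdoor")  # or "indoor"
with torch.no_grad():
    out = matcher({"image0": load_gray("./images/1-0.png"),
                   "image1": load_gray("./images/1-1.png")})
# Dense matches: pixel coordinates in each image plus a confidence score.
kpts0, kpts1, conf = out["keypoints0"], out["keypoints1"], out["confidence"]
```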
sample 4 correspondences
sample 8 correspondences
sample 20 correspondences
all correspondences
sample 4 correspondences
sample 8 correspondences
sample 20 correspondences
all correspondences
| | k = 4 | k = 8 | k = 20 |
|---|---|---|---|
| DLT | 8.252 | 0.915 | 0.517 |
| Normalized DLT | 8.252 | 0.917 | 0.520 |

| | k = 4 | k = 8 | k = 20 |
|---|---|---|---|
| DLT | 32.104 | 35.475 | 1.037 |
| Normalized DLT | 32.104 | 23.209 | 0.437 |
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| error | 0.931 | 0.920 | 0.806 | 1.157 | 0.997 | 0.442 | 0.652 | 0.762 | 0.543 | 0.622 | 0.783 |

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| error | 0.399 | 0.335 | 0.387 | 0.465 | 0.415 | 0.467 | 0.430 | 0.361 | 0.371 | 0.364 | 0.399 |

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| error | 1.692 | 0.709 | 3.216 | 2.014 | 1.546 | 2.157 | 0.977 | 3.911 | 1.261 | 2.823 | 2.031 |

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| error | 1.173 | 1.303 | 0.727 | 1.467 | 1.132 | 0.980 | 1.191 | 0.751 | 1.190 | 0.624 | 1.05 |
Comparing normalized and unnormalized DLT shows that, as the number of correspondence pairs grows, the normalization step usually yields a lower reprojection error.
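The normalization referred to here is Hartley's preconditioning: translate each point set to its centroid, scale it so the mean distance from the origin is √2, run DLT on the normalized points, and undo the transforms afterwards. A minimal sketch, with helper names of my own choosing:

```python
import numpy as np

def normalize_points(pts):
    """Similarity transform T such that T @ pts has zero centroid and
    mean distance sqrt(2) from the origin (Hartley normalization)."""
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - centroid, axis=1).mean()
    return np.array([[scale, 0, -scale * centroid[0]],
                     [0, scale, -scale * centroid[1]],
                     [0, 0, 1]])

def normalized_dlt(src, dst):
    T1, T2 = normalize_points(src), normalize_points(dst)
    ones = np.ones((len(src), 1))
    src_n = (np.hstack([src, ones]) @ T1.T)[:, :2]
    dst_n = (np.hstack([dst, ones]) @ T2.T)[:, :2]
    Hn = dlt_homography(src_n, dst_n)   # from the earlier sketch
    return np.linalg.inv(T2) @ Hn @ T1  # denormalize
```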
The results show that the 4 pairs of matches selected by RANSAC bring a large improvement (e.g., from 344.971 to 1.525 on image pair 1-0.png 1-2.png with threshold d=1), demonstrating that RANSAC can effectively remove outliers that would otherwise distort the result.
For image pair 1-0.png 1-2.png, without RANSAC we had to inspect the connections and remove outliers manually; with RANSAC this manual work is no longer necessary.
The results also show that the 4 pairs of matches obtained from the deep learning model bring a large improvement (e.g., from 344.971 to 32.104 on image pair 1-0.png 1-2.png), indicating that LoFTR works better than applying the ratio test after SIFT.
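For comparison, the SIFT + ratio test baseline mentioned above can be built with OpenCV roughly as follows; the 0.75 threshold is Lowe's common choice, not necessarily the value used in 1.py:

```python
import cv2
import numpy as np

img1 = cv2.imread("./images/1-0.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("./images/1-2.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test: keep a match only if its best distance is
# clearly smaller than the second-best distance.
matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
src = np.float32([kp1[m.queryIdx].pt for m in good])
dst = np.float32([kp2[m.trainIdx].pt for m in good])
```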
With threshold d=2, deep learning + RANSAC performs worse than RANSAC alone, but with d=1 it is nearly as good or better. My interpretation is that the matches from the deep learning model are already so accurate that a stricter threshold is needed to pick out the truly good inliers among them.
Experiment Environment
OS: Windows 10
GPU: NVIDIA GeForce GTX 1050
conda create --name homography python=3.8
conda activate homography
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch-lts
pip install -r requirements.txt
There are three outputs: allpairs.png (in the result folder), pairsforhomography.png (in the result folder), and the reprojection errors of DLT and normalized DLT. allpairs.png shows all correspondences; pairsforhomography.png shows the correspondences used to compute the homography. The reprojection errors are printed to the terminal.
python 1.py --img1 [path to image1] --img2 [path to image2] --correspondence [path to correspondence file]
python 1.py --img1 ./images/1-0.png --img2 ./images/1-1.png --correspondence ./groundtruth_correspondences/correspondence_01.npy
python 1.py --img1 [path to image1] --img2 [path to image2] --correspondence [path to correspondence file] --pair [number of pairs]
python 1.py --img1 ./images/1-0.png --img2 ./images/1-1.png --correspondence ./groundtruth_correspondences/correspondence_01.npy --pair 8
python 1.py --img1 [path to image1] --img2 [path to image2] --correspondence [path to correspondence file] --pair [number of pairs] --outlier [indices of outlier points]
python 1.py --img1 ./images/1-0.png --img2 ./images/1-2.png --correspondence ./groundtruth_correspondences/correspondence_02.npy --pair 20 --outlier 6 15
python 1.py --img1 [path to image1] --img2 [path to image2] --correspondence [path to correspondence file] --ransac
Example: use 4 pairs of points selected by RANSAC after the ratio test; the result may differ between executions
python 1.py --img1 ./images/1-0.png --img2 ./images/1-1.png --correspondence ./groundtruth_correspondences/correspondence_01.npy --ransac
python 1.py --img1 [path to image1] --img2 [path to image2] --correspondence [path to correspondence file] --dl
python 1.py --img1 ./images/1-0.png --img2 ./images/1-1.png --correspondence ./groundtruth_correspondences/correspondence_01.npy --dl
python 1.py --img1 [path to image1] --img2 [path to image2] --correspondence [path to correspondence file] --pair [number of pairs] --dl
python 1.py --img1 ./images/1-0.png --img2 ./images/1-1.png --correspondence ./groundtruth_correspondences/correspondence_01.npy --pair 8 --dl
python 1.py --img1 [path to image1] --img2 [path to image2] --correspondence [path to correspondence file] --dl --ransac
Example: use 4 pairs of points selected by RANSAC after obtaining matches from the deep learning model; the result may differ between executions
python 1.py --img1 ./images/1-0.png --img2 ./images/1-1.png --correspondence ./groundtruth_correspondences/correspondence_01.npy --dl --ransac
The order of choosing corners is top-left, bottom-left, bottom-right, top-right.
- The user chooses the four corners.
- Calculate the homography between the corner points obtained in the previous step and the corners of the output image (640x480).
- Do backward warping: for every pixel of the output image, use the homography to find the corresponding point in the input image, then use bilinear interpolation to get the pixel value (see the sketch below).
- Store the image as book.png.
Because we traverse every pixel of the output image and each pixel takes constant time, the time complexity is O(width_of_output × height_of_output).
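A minimal sketch of the backward warping with bilinear interpolation described above, assuming `H` maps output pixel coordinates to the selected corners' region in the input image (variable names are illustrative, not necessarily those in 2.py):

```python
import numpy as np

def backward_warp(img, H, out_w=640, out_h=480):
    """For each output pixel, map it into the input image through H
    and sample with bilinear interpolation: O(out_w * out_h)."""
    out = np.zeros((out_h, out_w, img.shape[2]), dtype=img.dtype)
    for y in range(out_h):
        for x in range(out_w):
            sx, sy, sw = H @ np.array([x, y, 1.0])
            sx, sy = sx / sw, sy / sw  # dehomogenize
            x0, y0 = int(sx), int(sy)
            if 0 <= x0 < img.shape[1] - 1 and 0 <= y0 < img.shape[0] - 1:
                a, b = sx - x0, sy - y0
                # Bilinear blend of the four neighbouring input pixels.
                out[y, x] = ((1 - a) * (1 - b) * img[y0, x0]
                             + a * (1 - b) * img[y0, x0 + 1]
                             + (1 - a) * b * img[y0 + 1, x0]
                             + a * b * img[y0 + 1, x0 + 1])
    return out

# Usage sketch: warped = backward_warp(cv2.imread("./images/book.png"), H)
```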
- Running the code
python 2.py --img ./images/book.png
- Choose the corners (top-left -> bottom-left -> bottom-right -> top-right) and then press Esc
- The result will be shown and stored in the result folder as book.png
LoFTR: Detector-Free Local Feature Matching with Transformers (CVPR 2021)