Breakdown: Phototourism | Stereo | Sequences

Breakdown of results on the Phototourism dataset, multi-view stereo task, per sequence.



Stereo — All sequences — Sorted by mAP15o
Method BM FCS LMS LB MC MR PSM RS SF SPC USC AVG Date Type By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
0.0558 0.0133 0.0177 0.0575 0.0207 0.0112 0.0210 0.0329 0.0139 0.0102 0.0628 0.0288 19-04-24 F Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
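Most entries in this table use plain brute-force nearest-neighbour matching ("match:nn"). As a minimal sketch of that protocol (the function name and toy descriptor arrays are illustrative, not from any submission), each descriptor in image A is simply assigned its closest descriptor in image B:

```python
import numpy as np

def match_nn(desc_a, desc_b):
    """Brute-force nearest-neighbour matching: for every descriptor in
    image A, return the index of its closest descriptor in image B
    (Euclidean distance), as in the 'match:nn' entries."""
    # Pairwise squared Euclidean distances, shape (N_a, N_b).
    d2 = (
        (desc_a ** 2).sum(1, keepdims=True)
        - 2.0 * desc_a @ desc_b.T
        + (desc_b ** 2).sum(1)
    )
    nn = d2.argmin(axis=1)
    # One (index_a, index_b) pair per descriptor in A.
    return np.stack([np.arange(len(desc_a)), nn], axis=1)
```

Note that each descriptor in A is matched independently, so several keypoints in A may map to the same keypoint in B; the "1:1" (cross-consistent) variants listed further down remove such collisions.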
SphereDesc
kp:8000, match:nn
0.0755 0.0333 0.0449 0.0508 0.0472 0.0189 0.0217 0.0491 0.0427 0.0253 0.0338 0.0403 19-04-26 F Anonymous We use OpenCV's implementation of AKAZE detector, and for each keypoint we extract a descriptor via CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
0.0681 0.0207 0.0123 0.0602 0.0289 0.0156 0.0210 0.0415 0.0299 0.0196 0.0514 0.0336 19-05-14 F Anonymous We use OpenCV's implementation of the BRISK detector with the default settings, with at most 8K keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
0.0779 0.0335 0.0867 0.0616 0.0472 0.0223 0.0317 0.0420 0.0432 0.0244 0.0366 0.0461 19-05-07 F Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
0.0798 0.0376 0.0921 0.0665 0.0482 0.0257 0.0307 0.0434 0.0398 0.0309 0.0439 0.0490 19-05-07 F Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
0.0794 0.0331 0.0832 0.0547 0.0455 0.0204 0.0305 0.0443 0.0409 0.0215 0.0408 0.0449 19-06-01 F Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
0.0798 0.0358 0.0892 0.0735 0.0486 0.0234 0.0263 0.0491 0.0421 0.0261 0.0449 0.0490 19-06-05 F Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
0.0732 0.0184 0.0411 0.0366 0.0295 0.0124 0.0288 0.0324 0.0151 0.0173 0.0222 0.0297 19-05-05 F Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
0.0765 0.0247 0.0367 0.0508 0.0364 0.0183 0.0236 0.0277 0.0199 0.0167 0.0252 0.0324 19-05-05 F Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
0.0626 0.0094 0.0215 0.0178 0.0238 0.0040 0.0236 0.0277 0.0056 0.0134 0.0298 0.0217 19-05-05 F Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
0.0685 0.0148 0.0300 0.0282 0.0236 0.0099 0.0283 0.0377 0.0116 0.0173 0.0229 0.0266 19-05-05 F Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
0.0554 0.0092 0.0156 0.0432 0.0185 0.0116 0.0227 0.0367 0.0091 0.0050 0.0262 0.0230 19-05-07 F Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are obtained by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
0.0632 0.0171 0.0257 0.0540 0.0213 0.0128 0.0227 0.0372 0.0095 0.0098 0.0280 0.0274 19-05-09 F Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are obtained by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
0.0569 0.0178 0.0297 0.0571 0.0236 0.0107 0.0263 0.0320 0.0207 0.0092 0.0310 0.0286 19-04-26 F Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
0.0708 0.0202 0.0264 0.0310 0.0293 0.0185 0.0239 0.0448 0.0263 0.0144 0.0386 0.0313 19-05-19 F Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
0.0753 0.0245 0.0255 0.0303 0.0323 0.0126 0.0197 0.0448 0.0295 0.0148 0.0411 0.0319 19-05-19 F Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
0.0794 0.0265 0.0537 0.0669 0.0429 0.0213 0.0217 0.0358 0.0346 0.0180 0.0509 0.0411 19-05-23 F Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (as in RootSIFT) sGLOH2 descriptors with the sGOr2f* matching strategy. Distance-table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed with the greedy nearest neighbour, as in the WISW@CAIP2019 contest. Keypoints and matches are ordered by rank (best first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
0.0767 0.0281 0.0608 0.0286 0.0366 0.0164 0.0217 0.0548 0.0407 0.0196 0.0338 0.0380 19-05-29 F Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, without view synthesis or matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection performed on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
0.0759 0.0279 0.0798 0.0404 0.0415 0.0198 0.0227 0.0515 0.0384 0.0219 0.0381 0.0416 19-05-30 F Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, without view synthesis or matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection performed on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
0.0501 0.0085 0.0072 0.0233 0.0110 0.0074 0.0158 0.0324 0.0098 0.0079 0.0343 0.0189 19-04-24 F Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
0.0804 0.0252 0.0577 0.0637 0.0500 0.0192 0.0229 0.0453 0.0413 0.0165 0.0393 0.0420 19-06-25 F Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
0.0812 0.0295 0.0577 0.0707 0.0502 0.0211 0.0273 0.0429 0.0371 0.0203 0.0378 0.0432 19-06-24 F Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
0.0828 0.0288 0.0586 0.0790 0.0463 0.0202 0.0246 0.0529 0.0394 0.0180 0.0416 0.0448 19-06-20 F Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
0.0640 0.0126 0.0098 0.0286 0.0201 0.0103 0.0146 0.0291 0.0158 0.0130 0.0305 0.0226 19-05-10 F Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
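The Hamming-distance matching described in this entry can be sketched with bit-packed NumPy arrays. This is an illustrative sketch only (the helper name and toy data are not from the submission; the 6272-bit size and 4000 threshold come from the entry's description):

```python
import numpy as np

def hamming_match(bits_a, bits_b, threshold=4000):
    """Match binary descriptors by Hamming distance, keeping a pair only
    when the nearest neighbour's distance is below the decision threshold.
    bits_a, bits_b: uint8 arrays of shape (N, n_bits // 8), bit-packed."""
    # XOR then popcount (via unpackbits) gives the Hamming distance.
    ham = np.array([
        [np.unpackbits(a ^ b).sum() for b in bits_b] for a in bits_a
    ])
    nn = ham.argmin(axis=1)
    keep = ham[np.arange(len(bits_a)), nn] < threshold
    return np.stack([np.nonzero(keep)[0], nn[keep]], axis=1)
```

For 6272-bit descriptors, each row would hold 784 bytes; the decision threshold of 4000 bits then acts as an absolute acceptance test rather than a ratio test.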
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
0.0638 0.0117 0.0056 0.0268 0.0222 0.0107 0.0134 0.0324 0.0129 0.0092 0.0313 0.0218 19-04-29 F/M Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
0.0824 0.0279 0.0503 0.0592 0.0577 0.0244 0.0266 0.0510 0.0463 0.0192 0.0383 0.0439 19-05-09 F Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
0.0830 0.0301 0.0537 0.0547 0.0492 0.0223 0.0288 0.0558 0.0427 0.0196 0.0358 0.0432 19-05-28 F Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are extracted densely from full images instead of image patches, with all other settings unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
0.0902 0.0569 0.1138 0.1452 0.0860 0.0941 0.0478 0.1140 0.0654 0.0549 0.0373 0.0823 19-05-28 F/M Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on Yi et al. CVPR2018. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
0.0683 0.0234 0.0418 0.0411 0.0437 0.0143 0.0256 0.0448 0.0346 0.0157 0.0512 0.0368 19-05-08 F Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
0.0751 0.0263 0.0391 0.0463 0.0392 0.0177 0.0239 0.0429 0.0317 0.0159 0.0459 0.0367 19-04-24 F Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
0.0810 0.0295 0.0615 0.0477 0.0545 0.0198 0.0241 0.0486 0.0402 0.0201 0.0406 0.0425 19-04-24 F Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
0.0835 0.0265 0.0438 0.0439 0.0457 0.0221 0.0244 0.0491 0.0386 0.0196 0.0429 0.0400 19-04-24 F Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
0.0998 0.0495 0.1035 0.1459 0.0795 0.0762 0.0404 0.1021 0.0577 0.0426 0.0295 0.0752 19-05-29 F/M Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of Yi et al. CVPR2018. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
0.0548 0.0155 0.0168 0.0414 0.0226 0.0158 0.0202 0.0401 0.0156 0.0113 0.0504 0.0277 19-04-24 F Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
0.0736 0.0250 0.0329 0.0494 0.0419 0.0194 0.0224 0.0467 0.0266 0.0163 0.0381 0.0357 19-04-24 F Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
0.0558 0.0112 0.0127 0.0327 0.0169 0.0128 0.0161 0.0334 0.0135 0.0092 0.0310 0.0223 19-05-17 F Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
Superpoint (nn matcher)
kp:2048, match:nn
0.0726 0.0283 0.0644 0.0672 0.0429 0.0126 0.0285 0.0439 0.0409 0.0194 0.0353 0.0415 19-06-07 F Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
Superpoint (1:1 matcher)
kp:2048, match:nn1to1
0.0865 0.0418 0.0825 0.0766 0.0551 0.0189 0.0339 0.0563 0.0506 0.0292 0.0398 0.0519 19-06-07 F Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
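The "1:1" ("match:nn1to1") protocol used by this entry keeps only mutually consistent matches. A minimal NumPy sketch under that assumption (function name and toy data are illustrative):

```python
import numpy as np

def match_nn_one_to_one(desc_a, desc_b):
    """Mutual nearest-neighbour ('1:1') matching: keep a pair only when
    each descriptor is the other's nearest neighbour, enforcing
    cross-match consistency and removing many-to-one collisions."""
    # Pairwise squared Euclidean distances, shape (N_a, N_b).
    d2 = (
        (desc_a ** 2).sum(1, keepdims=True)
        - 2.0 * desc_a @ desc_b.T
        + (desc_b ** 2).sum(1)
    )
    nn_ab = d2.argmin(axis=1)        # best B for each A
    nn_ba = d2.argmin(axis=0)        # best A for each B
    idx_a = np.arange(len(desc_a))
    mutual = nn_ba[nn_ab] == idx_a   # A -> B -> back to the same A
    return np.stack([idx_a[mutual], nn_ab[mutual]], axis=1)
```

Compared with plain nearest-neighbour search, this discards ambiguous correspondences where two keypoints in A compete for the same keypoint in B, which is why the 1:1 variants above often score higher than their "nn" counterparts.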
SuperPoint (default)
kp:2048, match:nn
0.0804 0.0297 0.0543 0.0776 0.0398 0.0135 0.0295 0.0477 0.0355 0.0171 0.0305 0.0414 19-04-24 F Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
0.0800 0.0218 0.0590 0.0804 0.0394 0.0124 0.0273 0.0424 0.0322 0.0173 0.0300 0.0402 19-04-26 F Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
0.0814 0.0288 0.0644 0.0696 0.0429 0.0139 0.0285 0.0420 0.0361 0.0182 0.0318 0.0416 19-04-26 F Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
0.0661 0.0133 0.0221 0.0710 0.0248 0.0105 0.0283 0.0391 0.0168 0.0104 0.0366 0.0308 19-04-26 F Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
0.0777 0.0178 0.0445 0.0829 0.0315 0.0143 0.0273 0.0415 0.0245 0.0171 0.0358 0.0377 19-04-26 F Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
0.0781 0.0322 0.0653 0.0484 0.0394 0.0103 0.0314 0.0410 0.0340 0.0180 0.0272 0.0387 19-04-26 F Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
0.0855 0.0236 0.0485 0.0463 0.0498 0.0192 0.0239 0.0491 0.0421 0.0192 0.0383 0.0405 19-07-29 F Patrick Ebel Baseline for our log-polar descriptors: here we use Cartesian patches instead of the log-polar transformation. Keypoints are DoG, with a scaling factor of lambda/12 over the chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
0.0771 0.0245 0.0519 0.0644 0.0439 0.0145 0.0270 0.0424 0.0309 0.0186 0.0330 0.0389 19-05-30 F Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
0.0925 0.0427 0.0499 0.1323 0.0571 0.0362 0.0363 0.0796 0.0442 0.0422 0.0361 0.0590 19-05-30 F/M Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
0.1082 0.0423 0.0474 0.1466 0.0596 0.0415 0.0412 0.0954 0.0488 0.0399 0.0333 0.0640 19-05-28 F/M Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
0.0438 0.0088 0.0136 0.0272 0.0213 0.0107 0.0197 0.0348 0.0085 0.0069 0.0361 0.0210 19-04-24 F Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
0.0794 0.0373 0.0736 0.0686 0.0636 0.0242 0.0312 0.0529 0.0554 0.0286 0.0492 0.0513 19-06-07 F Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA


Results for individual sequences:


Stereo — sequence 'british_museum'
Method Date Type #kp MS mAP5o mAP10o mAP15o mAP20o mAP25o By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 7963.6 0.2278 0.0016 0.0176 0.0558 0.1072 0.1622 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 4451.2 0.2861 0.0014 0.0278 0.0755 0.1462 0.2270 Anonymous We use OpenCV's implementation of AKAZE detector, and for each keypoint we extract a descriptor via CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 4427.3 0.2965 0.0014 0.0205 0.0681 0.1293 0.1839 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings, with at most 8K keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 4846.2 0.2745 0.0014 0.0309 0.0779 0.1452 0.2203 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 6052.9 0.2859 0.0022 0.0305 0.0798 0.1487 0.2227 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 4888.7 0.2739 0.0033 0.0329 0.0794 0.1440 0.2184 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 6008.1 0.2850 0.0010 0.0301 0.0798 0.1483 0.2201 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 1024.0 0.2996 0.0012 0.0288 0.0732 0.1258 0.1798 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 2046.9 0.2911 0.0014 0.0282 0.0765 0.1293 0.1814 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 256.0 0.3004 0.0010 0.0241 0.0626 0.1152 0.1714 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 512.0 0.3021 0.0008 0.0215 0.0685 0.1252 0.1741 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 421.4 0.2522 0.0018 0.0198 0.0554 0.0994 0.1458 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are obtained by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 421.4 0.2844 0.0020 0.0221 0.0632 0.1127 0.1663 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are obtained by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 490.2 0.2320 0.0018 0.0237 0.0569 0.0998 0.1507 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 2024.5 0.2448 0.0014 0.0288 0.0708 0.1266 0.1847 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 2024.5 0.2511 0.0025 0.0274 0.0753 0.1293 0.1886 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 1696.2 0.2468 0.0016 0.0241 0.0794 0.1403 0.2064 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (as in RootSIFT) sGLOH2 descriptors with the sGOr2f* matching strategy. Distance-table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed with the greedy nearest neighbour, as in the WISW@CAIP2019 contest. Keypoints and matches are ordered by rank (best first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 7670.5 0.2575 0.0022 0.0272 0.0767 0.1362 0.1925 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, without view synthesis or matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection performed on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 7767.3 0.2786 0.0033 0.0297 0.0759 0.1409 0.2144 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, without view synthesis or matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection performed on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 7494.1 0.2209 0.0012 0.0164 0.0501 0.0972 0.1557 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 7968.4 0.2670 0.0025 0.0292 0.0804 0.1426 0.2031 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 7968.4 0.2776 0.0027 0.0333 0.0812 0.1432 0.2049 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 7968.4 0.2820 0.0027 0.0325 0.0828 0.1467 0.2025 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 2910.1 0.2466 0.0012 0.0229 0.0640 0.1133 0.1597 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
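The thresholded Hamming-distance matching this entry describes can be sketched as follows; the packing of the 6272-bit descriptors into 784 uint8 bytes is an assumption, and `hamming_match` is an illustrative name, not the authors' API:

```python
import numpy as np

def hamming_match(desc1, desc2, threshold=4000):
    """Sketch of nearest-neighbour matching for binary descriptors:
    Hamming distance via XOR + bit counting, accepting a match only
    when the distance falls below the decision threshold."""
    # desc1, desc2: (n, 784) uint8 arrays (6272 bits = 784 bytes each)
    xor = desc1[:, None, :] ^ desc2[None, :, :]
    # Unpack bytes to bits and sum -> pairwise Hamming distance table
    dist = np.unpackbits(xor, axis=2).sum(axis=2)
    nn = dist.argmin(axis=1)                 # nearest neighbour in image 2
    best = dist[np.arange(len(desc1)), nn]
    return [(int(i), int(j)) for i, j in enumerate(nn) if best[i] < threshold]
```

Pairs whose best Hamming distance is 4000 bits or more are rejected, which is how the decision threshold in the entry acts as an absolute (rather than ratio-based) match criterion.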
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 2910.1 0.2453 0.0018 0.0211 0.0638 0.1105 0.1585 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 7421.8 0.3115 0.0025 0.0321 0.0824 0.1393 0.2101 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 7421.6 0.2869 0.0029 0.0288 0.0830 0.1458 0.2021 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc in which descriptors are densely extracted from full images instead of image patches, while all other settings remain unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 7421.8 0.4302 0.0022 0.0299 0.0902 0.2009 0.3291 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 7421.8 0.2174 0.0016 0.0233 0.0683 0.1197 0.1690 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 7968.5 0.2287 0.0022 0.0305 0.0751 0.1242 0.1804 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 7968.5 0.2588 0.0025 0.0290 0.0810 0.1368 0.1964 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 7968.5 0.2474 0.0029 0.0319 0.0835 0.1338 0.1912 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 7421.8 0.4256 0.0014 0.0325 0.0998 0.2107 0.3453 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 7968.3 0.2004 0.0016 0.0182 0.0548 0.0984 0.1426 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 7968.5 0.2323 0.0018 0.0307 0.0736 0.1317 0.1876 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 2048.0 0.2230 0.0025 0.0207 0.0558 0.0974 0.1456 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 1883.6 0.2463 0.0014 0.0266 0.0726 0.1368 0.2041 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 1883.6 0.2758 0.0018 0.0333 0.0865 0.1536 0.2387 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
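The difference between the plain 'nn' matcher and the 'nn1to1' matcher in the row above can be sketched as follows, under the assumption that 1:1 matching means mutual (cross-checked) nearest neighbours; `mutual_nn_matches` is an illustrative name, not challenge code:

```python
import numpy as np

def mutual_nn_matches(desc1, desc2):
    """Sketch of brute-force 1:1 matching: keep pair (i, j) only when
    j is i's nearest neighbour AND i is j's nearest neighbour, so each
    keypoint appears in at most one match (unlike plain 1:N matching)."""
    dist = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    nn12 = dist.argmin(axis=1)   # best match in image 2 for each i
    nn21 = dist.argmin(axis=0)   # best match in image 1 for each j
    return [(i, int(j)) for i, j in enumerate(nn12) if nn21[j] == i]
```

Dropping the cross-check (returning every `(i, nn12[i])` pair) gives the plain nearest-neighbour matcher, which is what makes the two SuperPoint rows above directly comparable.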
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 1156.4 0.2668 0.0012 0.0309 0.0804 0.1540 0.2215 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
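The preprocessing quoted in the SuperPoint entries (downsample so that the largest dimension is at most 1024 pixels) amounts to the following minimal sketch; `target_size` is a hypothetical helper, not part of the released code:

```python
def target_size(width, height, max_dim=1024):
    """If the larger image side exceeds max_dim, scale both sides down
    by the same factor (preserving aspect ratio, rounding to integers);
    otherwise leave the image size untouched."""
    scale = max_dim / max(width, height)
    if scale >= 1.0:
        return width, height
    return round(width * scale), round(height * scale)
```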
SuperPoint
kp:1024, match:nn
19-04-26 F 1024.0 0.2655 0.0018 0.0295 0.0800 0.1426 0.2154 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 2048.0 0.2610 0.0018 0.0327 0.0814 0.1393 0.2078 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 256.0 0.2517 0.0025 0.0276 0.0661 0.1209 0.1884 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 512.0 0.2605 0.0014 0.0284 0.0777 0.1420 0.2158 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 8000.0 0.2564 0.0018 0.0295 0.0781 0.1346 0.1933 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 7968.4 0.2508 0.0035 0.0346 0.0855 0.1428 0.2025 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
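The log-polar sampling that these entries contrast with Cartesian patches can be sketched as follows; the grid resolution and the helper name are assumptions for illustration, not the authors' parameters:

```python
import numpy as np

def logpolar_grid(cx, cy, r_max, n_r=32, n_theta=32):
    """Sketch of a log-polar sampling grid around a keypoint at (cx, cy):
    radii grow exponentially up to r_max, so sampling is denser near the
    keypoint and a scale change becomes a shift along the radial axis."""
    r = np.exp(np.linspace(0.0, np.log(r_max), n_r))       # log-spaced radii
    t = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(r, t, indexing="ij")
    return cx + rr * np.cos(tt), cy + rr * np.sin(tt)      # sample coords
```

Because scaling the patch multiplies every radius by a constant, it only translates the grid in log-radius, which is the scale-invariance property the descriptor exploits; a Cartesian baseline samples a regular x-y grid instead.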
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 1488.2 0.2483 0.0031 0.0297 0.0771 0.1415 0.2078 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 1417.5 0.4226 0.0020 0.0309 0.0925 0.1990 0.3160 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 1883.6 0.4184 0.0027 0.0327 0.1082 0.2182 0.3436 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 7830.3 0.2009 0.0006 0.0157 0.0438 0.0871 0.1344 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 7421.8 0.3273 0.0020 0.0295 0.0794 0.1548 0.2428 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA

Stereo — sequence 'florence_cathedral_side'
Method Date Type #kp MS mAP5o mAP10o mAP15o mAP20o mAP25o By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 7859.0 0.1603 0.0002 0.0038 0.0133 0.0290 0.0558 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 9429.2 0.2110 0.0004 0.0081 0.0333 0.0821 0.1478 Anonymous We use OpenCV's implementation of AKAZE detector, and for each keypoint we extract a descriptor via CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 7066.7 0.1963 0.0009 0.0070 0.0207 0.0488 0.0915 Anonymous We use OpenCV's implementation of brisk detector with the default settings, and for each image there are at most 8K keypoints. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 5416.8 0.2177 0.0002 0.0076 0.0335 0.0834 0.1601 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 7491.8 0.2297 0.0004 0.0115 0.0376 0.1003 0.1759 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 5474.9 0.2170 0.0000 0.0081 0.0331 0.0839 0.1608 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 7198.9 0.2282 0.0007 0.0079 0.0358 0.0913 0.1721 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 1024.0 0.2195 0.0000 0.0058 0.0184 0.0540 0.1039 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 2034.4 0.2158 0.0004 0.0079 0.0247 0.0625 0.1219 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 256.0 0.2233 0.0000 0.0036 0.0094 0.0189 0.0405 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 512.0 0.2214 0.0002 0.0052 0.0148 0.0373 0.0715 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 456.1 0.1907 0.0000 0.0022 0.0092 0.0254 0.0547 Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are computed with the interpolation of the VGG pool3 feature map on the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 455.8 0.2052 0.0002 0.0040 0.0171 0.0421 0.0774 Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are computed with the interpolation of the VGG pool3 feature map on the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 495.2 0.1824 0.0004 0.0063 0.0178 0.0385 0.0771 Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 2048.0 0.1821 0.0013 0.0052 0.0202 0.0452 0.0976 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 2048.0 0.1859 0.0004 0.0056 0.0245 0.0533 0.1091 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 2616.2 0.1802 0.0009 0.0079 0.0265 0.0598 0.1156 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and Root squared (like RootSIFT) sGLOH2 descriptors using the sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded and then matching pairs are computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 7678.8 0.1910 0.0007 0.0079 0.0281 0.0630 0.1230 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. Maximum number of keypoints: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 7754.4 0.2086 0.0007 0.0081 0.0279 0.0733 0.1410 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity vector is assumed for orientation. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. Maximum number of keypoints: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 7886.4 0.1637 0.0000 0.0031 0.0085 0.0198 0.0400 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 7873.7 0.2032 0.0009 0.0072 0.0252 0.0628 0.1347 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 7873.7 0.2053 0.0009 0.0081 0.0295 0.0762 0.1446 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 7873.7 0.2063 0.0013 0.0081 0.0288 0.0738 0.1433 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 6225.9 0.1775 0.0007 0.0045 0.0126 0.0283 0.0576 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 6225.9 0.1766 0.0002 0.0029 0.0117 0.0232 0.0488 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 7783.0 0.2254 0.0009 0.0085 0.0279 0.0711 0.1444 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 7782.9 0.2182 0.0002 0.0074 0.0301 0.0742 0.1478 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc in which descriptors are densely extracted from full images instead of image patches, while all other settings remain unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 7783.0 0.2894 0.0004 0.0139 0.0569 0.1278 0.2254 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 7783.0 0.1843 0.0004 0.0049 0.0234 0.0587 0.1197 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 7874.2 0.1878 0.0004 0.0074 0.0263 0.0652 0.1271 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 7874.2 0.2022 0.0007 0.0074 0.0295 0.0731 0.1422 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 7874.2 0.1987 0.0013 0.0076 0.0265 0.0646 0.1323 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 7783.0 0.2891 0.0009 0.0137 0.0495 0.1134 0.2015 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 7873.7 0.1747 0.0002 0.0045 0.0155 0.0371 0.0839 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 7874.2 0.1880 0.0004 0.0085 0.0250 0.0580 0.1161 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 2048.0 0.1678 0.0004 0.0036 0.0112 0.0263 0.0589 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 2010.2 0.1917 0.0004 0.0090 0.0283 0.0731 0.1352 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 2010.2 0.2063 0.0004 0.0128 0.0418 0.0902 0.1595 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 1663.1 0.2016 0.0007 0.0076 0.0297 0.0650 0.1350 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 1024.0 0.1979 0.0004 0.0054 0.0218 0.0632 0.1221 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 2048.0 0.2006 0.0002 0.0072 0.0288 0.0717 0.1399 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 256.0 0.1768 0.0000 0.0025 0.0133 0.0387 0.0796 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 512.0 0.1865 0.0004 0.0047 0.0178 0.0544 0.1075 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 8000.0 0.1981 0.0007 0.0074 0.0322 0.0751 0.1473 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 7873.7 0.1997 0.0009 0.0065 0.0236 0.0704 0.1435 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use Cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 1803.0 0.1889 0.0002 0.0049 0.0245 0.0628 0.1246 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 1851.4 0.2766 0.0011 0.0119 0.0427 0.0963 0.1649 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: - retrained on the phototourism training set; - keypoints refinement; - better descriptor sampling; - adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 2010.2 0.2769 0.0011 0.0119 0.0423 0.0972 0.1748 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: - retrained on the phototourism training set; - keypoints refinement; - better descriptor sampling; - adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 7830.2 0.1628 0.0004 0.0031 0.0088 0.0250 0.0542 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 7783.0 0.2360 0.0002 0.0101 0.0373 0.0920 0.1750 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA

Stereo — sequence 'lincoln_memorial_statue'
Method Date Type #kp MS mAP5o mAP10o mAP15o mAP20o mAP25o By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 7799.0 0.2834 0.0004 0.0054 0.0177 0.0510 0.1100 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 1292.5 0.3649 0.0007 0.0132 0.0449 0.1019 0.1688 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 1292.5 0.3617 0.0000 0.0038 0.0123 0.0398 0.0733 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings, and for each image there are at most 8K keypoints. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 4352.2 0.3358 0.0007 0.0206 0.0867 0.1967 0.3125 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 5662.7 0.3539 0.0004 0.0203 0.0921 0.2110 0.3300 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 4536.2 0.3348 0.0004 0.0190 0.0832 0.1932 0.3023 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 5764.5 0.3546 0.0009 0.0177 0.0892 0.1981 0.3184 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 1024.0 0.3337 0.0004 0.0087 0.0411 0.1167 0.2124 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 2037.7 0.3195 0.0000 0.0069 0.0367 0.1102 0.2019 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 256.0 0.3501 0.0007 0.0040 0.0215 0.0646 0.1348 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 512.0 0.3436 0.0000 0.0060 0.0300 0.0928 0.1871 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 281.1 0.3411 0.0002 0.0036 0.0156 0.0358 0.0702 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are obtained by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 281.1 0.3563 0.0002 0.0069 0.0257 0.0590 0.1053 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are obtained by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 466.8 0.3046 0.0002 0.0074 0.0297 0.0675 0.1158 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 1255.5 0.3341 0.0004 0.0054 0.0264 0.0700 0.1297 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 1255.5 0.3334 0.0007 0.0067 0.0255 0.0680 0.1259 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 1363.3 0.3367 0.0013 0.0110 0.0537 0.1384 0.2444 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and Root squared (like RootSIFT) sGLOH2 descriptors using sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded and then matching pairs are computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
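The 'root squared' trick mentioned here (as in RootSIFT) amounts to L1-normalising a non-negative descriptor and taking element-wise square roots, so that Euclidean distance between the transformed descriptors approximates the Hellinger kernel. A sketch under that reading — the exact RsGLOH2 recipe may differ:

```python
import numpy as np

def root_transform(desc, eps=1e-8):
    """RootSIFT-style transform: L1-normalise a non-negative descriptor,
    then take the element-wise square root."""
    desc = desc / (desc.sum() + eps)
    return np.sqrt(desc)

hist = np.array([4.0, 1.0, 0.0, 3.0])  # toy histogram-type descriptor
r = root_transform(hist)
print(np.linalg.norm(r))  # ~1.0: the output is automatically L2-normalised
```

A convenient side effect, visible above: because the L1 norm of the input becomes the squared L2 norm of the output, the transformed descriptor is unit-length without a separate normalisation step.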
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 7751.6 0.3307 0.0007 0.0130 0.0608 0.1449 0.2408 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS matching framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. Maximum number of keypoints: 8000; detection done on 2x-upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 7776.5 0.3451 0.0002 0.0174 0.0798 0.1724 0.2853 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian, assuming the gravity-vector orientation. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS matching framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. Maximum number of keypoints: 8000; detection done on 2x-upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 3920.7 0.3164 0.0002 0.0025 0.0072 0.0201 0.0382 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 7770.9 0.3104 0.0009 0.0139 0.0577 0.1397 0.2383 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
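The log-polar entries sample the patch on rings whose radii grow geometrically, rather than on a regular Cartesian grid, which is what makes the descriptor robust to scale changes. An illustrative grid construction — the parameters and layout are ours, not the paper's exact sampling:

```python
import numpy as np

def log_polar_grid(n_rho=5, n_theta=8, r_max=32.0):
    """Sample locations: n_rho rings with geometrically increasing radii
    (i.e. uniform in log rho), n_theta uniformly spaced angles per ring."""
    rho = r_max ** (np.arange(1, n_rho + 1) / n_rho)        # geometric radii
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    xs = rho[:, None] * np.cos(theta[None, :])
    ys = rho[:, None] * np.sin(theta[None, :])
    return xs, ys

xs, ys = log_polar_grid()
print(xs.shape)                         # (5, 8): 40 sample locations
print(np.hypot(xs[-1, 0], ys[-1, 0]))   # 32.0: radius of the outermost ring
```

Under this layout, scaling the patch shifts the sampling pattern along the rho axis instead of stretching it, which is why a log-polar descriptor can tolerate keypoint-scale errors better than a Cartesian one.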
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 7770.9 0.3154 0.0004 0.0130 0.0577 0.1388 0.2350 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 7770.9 0.3175 0.0004 0.0163 0.0586 0.1426 0.2484 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 1108.1 0.3384 0.0002 0.0025 0.0098 0.0217 0.0400 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. The matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
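SIFT-AID compares its 6272-bit binary descriptors by Hamming distance against a fixed decision threshold of 4000. The distance test itself is just a bit count; the toy descriptors below are ours:

```python
import numpy as np

AID_BITS, AID_THRESHOLD = 6272, 4000  # values from the SIFT-AID entry

def hamming(d1, d2):
    """Hamming distance between two 0/1 bit vectors."""
    return int(np.count_nonzero(d1 != d2))

rng = np.random.default_rng(0)
a = rng.integers(0, 2, AID_BITS, dtype=np.uint8)
b = a.copy()
b[:100] ^= 1  # flip the first 100 bits

print(hamming(a, b))                  # -> 100
print(hamming(a, b) < AID_THRESHOLD)  # -> True: accepted as a match
```

In practice binary descriptors are packed 8 bits per byte and compared with XOR + popcount, but the accept/reject rule is the same thresholded distance.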
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 1108.1 0.3413 0.0000 0.0013 0.0056 0.0161 0.0358 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. The matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 5820.3 0.3555 0.0002 0.0118 0.0503 0.1326 0.2325 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 5820.2 0.3529 0.0007 0.0130 0.0537 0.1446 0.2466 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are densely extracted from full images instead of image patches, while other settings remain the same as in the original ContextDesc. We find that Dense-ContextDesc performs better in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 5820.3 0.4037 0.0013 0.0268 0.1138 0.2535 0.3948 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 5820.3 0.2939 0.0004 0.0112 0.0418 0.1028 0.1961 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 7771.0 0.2881 0.0009 0.0089 0.0391 0.0973 0.1847 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 7771.0 0.3033 0.0011 0.0154 0.0615 0.1429 0.2482 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 7771.0 0.3002 0.0007 0.0116 0.0438 0.1259 0.2240 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 5820.3 0.4036 0.0027 0.0221 0.1035 0.2397 0.3783 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 7770.9 0.2680 0.0002 0.0036 0.0168 0.0463 0.1033 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 7771.0 0.2908 0.0002 0.0078 0.0329 0.0883 0.1737 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 2048.0 0.3013 0.0000 0.0027 0.0127 0.0317 0.0722 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
Superpoint (nn matcher)
kp:2048, match:nn
19-06-07 F 1701.4 0.3054 0.0011 0.0179 0.0644 0.1373 0.2305 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
Superpoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 1701.4 0.3225 0.0013 0.0215 0.0825 0.1791 0.2906 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 808.2 0.3362 0.0020 0.0179 0.0543 0.1290 0.2084 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 1024.0 0.3310 0.0009 0.0148 0.0590 0.1375 0.2267 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 2048.0 0.3164 0.0007 0.0154 0.0644 0.1426 0.2300 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 256.0 0.3392 0.0007 0.0076 0.0221 0.0503 0.0858 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 512.0 0.3414 0.0009 0.0134 0.0445 0.0993 0.1699 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 8000.0 0.3028 0.0018 0.0143 0.0653 0.1417 0.2361 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 7770.9 0.3016 0.0011 0.0121 0.0485 0.1288 0.2426 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use Cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 1165.2 0.3127 0.0007 0.0116 0.0519 0.1209 0.2039 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 975.0 0.3958 0.0004 0.0134 0.0499 0.1107 0.1791 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: - retrained on the phototourism training set; - keypoints refinement; - better descriptor sampling; - adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 1701.4 0.3948 0.0009 0.0145 0.0474 0.1100 0.1961 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: - retrained on the phototourism training set; - keypoints refinement; - better descriptor sampling; - adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 7723.4 0.2617 0.0002 0.0029 0.0136 0.0405 0.0814 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 5820.3 0.3634 0.0009 0.0163 0.0736 0.1838 0.3023 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA

Stereo — sequence 'london_bridge'
Method Date Type #kp MS mAP5o mAP10o mAP15o mAP20o mAP25o By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 7962.8 0.2545 0.0003 0.0066 0.0575 0.1473 0.2469 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 3773.8 0.2875 0.0003 0.0063 0.0508 0.1233 0.2044 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 3702.9 0.2989 0.0000 0.0077 0.0602 0.1515 0.2479 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings, and for each image there are at most 8K keypoints. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 5004.7 0.2913 0.0003 0.0101 0.0616 0.1469 0.2378 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 6698.9 0.2998 0.0003 0.0136 0.0665 0.1539 0.2472 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 5133.2 0.2910 0.0003 0.0094 0.0547 0.1375 0.2295 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 6647.1 0.2996 0.0010 0.0146 0.0735 0.1633 0.2531 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 1024.0 0.3143 0.0000 0.0028 0.0366 0.1010 0.1619 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 2046.5 0.3102 0.0000 0.0063 0.0508 0.1118 0.1845 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 256.0 0.3120 0.0003 0.0024 0.0178 0.0529 0.0996 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 512.0 0.3150 0.0003 0.0035 0.0282 0.0769 0.1309 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 297.1 0.2823 0.0003 0.0035 0.0432 0.1208 0.1957 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are obtained by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 297.4 0.2929 0.0003 0.0059 0.0540 0.1327 0.2110 Anonymous ELF detector: Keypoints are local maxima of a saliency map, generated as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 488.8 0.2749 0.0000 0.0108 0.0571 0.1365 0.2162 Anonymous ELF detector: Keypoints are local maxima of a saliency map, generated as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
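The ELF entries above take keypoints as the local maxima of a saliency map. As a rough illustration (not the authors' code — the saliency map here is a placeholder array, not the actual CNN-gradient map), the local-maxima selection step can be sketched in NumPy:

```python
import numpy as np

def local_maxima(saliency, k=512):
    """Return (row, col) of up to k strict 3x3 local maxima, strongest first."""
    s = np.pad(saliency, 1, mode="constant", constant_values=-np.inf)
    centre = s[1:-1, 1:-1]
    is_max = np.ones_like(centre, dtype=bool)
    # a pixel is kept only if it beats all 8 of its neighbours
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            is_max &= centre > s[1 + dr:s.shape[0] - 1 + dr,
                                 1 + dc:s.shape[1] - 1 + dc]
    rows, cols = np.nonzero(is_max)
    order = np.argsort(centre[rows, cols])[::-1][:k]  # strongest responses first
    return rows[order], cols[order]
```

In the actual method the saliency map comes from backpropagating a feature-map norm through a pre-trained CNN; only the maxima-picking step is shown here.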
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 1983.5 0.2510 0.0000 0.0021 0.0310 0.0884 0.1765 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 1983.5 0.2553 0.0000 0.0017 0.0303 0.0815 0.1598 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 1917.2 0.2811 0.0000 0.0104 0.0669 0.1664 0.2618 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and Root squared (like RootSIFT) sGLOH2 descriptors using sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded and then matching pairs are computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
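The greedy nearest-neighbour step this entry describes — repeatedly taking the best remaining pair from the distance table and removing its row and column so each keypoint is used once — can be sketched as follows (an illustrative reimplementation, not the authors' code):

```python
import numpy as np

def greedy_nn(dist):
    """Greedily pair rows and columns of a distance table, best matches first."""
    d = dist.astype(float).copy()
    matches = []
    while np.isfinite(d).any():
        i, j = np.unravel_index(np.argmin(d), d.shape)
        matches.append((i, j, dist[i, j]))
        d[i, :] = np.inf  # each keypoint is matched at most once
        d[:, j] = np.inf
    return matches
```

The full sGOr2f* strategy additionally filters the distance table before this step; only the greedy pairing is shown.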
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 7668.0 0.2665 0.0000 0.0045 0.0286 0.0870 0.1640 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, Paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, Paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, without view synthesis or matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 7748.9 0.2821 0.0000 0.0070 0.0404 0.1107 0.1960 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian, assuming a gravity-aligned orientation. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, Paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, without view synthesis or matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 6982.2 0.2415 0.0000 0.0021 0.0233 0.0769 0.1462 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 7954.8 0.2972 0.0003 0.0122 0.0637 0.1435 0.2329 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 7954.8 0.3025 0.0003 0.0164 0.0707 0.1602 0.2531 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 7954.8 0.3049 0.0010 0.0160 0.0790 0.1744 0.2625 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
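The log-polar entries above sample each patch on a (rho, theta) grid with logarithmically spaced radii, over a support region enlarged by lambda/12 relative to the DoG scale. A minimal sketch of such a sampling grid, with illustrative grid sizes and inner-radius choice rather than the paper's exact parameters:

```python
import numpy as np

def log_polar_grid(cx, cy, sigma, lam=32.0, n_rho=8, n_theta=16):
    """Cartesian sample locations for a log-polar grid centred at (cx, cy).

    The support radius is the keypoint scale sigma enlarged by lambda/12,
    and radii are spaced logarithmically so inner rings sample densely.
    """
    r_max = sigma * lam / 12.0
    # log-spaced radii up to r_max; angles uniform on the circle
    rho = r_max * np.exp(np.linspace(np.log(0.1), 0.0, n_rho))
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    tt, rr = np.meshgrid(theta, rho)
    xs = cx + rr * np.cos(tt)
    ys = cy + rr * np.sin(tt)
    return xs, ys
```

Larger lambda widens the support region, which matches the trend in the three entries above: mAP improves as lambda grows from 32 to 96.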
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 2895.2 0.2848 0.0000 0.0045 0.0286 0.0783 0.1452 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. The matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
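SIFT-AID's matcher reduces to a Hamming-distance test between 6272-bit binary descriptors against a fixed threshold of 4000. A NumPy sketch of that test (the packing of bits into uint8 bytes and the function name are implementation assumptions, not the authors' code):

```python
import numpy as np

def hamming_match(desc_a, desc_b, threshold=4000):
    """Match packed binary descriptors (uint8 rows) by Hamming distance.

    Returns (i, j, dist) for each row of desc_a whose nearest row in
    desc_b lies within the decision threshold.
    """
    # XOR exposes differing bits; popcount via a 256-entry lookup table
    popcount = np.unpackbits(np.arange(256, dtype=np.uint8)[:, None], axis=1).sum(1)
    matches = []
    for i, d in enumerate(desc_a):
        dists = popcount[np.bitwise_xor(desc_b, d)].sum(axis=1)
        j = int(np.argmin(dists))
        if dists[j] <= threshold:
            matches.append((i, j, int(dists[j])))
    return matches
```

For 6272-bit descriptors each row would be 784 uint8 bytes; a threshold of 4000 accepts pairs that agree on roughly 36% or more of their bits.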
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 2895.2 0.2843 0.0000 0.0035 0.0268 0.0790 0.1504 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. The matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 7678.9 0.3487 0.0003 0.0118 0.0592 0.1306 0.2190 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 7678.9 0.3322 0.0003 0.0104 0.0547 0.1327 0.2124 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are densely extracted from full images instead of image patches, with all other settings unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, particularly under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 7678.9 0.3781 0.0007 0.0327 0.1452 0.2914 0.4185 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 7678.9 0.2757 0.0000 0.0077 0.0411 0.1166 0.2037 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 7955.7 0.2688 0.0000 0.0059 0.0463 0.1229 0.2159 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 7955.7 0.2906 0.0003 0.0094 0.0477 0.1271 0.2030 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 7955.7 0.2906 0.0000 0.0073 0.0439 0.1079 0.1915 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 7678.9 0.3766 0.0017 0.0338 0.1459 0.2765 0.3830 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 7954.4 0.2514 0.0000 0.0073 0.0414 0.1083 0.1981 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 7955.7 0.2771 0.0000 0.0080 0.0494 0.1139 0.1905 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 2048.0 0.2345 0.0000 0.0028 0.0327 0.1003 0.1744 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (NN matcher)
kp:2048, match:nn
19-06-07 F 1799.3 0.2803 0.0000 0.0108 0.0672 0.1692 0.2552 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 1799.3 0.3046 0.0000 0.0129 0.0766 0.1748 0.2775 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
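The 1:1 matcher keeps a pair only when each descriptor is the other's nearest neighbour, unlike the plain one-way search in the 'nn' entries. A minimal NumPy sketch of that cross-check (illustrative, not the evaluation code):

```python
import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    """Brute-force L2 matching with a cross-consistency (1:1) check."""
    # full pairwise squared-distance table between the two descriptor sets
    d2 = ((desc_a[:, None, :] - desc_b[None, :, :]) ** 2).sum(-1)
    ab = d2.argmin(axis=1)  # nearest b for each a
    ba = d2.argmin(axis=0)  # nearest a for each b
    # keep (i, j) only if i and j choose each other
    return [(i, int(j)) for i, j in enumerate(ab) if ba[j] == i]
```

The cross-check discards asymmetric matches, which is consistent with the 1:1 rows above scoring higher than their plain-NN counterparts.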
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 1094.0 0.2886 0.0000 0.0139 0.0776 0.1842 0.2761 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 1024.0 0.2873 0.0000 0.0139 0.0804 0.1995 0.2900 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 2048.0 0.2741 0.0000 0.0125 0.0696 0.1570 0.2451 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 256.0 0.2832 0.0000 0.0070 0.0710 0.1661 0.2584 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 512.0 0.2901 0.0003 0.0097 0.0829 0.1964 0.2942 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 8000.0 0.2590 0.0003 0.0101 0.0484 0.1194 0.1978 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
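Several SuperPoint entries note that images are downsampled so the largest dimension is at most 1024 pixels. A minimal sketch of that size computation (the function name is illustrative):

```python
def resize_to_max_dim(width, height, max_dim=1024):
    """New (width, height) with the largest side capped at max_dim.

    Preserves aspect ratio; images already small enough are left unchanged.
    """
    scale = min(1.0, max_dim / float(max(width, height)))
    return int(round(width * scale)), int(round(height * scale))
```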
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 7954.8 0.2883 0.0003 0.0073 0.0463 0.1142 0.2079 Patrick Ebel A baseline for our log-polar submissions: descriptors are computed from Cartesian patches instead of the log-polar transformation. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 1418.4 0.2823 0.0007 0.0097 0.0644 0.1591 0.2604 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 1364.5 0.3725 0.0014 0.0320 0.1323 0.2625 0.3795 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 1799.3 0.3718 0.0021 0.0320 0.1466 0.2834 0.3959 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 7915.0 0.2465 0.0003 0.0045 0.0272 0.0766 0.1501 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 7678.9 0.3590 0.0007 0.0125 0.0686 0.1647 0.2692 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
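The per-sequence tables report mAP at angular thresholds from 5 to 25 degrees, i.e. the fraction of image pairs whose estimated pose error falls within each threshold. A simplified sketch of the per-threshold computation (the official evaluation combines rotation and translation errors; only the thresholding is illustrated here):

```python
def map_at_thresholds(pose_errors_deg, thresholds=(5, 10, 15, 20, 25)):
    """Fraction of image pairs whose angular pose error is within each threshold."""
    n = len(pose_errors_deg)
    return {t: sum(e <= t for e in pose_errors_deg) / n for t in thresholds}
```

This is why the columns grow monotonically from mAP5 to mAP25 in every row: a pose within 5 degrees of ground truth is also within every larger threshold.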

Stereo — sequence 'milan_cathedral'
Method Date Type #kp MS mAP5° mAP10° mAP15° mAP20° mAP25° By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 7818.1 0.1838 0.0000 0.0045 0.0207 0.0573 0.1258 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 5628.3 0.2179 0.0002 0.0112 0.0472 0.1130 0.2004 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 5111.7 0.2172 0.0000 0.0039 0.0289 0.0864 0.1583 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings; each image has at most 8K keypoints. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 5932.2 0.2226 0.0004 0.0069 0.0472 0.1211 0.2173 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 7782.8 0.2308 0.0006 0.0098 0.0482 0.1195 0.2232 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 5965.0 0.2213 0.0006 0.0085 0.0455 0.1213 0.2205 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 7241.2 0.2293 0.0008 0.0087 0.0486 0.1246 0.2270 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 1024.0 0.2291 0.0006 0.0065 0.0295 0.0815 0.1618 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 2036.2 0.2261 0.0014 0.0081 0.0364 0.0923 0.1770 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 256.0 0.2290 0.0004 0.0047 0.0238 0.0589 0.1189 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 512.0 0.2290 0.0000 0.0049 0.0236 0.0652 0.1358 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 357.9 0.1940 0.0004 0.0045 0.0185 0.0514 0.0900 Anonymous ELF detector: Keypoints are local maxima of a saliency map, generated as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 358.0 0.2008 0.0002 0.0026 0.0213 0.0567 0.1010 Anonymous ELF detector: Keypoints are local maxima of a saliency map, generated as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 495.5 0.1892 0.0002 0.0049 0.0236 0.0665 0.1242 Anonymous ELF detector: Keypoints are local maxima of a saliency map, generated as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 2039.6 0.1845 0.0002 0.0047 0.0293 0.0760 0.1455 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 2039.6 0.1879 0.0000 0.0055 0.0323 0.0833 0.1591 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 2681.2 0.2070 0.0008 0.0071 0.0429 0.1106 0.2063 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and Root squared (like RootSIFT) sGLOH2 descriptors using sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded and then matching pairs are computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 7698.8 0.1956 0.0004 0.0077 0.0366 0.1069 0.1998 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, Paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, Paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, without view synthesis or matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 7761.0 0.2077 0.0006 0.0077 0.0415 0.1142 0.2093 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian, assuming a gravity-aligned orientation. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, Paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, without view synthesis or matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 7653.5 0.1726 0.0000 0.0010 0.0110 0.0343 0.0703 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 7823.4 0.2100 0.0006 0.0112 0.0500 0.1209 0.2217 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 7823.4 0.2152 0.0008 0.0098 0.0502 0.1268 0.2232 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 7823.4 0.2183 0.0004 0.0091 0.0463 0.1203 0.2252 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 4481.9 0.1919 0.0006 0.0039 0.0201 0.0537 0.1026 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. The matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 4481.9 0.1919 0.0004 0.0033 0.0222 0.0522 0.1012 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. The matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 7725.5 0.2565 0.0002 0.0112 0.0577 0.1413 0.2433 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 7725.4 0.2453 0.0004 0.0085 0.0492 0.1248 0.2297 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are densely extracted from full images instead of image patches, with all other settings unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, particularly under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 7725.5 0.2981 0.0022 0.0213 0.0860 0.1917 0.3067 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 7725.5 0.2007 0.0004 0.0085 0.0437 0.1100 0.2033 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 7823.6 0.1951 0.0002 0.0085 0.0392 0.1081 0.2012 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 7823.6 0.2093 0.0008 0.0104 0.0545 0.1272 0.2248 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 7823.6 0.2055 0.0016 0.0100 0.0457 0.1146 0.2110 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 7725.5 0.2978 0.0016 0.0199 0.0795 0.1754 0.2967 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 7823.8 0.1808 0.0004 0.0061 0.0226 0.0736 0.1465 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 7823.6 0.1965 0.0008 0.0106 0.0419 0.1000 0.1876 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 2048.0 0.1683 0.0004 0.0030 0.0169 0.0500 0.0957 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
Superpoint (nn matcher)
kp:2048, match:nn
19-06-07 F 1997.1 0.2065 0.0006 0.0087 0.0429 0.1146 0.2006 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
Superpoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 1997.1 0.2205 0.0004 0.0106 0.0551 0.1301 0.2307 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 1483.6 0.2081 0.0010 0.0089 0.0398 0.1059 0.1939 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 1024.0 0.2100 0.0006 0.0102 0.0394 0.1020 0.1882 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 2048.0 0.2044 0.0012 0.0075 0.0429 0.1037 0.1935 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 256.0 0.2040 0.0002 0.0053 0.0248 0.0654 0.1228 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 512.0 0.2094 0.0002 0.0049 0.0315 0.0909 0.1600 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 8000.0 0.1956 0.0002 0.0096 0.0394 0.0996 0.1866 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 7823.4 0.2071 0.0008 0.0118 0.0498 0.1215 0.2209 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use Cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 1726.8 0.2066 0.0008 0.0073 0.0439 0.1091 0.2014 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 1691.7 0.2989 0.0006 0.0126 0.0571 0.1335 0.2299 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 1997.1 0.2981 0.0008 0.0154 0.0596 0.1388 0.2419 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 7630.6 0.1704 0.0006 0.0047 0.0213 0.0526 0.1065 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 7725.5 0.2642 0.0008 0.0132 0.0636 0.1585 0.2744 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
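Most entries in this table are tagged 'match:nn', i.e. plain brute-force nearest-neighbour matching: every descriptor in one image is paired with its closest descriptor (by L2 distance) in the other. As a minimal illustrative sketch with toy 2-D descriptors (pure Python, not the code of any submission, which typically uses an optimized matcher such as OpenCV's BFMatcher):

```python
def l2_sq(a, b):
    # squared Euclidean distance between two descriptor vectors
    return sum((x - y) ** 2 for x, y in zip(a, b))

def match_nn(desc_a, desc_b):
    """Brute-force nearest-neighbour matching: for each descriptor in
    image A, return the index of its closest descriptor in image B."""
    matches = []
    for i, da in enumerate(desc_a):
        j = min(range(len(desc_b)), key=lambda k: l2_sq(da, desc_b[k]))
        matches.append((i, j))
    return matches

# Toy 2-D "descriptors" standing in for 128-D SIFT-style vectors
desc_a = [[0.0, 0.0], [1.0, 1.0]]
desc_b = [[1.1, 0.9], [0.1, -0.1], [5.0, 5.0]]
print(match_nn(desc_a, desc_b))  # [(0, 1), (1, 0)]
```

Note that this direction-A-to-B search allows many-to-one matches; the 'nn1to1' entries below additionally enforce cross-match consistency.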

Stereo — sequence 'mount_rushmore'
Method Date Type #kp MS mAP5o mAP10o mAP15o mAP20o mAP25o By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 7809.6 0.3603 0.0000 0.0011 0.0112 0.0549 0.1480 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 7344.7 0.4323 0.0002 0.0046 0.0189 0.0798 0.1760 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 5588.5 0.4334 0.0000 0.0017 0.0156 0.0701 0.1541 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings; each image has at most 8000 keypoints. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 6006.5 0.4210 0.0004 0.0063 0.0223 0.0733 0.1676 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 8732.2 0.4307 0.0002 0.0067 0.0257 0.0842 0.1825 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 6145.1 0.4196 0.0011 0.0051 0.0204 0.0714 0.1615 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 7542.0 0.4288 0.0002 0.0051 0.0234 0.0735 0.1672 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 1024.0 0.4118 0.0002 0.0019 0.0124 0.0451 0.1202 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 2032.3 0.4054 0.0000 0.0029 0.0183 0.0535 0.1318 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 256.0 0.4175 0.0000 0.0002 0.0040 0.0288 0.0787 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 512.0 0.4160 0.0002 0.0036 0.0099 0.0354 0.1023 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 354.8 0.3949 0.0004 0.0015 0.0116 0.0413 0.0861 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 354.8 0.4041 0.0000 0.0006 0.0128 0.0474 0.1011 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 490.9 0.3744 0.0006 0.0019 0.0107 0.0505 0.1166 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 1988.9 0.4000 0.0006 0.0029 0.0185 0.0783 0.1573 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 1988.9 0.4039 0.0000 0.0017 0.0126 0.0758 0.1514 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 2900.8 0.3947 0.0000 0.0029 0.0213 0.0817 0.1739 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (as in RootSIFT) sGLOH2 descriptors using the sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed using the greedy nearest neighbour, as in the WISW@CAIP2019 contest. Keypoints and matches are ordered by rank (best first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 7598.1 0.4131 0.0002 0.0025 0.0164 0.0844 0.1893 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation). Code and weights: https://github.com/ducha-aiki/affnet; paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets. Training code: https://github.com/pultarmi/HardNet_MultiDataset; paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS matching framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. Maximum number of keypoints: 8000; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 7668.5 0.4263 0.0000 0.0025 0.0198 0.0966 0.2061 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets. Training code: https://github.com/pultarmi/HardNet_MultiDataset; paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS matching framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. Maximum number of keypoints: 8000; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 7279.8 0.3824 0.0000 0.0002 0.0074 0.0453 0.1069 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 7791.1 0.3969 0.0002 0.0029 0.0192 0.0842 0.1867 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 7791.1 0.4029 0.0004 0.0038 0.0211 0.0859 0.1821 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 7791.1 0.4058 0.0002 0.0040 0.0202 0.0811 0.1638 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 4379.1 0.4001 0.0000 0.0017 0.0103 0.0381 0.0985 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints with OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 4379.1 0.4021 0.0000 0.0013 0.0107 0.0381 0.0909 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints with OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 7719.0 0.4673 0.0011 0.0061 0.0244 0.0627 0.1309 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 7719.0 0.4417 0.0000 0.0042 0.0223 0.0661 0.1495 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc in which descriptors are densely extracted from full images instead of image patches, with all other settings unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better under illumination changes in particular. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted with the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 7719.0 0.4917 0.0034 0.0328 0.0941 0.2051 0.3091 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 7719.0 0.3774 0.0002 0.0023 0.0143 0.0777 0.1865 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 7791.0 0.3707 0.0004 0.0038 0.0177 0.0855 0.1863 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 7791.0 0.3932 0.0000 0.0036 0.0198 0.0825 0.1888 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 7791.0 0.3893 0.0002 0.0051 0.0221 0.0781 0.1785 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 7719.0 0.4938 0.0029 0.0234 0.0762 0.1878 0.2958 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 7788.9 0.3556 0.0002 0.0017 0.0158 0.0699 0.1575 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 7791.0 0.3788 0.0002 0.0042 0.0194 0.0783 0.1699 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 2048.0 0.3787 0.0002 0.0013 0.0128 0.0585 0.1175 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
Superpoint (nn matcher)
kp:2048, match:nn
19-06-07 F 1710.5 0.3756 0.0000 0.0027 0.0126 0.0451 0.1181 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
Superpoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 1710.5 0.3999 0.0002 0.0040 0.0189 0.0667 0.1324 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 936.8 0.3766 0.0004 0.0032 0.0135 0.0455 0.1036 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 1024.0 0.3750 0.0000 0.0017 0.0124 0.0436 0.1091 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 2048.0 0.3702 0.0006 0.0044 0.0139 0.0421 0.0941 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 256.0 0.3734 0.0004 0.0017 0.0105 0.0440 0.1011 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 512.0 0.3770 0.0004 0.0029 0.0143 0.0495 0.1215 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 8000.0 0.3662 0.0002 0.0032 0.0103 0.0387 0.0832 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 7791.1 0.3879 0.0002 0.0038 0.0192 0.0859 0.1909 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use Cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 1342.3 0.3785 0.0008 0.0032 0.0145 0.0484 0.1232 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 1173.5 0.4783 0.0015 0.0112 0.0362 0.0926 0.1773 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 1710.5 0.4875 0.0013 0.0114 0.0415 0.1078 0.1977 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 7514.8 0.3499 0.0002 0.0011 0.0107 0.0512 0.1192 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 7719.0 0.4747 0.0004 0.0046 0.0242 0.0678 0.1408 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
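Entries tagged 'match:nn1to1' enforce cross-match consistency (mutual nearest neighbours): a pair is kept only if each descriptor is the other's nearest neighbour, which discards many one-sided matches. A minimal sketch with toy 2-D descriptors (pure Python, illustrative only; the benchmark's actual matcher is not published here):

```python
def nearest(query, pool):
    # index of the pool descriptor closest to the query (squared L2)
    return min(range(len(pool)),
               key=lambda k: sum((q - p) ** 2 for q, p in zip(query, pool[k])))

def match_mutual_nn(desc_a, desc_b):
    """Keep (i, j) only when j is i's nearest neighbour in B *and*
    i is j's nearest neighbour in A (cross-match consistency)."""
    fwd = [nearest(da, desc_b) for da in desc_a]   # A -> B
    bwd = [nearest(db, desc_a) for db in desc_b]   # B -> A
    return [(i, j) for i, j in enumerate(fwd) if bwd[j] == i]

# desc_a[2] also maps to desc_b[1], but loses the mutual check to desc_a[0]
desc_a = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.1]]
desc_b = [[1.0, 1.1], [0.0, 0.1]]
print(match_mutual_nn(desc_a, desc_b))  # [(0, 1), (1, 0)]
```

This is also what OpenCV's `BFMatcher` does when constructed with `crossCheck=True`.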

Stereo — sequence 'piazza_san_marco'
Method Date Type #kp MS mAP5o mAP10o mAP15o mAP20o mAP25o By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 7783.7 0.1563 0.0000 0.0029 0.0210 0.0660 0.1326 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 6671.6 0.1798 0.0002 0.0032 0.0217 0.0768 0.1584 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 6007.0 0.1804 0.0005 0.0019 0.0210 0.0721 0.1423 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings; each image has at most 8000 keypoints. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 6269.4 0.1740 0.0000 0.0051 0.0317 0.0906 0.1747 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 7888.4 0.1817 0.0000 0.0049 0.0307 0.0953 0.1842 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 6368.5 0.1735 0.0005 0.0051 0.0305 0.0836 0.1793 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 7314.6 0.1821 0.0002 0.0046 0.0263 0.0838 0.1774 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 1024.0 0.1768 0.0000 0.0049 0.0288 0.0865 0.1769 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 2023.0 0.1780 0.0002 0.0029 0.0236 0.0850 0.1803 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 256.0 0.1767 0.0000 0.0049 0.0236 0.0736 0.1606 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 512.0 0.1764 0.0000 0.0051 0.0283 0.0831 0.1728 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 399.4 0.1725 0.0005 0.0046 0.0227 0.0690 0.1372 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 399.4 0.1739 0.0000 0.0056 0.0227 0.0648 0.1289 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 483.8 0.1587 0.0000 0.0039 0.0263 0.0807 0.1489 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 2041.5 0.1735 0.0000 0.0027 0.0239 0.0785 0.1535 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 2041.5 0.1755 0.0000 0.0027 0.0197 0.0743 0.1503 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 2667.1 0.1670 0.0000 0.0034 0.0217 0.0811 0.1672 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (as in RootSIFT) sGLOH2 descriptors, matched with the sGOr2f* strategy. Distance-table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed with greedy nearest-neighbour search, as in the WISW@CAIP2019 contest. Keypoints and matches are ordered by rank (best first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 7657.5 0.1713 0.0007 0.0029 0.0217 0.0743 0.1603 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, Paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, Paper: https://arxiv.org/pdf/1901.09780.pdf. The networks are plugged into the MODS framework, but without view synthesis or matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 7728.7 0.1780 0.0002 0.0037 0.0227 0.0741 0.1608 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity-vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, Paper: https://arxiv.org/pdf/1901.09780.pdf. The networks are plugged into the MODS framework, but without view synthesis or matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 7816.3 0.1673 0.0000 0.0015 0.0158 0.0602 0.1291 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 7807.0 0.1728 0.0000 0.0044 0.0229 0.0797 0.1701 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 7807.0 0.1750 0.0002 0.0044 0.0273 0.0858 0.1718 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 7807.0 0.1759 0.0000 0.0046 0.0246 0.0821 0.1691 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 4511.4 0.1573 0.0005 0.0015 0.0146 0.0526 0.1067 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 4511.4 0.1575 0.0002 0.0015 0.0134 0.0463 0.1023 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 7764.3 0.1902 0.0000 0.0054 0.0266 0.0841 0.1788 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 7764.3 0.1820 0.0000 0.0054 0.0288 0.0838 0.1762 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are densely extracted from full images instead of image patches, with all other settings unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 7764.3 0.2300 0.0024 0.0105 0.0478 0.1187 0.2269 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 7764.3 0.1621 0.0002 0.0034 0.0256 0.0789 0.1577 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 7808.0 0.1635 0.0000 0.0049 0.0239 0.0807 0.1589 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 7808.0 0.1738 0.0002 0.0032 0.0241 0.0841 0.1706 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 7808.0 0.1702 0.0005 0.0049 0.0244 0.0809 0.1679 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 7764.3 0.2280 0.0007 0.0083 0.0404 0.1067 0.2069 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 7807.2 0.1537 0.0000 0.0032 0.0202 0.0653 0.1289 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 7808.0 0.1644 0.0002 0.0039 0.0224 0.0770 0.1547 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 2048.0 0.1738 0.0000 0.0017 0.0161 0.0619 0.1328 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 2003.7 0.1646 0.0000 0.0049 0.0285 0.0872 0.1728 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 2003.7 0.1680 0.0010 0.0063 0.0339 0.0945 0.1745 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 1557.9 0.1680 0.0002 0.0058 0.0295 0.0863 0.1723 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 1024.0 0.1676 0.0002 0.0058 0.0273 0.0831 0.1737 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 2048.0 0.1666 0.0005 0.0051 0.0285 0.0885 0.1742 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 256.0 0.1583 0.0005 0.0054 0.0283 0.0797 0.1608 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 512.0 0.1627 0.0002 0.0039 0.0273 0.0850 0.1715 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 8000.0 0.1618 0.0007 0.0066 0.0314 0.0848 0.1684 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 7807.0 0.1723 0.0005 0.0027 0.0239 0.0821 0.1664 Patrick Ebel We compute descriptors following the same pipeline as our log-polar submissions, but on Cartesian patches (a baseline without the log-polar transformation). Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 1807.8 0.1672 0.0000 0.0041 0.0270 0.0821 0.1713 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 1793.0 0.2083 0.0005 0.0080 0.0363 0.0970 0.1815 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 2003.7 0.2083 0.0002 0.0088 0.0412 0.0992 0.1830 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 7680.7 0.1457 0.0010 0.0051 0.0197 0.0595 0.1284 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 7764.3 0.1950 0.0002 0.0056 0.0312 0.0911 0.1884 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
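The SIFT-AID entries above match their 6272-bit binary descriptors by Hamming distance, accepting a nearest neighbour only when the distance falls below a fixed decision threshold of 4000. A minimal sketch of that threshold-based Hamming matching (illustrative only: the function names are ours, and descriptors are packed as Python ints rather than the authors' actual format):

```python
def hamming(d1: int, d2: int) -> int:
    """Hamming distance between two binary descriptors packed as ints."""
    return bin(d1 ^ d2).count("1")

def threshold_match(descs_a, descs_b, max_dist=4000):
    """Nearest-neighbour matching in Hamming space with a decision threshold.

    For each descriptor in descs_a, find its nearest neighbour in descs_b
    and keep the pair only if the distance is below max_dist (4000 out of
    6272 bits in the SIFT-AID entries). Returns (i, j, distance) triples.
    """
    matches = []
    for i, da in enumerate(descs_a):
        j, d = min(((j, hamming(da, db)) for j, db in enumerate(descs_b)),
                   key=lambda t: t[1])
        if d < max_dist:
            matches.append((i, j, d))
    return matches
```

Binary descriptors with XOR-and-popcount distances like this are much cheaper to compare than float vectors under L2, which is the usual motivation for such large bit strings.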

Stereo — sequence 'reichstag'
Method Date Type #kp MS mAP5o mAP10o mAP15o mAP20o mAP25o By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 7961.9 0.2624 0.0010 0.0067 0.0329 0.0806 0.1588 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 3421.1 0.3793 0.0010 0.0134 0.0491 0.1140 0.1931 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 3391.7 0.3844 0.0005 0.0091 0.0415 0.1011 0.1841 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings, yielding at most 8K keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 4835.5 0.3876 0.0005 0.0134 0.0420 0.0939 0.1769 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 6059.3 0.4055 0.0010 0.0105 0.0434 0.1097 0.1927 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 4958.1 0.3867 0.0024 0.0162 0.0443 0.1011 0.1855 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 6035.7 0.4046 0.0005 0.0119 0.0491 0.0997 0.1850 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 1024.0 0.3924 0.0014 0.0129 0.0324 0.0653 0.1049 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 2048.0 0.3846 0.0014 0.0110 0.0277 0.0539 0.0901 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 256.0 0.3789 0.0005 0.0086 0.0277 0.0491 0.0858 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 512.0 0.3859 0.0019 0.0124 0.0377 0.0668 0.1040 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 269.7 0.3271 0.0005 0.0076 0.0367 0.0763 0.1392 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 269.6 0.3587 0.0019 0.0138 0.0372 0.0720 0.1249 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 478.1 0.3051 0.0019 0.0138 0.0320 0.0620 0.1116 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 2012.6 0.3221 0.0005 0.0105 0.0448 0.0968 0.1631 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 2012.6 0.3312 0.0010 0.0124 0.0448 0.0973 0.1621 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 2245.6 0.3314 0.0014 0.0134 0.0358 0.0925 0.1645 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (as in RootSIFT) sGLOH2 descriptors, matched with the sGOr2f* strategy. Distance-table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed with greedy nearest-neighbour search, as in the WISW@CAIP2019 contest. Keypoints and matches are ordered by rank (best first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 7741.2 0.3447 0.0014 0.0129 0.0548 0.1130 0.1979 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, Paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, Paper: https://arxiv.org/pdf/1901.09780.pdf. The networks are plugged into the MODS framework, but without view synthesis or matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 7793.9 0.3813 0.0014 0.0119 0.0515 0.1211 0.2198 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity-vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, Paper: https://arxiv.org/pdf/1901.09780.pdf. The networks are plugged into the MODS framework, but without view synthesis or matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 7094.3 0.2725 0.0000 0.0095 0.0324 0.0787 0.1474 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 7977.9 0.3547 0.0029 0.0134 0.0453 0.1011 0.1893 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 7977.9 0.3664 0.0010 0.0129 0.0429 0.1063 0.1884 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 7977.9 0.3723 0.0010 0.0129 0.0529 0.1173 0.1941 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 3023.9 0.3308 0.0010 0.0091 0.0291 0.0677 0.1178 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 3023.9 0.3265 0.0019 0.0086 0.0324 0.0639 0.1164 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 7837.6 0.4452 0.0000 0.0119 0.0510 0.1149 0.2084 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 7837.5 0.4163 0.0005 0.0148 0.0558 0.1092 0.1927 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc in which descriptors are densely extracted from full images instead of image patches, while all other settings remain the same as in the original ContextDesc. We find that Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 7837.6 0.5902 0.0029 0.0258 0.1140 0.2475 0.3972 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
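The inlier-classification entries in this table build on 'Learning to Find Good Correspondences' (Yi et al., CVPR 2018), whose key architectural ingredient is context normalization: each feature channel is normalized across the set of putative correspondences, so every per-correspondence feature carries set-level context while the network stays permutation-equivariant. A numpy sketch of that operation (an illustration of the published idea, not the submitters' code):

```python
import numpy as np

def context_normalize(features, eps=1e-8):
    """Normalize features across the correspondence set (Yi et al. 2018).

    features: (n_correspondences, n_channels). Each channel is shifted
    to zero mean and unit variance over the set, injecting global
    context into every per-correspondence feature.
    """
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)
```

In the full network this normalization is interleaved with per-correspondence MLP layers, and the final per-correspondence scores are used both to reject outliers and to weight the fundamental-matrix estimate.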
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 7837.6 0.3057 0.0010 0.0124 0.0448 0.1040 0.1807 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 7977.9 0.3042 0.0005 0.0129 0.0429 0.0963 0.1707 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 7977.9 0.3519 0.0019 0.0157 0.0486 0.1078 0.1845 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 7977.9 0.3389 0.0014 0.0153 0.0491 0.0920 0.1750 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 7837.6 0.5897 0.0029 0.0277 0.1021 0.2365 0.3882 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 7978.0 0.2607 0.0005 0.0091 0.0401 0.0863 0.1545 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
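Most 'match:nn' entries, including the SIFT baseline above, use plain brute-force nearest-neighbour search: every descriptor in the first image is assigned the closest descriptor in the second by L2 distance. A numpy sketch (function name ours):

```python
import numpy as np

def nn_match(desc_a, desc_b):
    """Brute-force nearest-neighbour matching by L2 distance.

    Returns (i, j) pairs: descriptor i of image A matched to its
    closest descriptor j of image B. One-way only -- several i may
    map to the same j, which 1:1 (cross-check) matching prevents.
    """
    # squared distances via ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2
    d2 = (
        (desc_a ** 2).sum(1)[:, None]
        - 2.0 * desc_a @ desc_b.T
        + (desc_b ** 2).sum(1)[None, :]
    )
    nearest = d2.argmin(axis=1)
    return np.stack([np.arange(len(desc_a)), nearest], axis=1)
```

All resulting pairs are passed downstream to robust estimation, so this matcher trades precision for recall compared with ratio-test or 1:1 variants.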
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 7977.9 0.3106 0.0019 0.0153 0.0467 0.0925 0.1674 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 2048.0 0.2757 0.0005 0.0105 0.0334 0.0668 0.1144 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
Superpoint (nn matcher)
kp:2048, match:nn
19-06-07 F 1803.4 0.3515 0.0014 0.0129 0.0439 0.0949 0.1645 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
Superpoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 1803.4 0.3931 0.0010 0.0134 0.0563 0.1221 0.2232 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
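The '1:1 matcher' entry above — and the 'nn1to1' configurations elsewhere in this table — keep only cross-consistent matches: (i, j) survives only if each descriptor is the other's nearest neighbour. A numpy sketch of that mutual check (function name ours):

```python
import numpy as np

def mutual_nn_match(desc_a, desc_b):
    """1:1 (mutual nearest-neighbour) matching by L2 distance.

    Keeps pair (i, j) only when j is i's nearest neighbour in B
    *and* i is j's nearest neighbour in A (cross-match consistency).
    """
    d2 = ((desc_a[:, None, :] - desc_b[None, :, :]) ** 2).sum(-1)
    a_to_b = d2.argmin(axis=1)          # best j for each i
    b_to_a = d2.argmin(axis=0)          # best i for each j
    idx = np.arange(len(desc_a))
    keep = b_to_a[a_to_b] == idx        # mutual agreement
    return np.stack([idx[keep], a_to_b[keep]], axis=1)
```

The resulting match set is strictly a subset of the one-way nearest-neighbour matches, which is consistent with the precision gain the 1:1 rows show over their 'match:nn' counterparts.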
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 1226.4 0.3795 0.0010 0.0186 0.0477 0.0968 0.1698 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
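The SuperPoint entries resize each image so that its largest dimension is at most 1024 pixels before extraction. The target-size computation that wording implies is simply (a sketch; the function name is ours):

```python
def downsample_size(width, height, max_dim=1024):
    """Target size so the largest dimension is at most `max_dim`.

    Images already small enough are left untouched, matching the
    'if necessary, we downsample' wording above.
    """
    largest = max(width, height)
    if largest <= max_dim:
        return width, height
    scale = max_dim / largest
    return round(width * scale), round(height * scale)
```

Keypoint coordinates detected on the resized image must be divided by the same scale to be compared against ground truth at the original resolution.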
SuperPoint
kp:1024, match:nn
19-04-26 F 1024.0 0.3806 0.0019 0.0157 0.0424 0.0920 0.1593 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 2048.0 0.3577 0.0014 0.0124 0.0420 0.0906 0.1669 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 256.0 0.3605 0.0014 0.0124 0.0391 0.0897 0.1459 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 512.0 0.3780 0.0014 0.0138 0.0415 0.0920 0.1722 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 8000.0 0.3281 0.0014 0.0119 0.0410 0.0873 0.1474 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 7977.9 0.3456 0.0024 0.0162 0.0491 0.1121 0.1845 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 1475.9 0.3525 0.0014 0.0143 0.0424 0.0897 0.1574 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 1463.2 0.5845 0.0014 0.0196 0.0796 0.1955 0.3257 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 1803.4 0.5843 0.0000 0.0253 0.0954 0.2117 0.3743 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 7719.0 0.2391 0.0005 0.0091 0.0348 0.0672 0.1307 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 7837.6 0.4705 0.0014 0.0114 0.0529 0.1330 0.2437 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA

Stereo — sequence 'sagrada_familia'
Method Date Type #kp MS mAP5o mAP10o mAP15o mAP20o mAP25o By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 7862.3 0.1433 0.0006 0.0062 0.0139 0.0305 0.0537 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 7516.2 0.1775 0.0006 0.0112 0.0427 0.0898 0.1506 Anonymous We use OpenCV's implementation of AKAZE detector, and for each keypoint we extract a descriptor via CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 5945.2 0.1886 0.0008 0.0056 0.0299 0.0631 0.1073 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings, with at most 8000 keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 7186.2 0.1899 0.0012 0.0133 0.0432 0.0927 0.1523 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 10329.8 0.1992 0.0004 0.0102 0.0398 0.0913 0.1488 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 6964.7 0.1898 0.0008 0.0118 0.0409 0.0844 0.1459 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 7698.3 0.1983 0.0002 0.0112 0.0421 0.0873 0.1525 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 1024.0 0.1673 0.0000 0.0033 0.0151 0.0423 0.0755 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 2041.3 0.1677 0.0006 0.0052 0.0199 0.0492 0.0894 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 256.0 0.1632 0.0000 0.0006 0.0056 0.0120 0.0234 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 512.0 0.1660 0.0000 0.0015 0.0116 0.0257 0.0506 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 313.1 0.1656 0.0002 0.0027 0.0091 0.0230 0.0380 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 313.3 0.1782 0.0000 0.0015 0.0095 0.0212 0.0390 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 500.0 0.1646 0.0006 0.0058 0.0207 0.0432 0.0726 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 2012.8 0.1532 0.0012 0.0087 0.0263 0.0571 0.1004 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 2012.8 0.1531 0.0002 0.0079 0.0295 0.0593 0.1002 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 3588.3 0.1627 0.0006 0.0098 0.0346 0.0732 0.1245 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and square-rooted (as in RootSIFT) sGLOH2 descriptors with the sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed using the greedy nearest neighbour, as in the WISW@CAIP2019 contest. Keypoints and matches are ordered by rank (best first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 7695.2 0.1613 0.0010 0.0124 0.0407 0.0863 0.1373 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, but without view synthesis and matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8000; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 7755.4 0.1765 0.0012 0.0139 0.0384 0.0830 0.1452 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, but without view synthesis and matching: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8000; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 7643.7 0.1420 0.0000 0.0023 0.0098 0.0197 0.0336 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 7893.9 0.1677 0.0006 0.0114 0.0413 0.0828 0.1369 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 7893.9 0.1719 0.0004 0.0110 0.0371 0.0828 0.1400 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 7893.9 0.1741 0.0006 0.0100 0.0394 0.0840 0.1417 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 5943.8 0.1572 0.0000 0.0021 0.0158 0.0388 0.0585 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 5943.8 0.1558 0.0002 0.0037 0.0129 0.0320 0.0521 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 7855.1 0.1970 0.0015 0.0137 0.0463 0.0979 0.1629 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 7855.1 0.1876 0.0012 0.0129 0.0427 0.0880 0.1471 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc in which descriptors are densely extracted from full images instead of image patches, while all other settings remain the same as in the original ContextDesc. We find that Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 7855.1 0.2372 0.0008 0.0201 0.0654 0.1307 0.2199 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 7855.1 0.1557 0.0012 0.0083 0.0346 0.0726 0.1251 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 7896.1 0.1521 0.0008 0.0108 0.0317 0.0718 0.1224 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 7896.1 0.1635 0.0012 0.0129 0.0402 0.0838 0.1427 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 7896.1 0.1595 0.0008 0.0098 0.0386 0.0788 0.1284 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 7855.1 0.2369 0.0019 0.0151 0.0577 0.1299 0.2085 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 7894.1 0.1382 0.0004 0.0044 0.0156 0.0340 0.0583 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 7896.1 0.1515 0.0012 0.0089 0.0266 0.0571 0.0952 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 2048.0 0.1373 0.0002 0.0037 0.0135 0.0259 0.0494 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
Superpoint (nn matcher)
kp:2048, match:nn
19-06-07 F 1988.9 0.1634 0.0012 0.0137 0.0409 0.0803 0.1328 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
Superpoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 1988.9 0.1785 0.0010 0.0149 0.0506 0.0952 0.1575 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 1474.5 0.1708 0.0008 0.0110 0.0355 0.0739 0.1247 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 1024.0 0.1693 0.0006 0.0102 0.0322 0.0672 0.1164 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 2048.0 0.1709 0.0008 0.0106 0.0361 0.0759 0.1297 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 256.0 0.1594 0.0004 0.0050 0.0168 0.0349 0.0581 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 512.0 0.1654 0.0010 0.0058 0.0245 0.0577 0.0942 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 8000.0 0.1706 0.0004 0.0095 0.0340 0.0757 0.1222 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 7893.9 0.1605 0.0012 0.0120 0.0421 0.0853 0.1413 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 1833.9 0.1629 0.0006 0.0102 0.0309 0.0656 0.1154 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 1714.4 0.2270 0.0012 0.0129 0.0442 0.0932 0.1566 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: - retrained on the phototourism training set; - keypoints refinement; - better descriptor sampling; - adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 1988.9 0.2269 0.0008 0.0162 0.0488 0.0985 0.1699 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: - retrained on the phototourism training set, - keypoints refinement, - better descriptor sampling, - adjusted thresholds; A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 7732.5 0.1390 0.0002 0.0027 0.0085 0.0243 0.0407 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 7855.1 0.2059 0.0017 0.0160 0.0554 0.1149 0.1882 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
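The match:nn1to1 entries above enforce cross-match consistency: a correspondence is kept only when the two keypoints are mutually nearest neighbours across the pair. A minimal pure-Python sketch of that rule over toy float descriptors (illustrative only, not the challenge implementation):

```python
# Cross-match consistency ("nn1to1"): keep (i, j) only if j is the nearest
# neighbour of i in image B AND i is the nearest neighbour of j in image A.

def l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nearest(query, pool):
    return min(range(len(pool)), key=lambda k: l2(query, pool[k]))

def mutual_nn_matches(desc_a, desc_b):
    fwd = [nearest(d, desc_b) for d in desc_a]   # A -> B
    bwd = [nearest(d, desc_a) for d in desc_b]   # B -> A
    return [(i, j) for i, j in enumerate(fwd) if bwd[j] == i]

desc_a = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
desc_b = [[0.1, 0.0], [5.1, 5.0]]
print(mutual_nn_matches(desc_a, desc_b))  # -> [(0, 0), (2, 1)]
```

Note how the second descriptor in A is dropped: its forward match is not reciprocated, which is exactly the ambiguity the 1:1 check filters out.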

Stereo — sequence 'st_pauls_cathedral'
Method Date Type #kp MS mAP5o mAP10o mAP15o mAP20o mAP25o By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 7857.2 0.1513 0.0002 0.0023 0.0102 0.0288 0.0648 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 5685.9 0.1725 0.0006 0.0069 0.0253 0.0623 0.1205 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 5171.6 0.1727 0.0000 0.0056 0.0196 0.0562 0.1076 Anonymous We use OpenCV's implementation of the BRISK detector with default settings, keeping at most 8K keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 5454.0 0.1751 0.0006 0.0069 0.0244 0.0610 0.1161 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 7231.5 0.1829 0.0006 0.0100 0.0309 0.0773 0.1425 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 5540.6 0.1744 0.0008 0.0048 0.0215 0.0593 0.1155 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 7004.6 0.1819 0.0006 0.0079 0.0261 0.0721 0.1366 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 1024.0 0.1833 0.0006 0.0042 0.0173 0.0506 0.0994 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 2033.1 0.1823 0.0006 0.0033 0.0167 0.0476 0.0902 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 256.0 0.1809 0.0004 0.0031 0.0134 0.0439 0.0861 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 512.0 0.1835 0.0002 0.0038 0.0173 0.0453 0.0957 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 375.1 0.1583 0.0004 0.0015 0.0050 0.0167 0.0351 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are obtained by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 375.0 0.1705 0.0004 0.0029 0.0098 0.0288 0.0564 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are obtained by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 464.1 0.1517 0.0006 0.0019 0.0092 0.0234 0.0501 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 2034.0 0.1503 0.0006 0.0033 0.0144 0.0397 0.0794 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 2034.0 0.1527 0.0002 0.0040 0.0148 0.0395 0.0777 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 2300.1 0.1557 0.0002 0.0058 0.0180 0.0539 0.1053 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (as in RootSIFT) sGLOH2 descriptors with the sGOr2f* matching strategy. Distance-table entries greater than their respective row-wise and column-wise averages are discarded; matching pairs are then computed with greedy nearest-neighbour assignment, as in the WISW@CAIP2019 contest. Keypoints and matches are ordered by rank (best first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
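The filter-then-greedy strategy described in the HarrisZ/RsGLOH2 entry can be sketched in a few lines. This is a hedged, illustrative reading of the description, assuming the entries discarded are those worse (larger distance) than both their row and column averages; the real sGOr2f* matcher has additional machinery:

```python
# Prune a descriptor-distance table against row/column averages, then assign
# matches greedily, best (smallest) distance first, one match per keypoint.

def filtered_greedy_matches(dist):
    rows, cols = len(dist), len(dist[0])
    row_avg = [sum(r) / cols for r in dist]
    col_avg = [sum(dist[i][j] for i in range(rows)) / rows for j in range(cols)]
    # keep only entries strictly below both their row and column average
    cand = [(dist[i][j], i, j) for i in range(rows) for j in range(cols)
            if dist[i][j] < row_avg[i] and dist[i][j] < col_avg[j]]
    cand.sort()                                   # greedy: best distance first
    used_i, used_j, matches = set(), set(), []
    for _, i, j in cand:
        if i not in used_i and j not in used_j:
            matches.append((i, j))
            used_i.add(i)
            used_j.add(j)
    return matches

dist = [[0.1, 0.9, 0.8],
        [0.7, 0.2, 0.9],
        [0.8, 0.9, 0.3]]
print(filtered_greedy_matches(dist))  # -> [(0, 0), (1, 1), (2, 2)]
```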
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 7657.3 0.1595 0.0013 0.0056 0.0196 0.0499 0.0999 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights: https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. Maximum number of keypoints: 8000; detection done on 2x-upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 7732.6 0.1689 0.0004 0.0061 0.0219 0.0531 0.1028 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity-vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. Maximum number of keypoints: 8000; detection done on 2x-upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 7683.3 0.1406 0.0004 0.0021 0.0079 0.0198 0.0370 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 7868.9 0.1704 0.0004 0.0052 0.0165 0.0501 0.0911 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 7868.9 0.1742 0.0002 0.0046 0.0203 0.0522 0.1030 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 7868.9 0.1760 0.0004 0.0054 0.0180 0.0516 0.1059 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
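The log-polar entries above sample each patch on a grid of log-spaced radii and uniform angles around a DoG keypoint, with the support radius set to lambda/12 times the keypoint's scale. A hedged sketch of such a sampling grid (grid sizes and the minimum radius are assumptions for illustration, not the paper's exact parameters):

```python
import math

# Build a log-polar sampling grid around a keypoint (x, y) with scale sigma.
# Support radius = (lambda / 12) * sigma, as described in the entries above.

def log_polar_grid(x, y, sigma, lam=96.0, n_rho=8, n_theta=16, r_min=1.0):
    r_max = (lam / 12.0) * sigma              # support radius
    grid = []
    for m in range(n_rho):                    # radii uniform in log-space
        rho = r_min * (r_max / r_min) ** (m / (n_rho - 1))
        for k in range(n_theta):              # angles uniform in [0, 2*pi)
            th = 2.0 * math.pi * k / n_theta
            grid.append((x + rho * math.cos(th), y + rho * math.sin(th)))
    return grid

pts = log_polar_grid(100.0, 50.0, sigma=2.0, lam=96.0)
print(len(pts))  # 8 rings * 16 angles = 128 sample locations
```

Because radii grow geometrically, a change of keypoint scale shifts the pattern along the radial axis instead of resampling it, which is what makes the descriptor robust to scale errors.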
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 3924.0 0.1537 0.0002 0.0021 0.0130 0.0326 0.0606 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 3924.0 0.1530 0.0000 0.0021 0.0092 0.0276 0.0560 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
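The SIFT-AID entries above accept a pair when the Hamming distance between its 6272-bit descriptors falls below the decision threshold (4000). A minimal sketch with descriptors held as Python ints (toy 8-bit values and a toy threshold, not the real 6272-bit descriptors):

```python
# Threshold-based matching of binary descriptors by Hamming distance.

def hamming(a, b):
    # number of differing bits between two integer-encoded descriptors
    return bin(a ^ b).count("1")

def aid_style_matches(descs_a, descs_b, threshold=4000):
    return [(i, j)
            for i, da in enumerate(descs_a)
            for j, db in enumerate(descs_b)
            if hamming(da, db) < threshold]

# Toy 8-bit example with a threshold of 3 instead of 4000:
print(aid_style_matches([0b10110010], [0b10110000, 0b01001101], threshold=3))
# -> [(0, 0)]
```

Unlike nearest-neighbour matching, a thresholded test can return several matches per keypoint, or none at all.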
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 7698.4 0.1993 0.0006 0.0046 0.0192 0.0543 0.1057 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 7698.4 0.1851 0.0004 0.0052 0.0196 0.0503 0.1011 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are densely extracted from full images instead of image patches, with all other settings unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, particularly under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 7698.4 0.2546 0.0006 0.0132 0.0549 0.1356 0.2486 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 7698.4 0.1530 0.0004 0.0040 0.0157 0.0470 0.0905 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 7869.2 0.1563 0.0004 0.0046 0.0159 0.0439 0.0963 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 7869.2 0.1689 0.0006 0.0046 0.0201 0.0533 0.1057 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 7869.2 0.1652 0.0002 0.0046 0.0196 0.0508 0.1003 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 7698.4 0.2544 0.0013 0.0127 0.0426 0.1174 0.2185 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 7868.9 0.1447 0.0002 0.0027 0.0113 0.0309 0.0677 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 7869.2 0.1570 0.0006 0.0040 0.0163 0.0449 0.0873 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 2048.0 0.1404 0.0002 0.0021 0.0092 0.0265 0.0522 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 1979.4 0.1608 0.0008 0.0044 0.0194 0.0520 0.0982 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 1979.4 0.1729 0.0006 0.0052 0.0292 0.0735 0.1393 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 1371.5 0.1657 0.0006 0.0058 0.0171 0.0547 0.1019 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
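Several SuperPoint entries above downsample the input so that the largest image dimension is at most 1024 pixels. The resizing rule can be sketched as pure arithmetic (the actual submissions use image libraries for the resampling itself):

```python
# Compute the output size when clamping the largest image dimension,
# preserving aspect ratio; images already within the limit are untouched.

def clamp_largest_dim(width, height, max_dim=1024):
    largest = max(width, height)
    if largest <= max_dim:
        return width, height              # already small enough: keep as-is
    scale = max_dim / largest
    return round(width * scale), round(height * scale)

print(clamp_largest_dim(2048, 1536))  # -> (1024, 768)
print(clamp_largest_dim(800, 600))    # -> (800, 600)
```

Clamping resolution bounds the network's runtime and memory, at the cost of discarding the finest-scale keypoints on very large photos.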
SuperPoint
kp:1024, match:nn
19-04-26 F 1024.0 0.1646 0.0004 0.0048 0.0173 0.0491 0.0961 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 2048.0 0.1633 0.0004 0.0050 0.0182 0.0460 0.0973 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 256.0 0.1530 0.0004 0.0021 0.0104 0.0303 0.0610 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 512.0 0.1597 0.0004 0.0033 0.0171 0.0443 0.0831 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 8000.0 0.1604 0.0006 0.0040 0.0180 0.0491 0.0892 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 7868.9 0.1665 0.0004 0.0056 0.0192 0.0543 0.1019 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use Cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 1688.6 0.1604 0.0008 0.0040 0.0186 0.0529 0.1001 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 1636.0 0.2492 0.0010 0.0125 0.0422 0.1026 0.1964 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: - retrained on the phototourism training set; - keypoints refinement; - better descriptor sampling; - adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 1979.4 0.2488 0.0023 0.0109 0.0399 0.1007 0.1899 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: - retrained on the phototourism training set, - keypoints refinement, - better descriptor sampling, - adjusted thresholds; A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 7714.1 0.1404 0.0004 0.0019 0.0069 0.0217 0.0531 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 7698.4 0.2088 0.0008 0.0084 0.0286 0.0771 0.1437 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
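Most entries in these tables use the default match:nn protocol: plain brute-force nearest-neighbour search, where every descriptor in image A is assigned its closest descriptor in image B, with no cross-check or ratio test. A minimal pure-Python sketch (toy descriptors, illustrative only):

```python
# Brute-force nearest-neighbour matching: one forward match per descriptor.

def brute_force_nn(desc_a, desc_b):
    def l2sq(a, b):
        # squared L2 distance; the minimizer is the same as for plain L2
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [(i, min(range(len(desc_b)), key=lambda j: l2sq(da, desc_b[j])))
            for i, da in enumerate(desc_a)]

desc_a = [[0.0, 1.0], [3.0, 3.0]]
desc_b = [[3.1, 2.9], [0.0, 0.9]]
print(brute_force_nn(desc_a, desc_b))  # -> [(0, 1), (1, 0)]
```

Because every keypoint in A gets a match, the output is noisy by construction; the stereo evaluation then relies on robust geometric estimation to reject the bad correspondences.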

Stereo — sequence 'united_states_capitol'
Method Date Type #kp MS mAP5o mAP10o mAP15o mAP20o mAP25o By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 8000.0 0.2440 0.0025 0.0197 0.0628 0.1460 0.2595 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 3709.7 0.2778 0.0010 0.0108 0.0338 0.0862 0.1631 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 3466.6 0.2864 0.0023 0.0151 0.0514 0.1238 0.2239 Anonymous We use OpenCV's implementation of the BRISK detector with default settings, keeping at most 8K keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 4917.8 0.2625 0.0013 0.0101 0.0366 0.0870 0.1548 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 6729.8 0.2722 0.0003 0.0119 0.0439 0.1036 0.1904 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 4989.5 0.2615 0.0008 0.0124 0.0408 0.0880 0.1578 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 6501.6 0.2716 0.0008 0.0134 0.0449 0.1059 0.1947 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 1024.0 0.3106 0.0003 0.0055 0.0222 0.0502 0.1004 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 2048.0 0.2956 0.0003 0.0076 0.0252 0.0519 0.1001 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 256.0 0.3175 0.0010 0.0088 0.0298 0.0691 0.1359 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 512.0 0.3172 0.0010 0.0083 0.0229 0.0524 0.1064 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 266.9 0.2694 0.0013 0.0096 0.0262 0.0618 0.1215 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are obtained by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 266.8 0.2837 0.0010 0.0068 0.0280 0.0610 0.1193 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are obtained by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 470.5 0.2536 0.0008 0.0093 0.0310 0.0645 0.1223 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 1965.2 0.2493 0.0013 0.0096 0.0386 0.0988 0.1727 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 1965.2 0.2515 0.0015 0.0113 0.0411 0.0920 0.1636 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 2032.7 0.2592 0.0010 0.0119 0.0509 0.1142 0.2057 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (as in RootSIFT) sGLOH2 descriptors with the sGOr2f* matching strategy. Distance-table entries greater than their respective row-wise and column-wise averages are discarded; matching pairs are then computed with greedy nearest-neighbour assignment, as in the WISW@CAIP2019 contest. Keypoints and matches are ordered by rank (best first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 7685.2 0.2562 0.0015 0.0111 0.0338 0.0792 0.1455 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights: https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. Maximum number of keypoints: 8000; detection done on 2x-upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 7757.0 0.2652 0.0010 0.0146 0.0381 0.0872 0.1551 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; a gravity-aligned orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis or matching. Maximum number of keypoints: 8k; detection is performed on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 6953.4 0.2425 0.0008 0.0083 0.0343 0.0797 0.1427 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 8000.0 0.2632 0.0008 0.0111 0.0393 0.0872 0.1621 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 8000.0 0.2669 0.0005 0.0121 0.0378 0.0905 0.1652 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 8000.0 0.2691 0.0008 0.0124 0.0416 0.0983 0.1783 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
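The log-polar parameterization referenced in the three entries above samples the patch on a grid that is geometric in radius and uniform in angle. A minimal sketch of such a sampling grid (purely illustrative; parameter names and the exact grid layout are assumptions, not the authors' code):

```python
import math

def log_polar_grid(center, rmin, rmax, n_rho, n_theta):
    """Sampling locations for a log-polar patch: radii spaced
    geometrically between rmin and rmax, angles spaced uniformly.
    Requires n_rho >= 2."""
    cx, cy = center
    grid = []
    for i in range(n_rho):
        # geometric progression of radii from rmin to rmax
        r = rmin * (rmax / rmin) ** (i / (n_rho - 1))
        for j in range(n_theta):
            t = 2 * math.pi * j / n_theta
            grid.append((cx + r * math.cos(t), cy + r * math.sin(t)))
    return grid

pts = log_polar_grid((0.0, 0.0), 1.0, 8.0, 4, 8)  # 32 sample points
```

Sampling the image at such a grid makes equal descriptor bins cover exponentially larger regions away from the keypoint, which is what gives the descriptor its tolerance to scale changes.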
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 2709.8 0.2629 0.0010 0.0129 0.0305 0.0721 0.1314 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
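One way to implement the thresholded Hamming matching that SIFT-AID describes is to accept the nearest neighbour only when its distance falls below the decision threshold. A toy sketch (not the SIFT-AID code; descriptors are packed into Python ints, and the threshold here is scaled down for the tiny example):

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary descriptors packed as ints."""
    return bin(a ^ b).count("1")

def match(desc1, desc2, threshold=4000):
    """For each descriptor in desc1, keep its nearest desc2 entry
    only if the Hamming distance is below the decision threshold."""
    matches = []
    for i, d1 in enumerate(desc1):
        j, dist = min(((j, hamming(d1, d2)) for j, d2 in enumerate(desc2)),
                      key=lambda t: t[1])
        if dist < threshold:
            matches.append((i, j))
    return matches

print(match([0b1010], [0b1000, 0b0101], threshold=3))
```

With 6272-bit descriptors a production implementation would pack bits into machine words and use popcount, but the decision rule is the same.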
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 2709.8 0.2645 0.0005 0.0083 0.0313 0.0782 0.1437 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 7363.6 0.3099 0.0010 0.0119 0.0383 0.0986 0.1793 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 7363.5 0.2925 0.0010 0.0116 0.0358 0.0817 0.1520 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc in which descriptors are densely extracted from full images instead of image patches, with all other settings unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted with the code provided by the authors. The model will be made available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 7363.6 0.3932 0.0008 0.0116 0.0373 0.0976 0.1828 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 7363.6 0.2443 0.0013 0.0144 0.0512 0.1067 0.1858 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 8000.0 0.2469 0.0018 0.0154 0.0459 0.1016 0.1762 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 8000.0 0.2602 0.0010 0.0116 0.0406 0.0870 0.1652 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 8000.0 0.2576 0.0005 0.0131 0.0429 0.0946 0.1654 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 7363.6 0.3903 0.0013 0.0101 0.0295 0.0751 0.1455 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 8000.0 0.2346 0.0005 0.0146 0.0504 0.1109 0.1924 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 8000.0 0.2485 0.0005 0.0119 0.0381 0.0875 0.1641 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 2048.0 0.2357 0.0005 0.0088 0.0310 0.0814 0.1523 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 1808.9 0.2642 0.0015 0.0119 0.0353 0.0817 0.1541 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 1808.9 0.2738 0.0010 0.0116 0.0398 0.1046 0.1942 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 1159.0 0.2718 0.0013 0.0093 0.0305 0.0714 0.1369 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
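The SuperPoint entries that follow all preprocess images so that the largest dimension is at most 1024 pixels. A small helper for that resizing rule (an assumed utility, not the organizers' code):

```python
def target_size(w: int, h: int, max_dim: int = 1024):
    """Downsample so the largest image dimension is at most max_dim,
    preserving aspect ratio; images already small enough are untouched."""
    s = min(1.0, max_dim / max(w, h))
    return round(w * s), round(h * s)

print(target_size(2048, 1536))  # (1024, 768)
print(target_size(800, 600))   # (800, 600) -- never upsampled
```

Only downsampling is applied (the scale factor is clamped to 1.0), so small images keep their native resolution.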
SuperPoint
kp:1024, match:nn
19-04-26 F 1024.0 0.2726 0.0015 0.0098 0.0300 0.0688 0.1273 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 2048.0 0.2598 0.0020 0.0106 0.0318 0.0661 0.1266 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 256.0 0.2737 0.0018 0.0113 0.0366 0.0840 0.1357 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 512.0 0.2779 0.0010 0.0088 0.0358 0.0754 0.1359 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 8000.0 0.2448 0.0005 0.0103 0.0272 0.0603 0.1104 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 8000.0 0.2584 0.0013 0.0129 0.0383 0.0890 0.1692 Patrick Ebel We compute scale-invariant descriptors; this entry is a baseline that uses Cartesian patches instead of the log-polar transformation. Keypoints are DoG, with a scaling factor of lambda/12 over the chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 1440.8 0.2671 0.0015 0.0134 0.0330 0.0804 0.1536 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 1442.2 0.3892 0.0008 0.0081 0.0361 0.0890 0.1755 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 1808.9 0.3861 0.0013 0.0086 0.0333 0.0888 0.1689 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 7945.9 0.2294 0.0018 0.0113 0.0361 0.0900 0.1659 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 7363.6 0.3179 0.0010 0.0149 0.0492 0.1109 0.1979 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
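The cross-match consistency (nn1to1) used in this entry keeps a pair only when the two descriptors are each other's nearest neighbour. A pure-Python sketch of that symmetric check (illustrative only, not the evaluation code):

```python
def mutual_nn_matches(d1, d2):
    """Cross-checked matching: keep (i, j) only if d1[i] and d2[j]
    are each other's nearest neighbour under squared L2 distance."""
    def nn(a, B):
        # index of the descriptor in B closest to a
        return min(range(len(B)),
                   key=lambda j: sum((x - y) ** 2 for x, y in zip(a, B[j])))
    fwd = [nn(a, d2) for a in d1]   # nearest d2 index for each d1
    bwd = [nn(b, d1) for b in d2]   # nearest d1 index for each d2
    return [(i, j) for i, j in enumerate(fwd) if bwd[j] == i]

d1 = [(0.0, 0.0), (1.0, 1.0)]
d2 = [(1.0, 1.0), (0.0, 0.0)]
print(mutual_nn_matches(d1, d2))
```

Compared with plain one-way NN matching, the cross-check discards asymmetric pairs, trading some recall for fewer outliers before geometric verification.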