Breakdown: Phototourism | MVS | Sequences

Per-sequence breakdown of results on the Phototourism dataset, multi-view stereo (MVS) task.
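The mAP columns below report mean average precision of the recovered relative poses at a given angular error threshold (e.g. mAP15o = 15 degrees). As a rough illustration only — the exact protocol is defined by the challenge organizers — a pose is commonly counted as correct when both its rotation error and its translation-direction error fall under the threshold:

```python
def pose_accuracy(rot_errs_deg, trans_errs_deg, threshold_deg):
    """Fraction of image pairs whose recovered pose is within the angular threshold.

    rot_errs_deg / trans_errs_deg: per-pair angular errors in degrees.
    A pair counts as correct only if BOTH errors are below the threshold.
    (Illustrative helper, not the official challenge scoring code.)
    """
    pairs = list(zip(rot_errs_deg, trans_errs_deg))
    correct = sum(1 for r, t in pairs if r <= threshold_deg and t <= threshold_deg)
    return correct / len(pairs)
```

For instance, with rotation errors [5, 20, 10] and translation errors [3, 2, 30] degrees, only the first pair passes a 15-degree threshold.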



MVS — All sequences — Sorted by mAP15o
Method BM FCS LMS LB MC MR PSM RS SF SPC USC AVG Date Type By Details Link Contact Updated Descriptor size
(Per-sequence columns give mAP15o; sequence abbreviations: BM = british_museum, FCS = florence_cathedral_side, LMS = lincoln_memorial_statue, LB = london_bridge, MC = milan_cathedral, MR = mount_rushmore, PSM = piazza_san_marco, RS = reichstag, SF = sagrada_familia, SPC = st_pauls_cathedral, USC = united_states_capitol; AVG = average over sequences.)
AKAZE (OpenCV)
kp:8000, match:nn
0.3576 0.4385 0.5551 0.3696 0.5935 0.3131 0.2543 0.4708 0.5709 0.6054 0.1847 0.4285 19-04-24 F Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
0.3772 0.5865 0.6595 0.3789 0.6131 0.3458 0.4122 0.4310 0.7381 0.6481 0.1606 0.4865 19-04-26 F Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
0.0844 0.3523 0.1150 0.3040 0.3429 0.1207 0.2678 0.3511 0.4466 0.4849 0.0936 0.2694 19-05-14 F Anonymous We use OpenCV's implementation of the BRISK detector with default settings; each image has at most 8K keypoints. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
0.2518 0.4914 0.6187 0.4915 0.4482 0.3186 0.3673 0.3340 0.5189 0.5070 0.1321 0.4072 19-05-07 F Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
0.1904 0.4486 0.6308 0.4622 0.4263 0.3543 0.3746 0.3690 0.5083 0.5044 0.1323 0.4001 19-05-07 F Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
0.2522 0.4901 0.6258 0.4932 0.4486 0.3217 0.3760 0.3284 0.5127 0.5209 0.1431 0.4102 19-06-01 F Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
0.1962 0.4584 0.6617 0.4545 0.4435 0.3182 0.3976 0.3394 0.4576 0.5162 0.1208 0.3967 19-06-05 F Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
0.0858 0.2017 0.2292 0.1254 0.1748 0.0330 0.0476 0.1275 0.0621 0.2125 0.0602 0.1236 19-05-05 F Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
0.1014 0.2859 0.2730 0.1300 0.2152 0.0592 0.1378 0.1566 0.1365 0.2347 0.0615 0.1629 19-05-05 F Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
0.0001 0.0019 0.0148 0.0195 0.0075 0.0010 0.0000 0.0223 0.0000 0.0005 0.0112 0.0072 19-05-05 F Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
0.0409 0.0558 0.0850 0.0988 0.0773 0.0140 0.0010 0.0895 0.0018 0.1425 0.0375 0.0585 19-05-05 F Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
0.0449 0.0499 0.1188 0.1584 0.0372 0.0190 0.0010 0.1209 0.0985 0.0742 0.0002 0.0657 19-05-07 F Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are computed with the interpolation of the VGG pool3 feature map on the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
0.0499 0.0654 0.1666 0.2027 0.0645 0.0207 0.0024 0.1424 0.0889 0.1297 0.0009 0.0849 19-05-09 F Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are computed with the interpolation of the VGG pool3 feature map on the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
0.0933 0.1873 0.2702 0.2906 0.2627 0.0799 0.0148 0.2254 0.2236 0.1925 0.0055 0.1678 19-04-26 F Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
0.2347 0.5479 0.6881 0.2262 0.4348 0.2438 0.2070 0.3115 0.6402 0.5461 0.1021 0.3802 19-05-19 F Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
0.2313 0.5550 0.6536 0.2170 0.4446 0.2620 0.1933 0.3006 0.6428 0.5339 0.1218 0.3778 19-05-19 F Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
0.3285 0.6044 0.7392 0.5208 0.6511 0.4007 0.3091 0.4042 0.7566 0.6161 0.2137 0.5040 19-05-23 F Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (as in RootSIFT) sGLOH2 descriptors using the sGOr2f* matching strategy. Distance-table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
0.4287 0.6316 0.7908 0.4149 0.6392 0.3792 0.4449 0.4937 0.7711 0.6565 0.1619 0.5284 19-05-29 F Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, but without view synthesis and matching: https://github.com/ducha-aiki/mods-light-zmq. Max. number of keypoints to detect: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
0.4294 0.5902 0.7407 0.3967 0.6356 0.4294 0.4477 0.5004 0.5997 0.6336 0.1572 0.5055 19-05-30 F Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity-vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, but without view synthesis and matching: https://github.com/ducha-aiki/mods-light-zmq. Max. number of keypoints to detect: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
0.1587 0.3061 0.3016 0.1642 0.2027 0.1402 0.1302 0.3006 0.3802 0.3913 0.0613 0.2307 19-04-24 F Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
0.4094 0.6191 0.7261 0.5028 0.7047 0.5206 0.4309 0.4283 0.7671 0.6324 0.1863 0.5389 19-06-25 F Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
0.4028 0.5943 0.7361 0.5324 0.7128 0.4911 0.4473 0.4141 0.7515 0.6505 0.2032 0.5396 19-06-24 F Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
0.3993 0.6064 0.7317 0.5410 0.7159 0.5201 0.4580 0.4270 0.7335 0.6406 0.1957 0.5427 19-06-20 F Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
0.1532 0.3674 0.2183 0.2667 0.4313 0.1428 0.1834 0.3325 0.4663 0.3444 0.0502 0.2688 19-05-10 F Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
0.1351 0.3428 0.1588 0.2417 0.4441 0.1367 0.1903 0.3177 0.4562 0.3357 0.0491 0.2553 19-04-29 F/M Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
0.3339 0.6322 0.7204 0.5712 0.7115 0.4788 0.4852 0.4540 0.7826 0.5961 0.1735 0.5399 19-05-09 F Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
0.3768 0.5872 0.7283 0.5217 0.6756 0.4511 0.4682 0.4026 0.7268 0.5668 0.1371 0.5129 19-05-28 F Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc in which descriptors are densely extracted from full images instead of image patches, with all other settings unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
0.6071 0.8317 0.8542 0.8174 0.8809 0.6305 0.6854 0.8043 0.9044 0.8776 0.2345 0.7389 19-05-28 F/M Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on Yi et al. CVPR2018. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
0.3375 0.6042 0.6834 0.4911 0.7246 0.4662 0.4772 0.4788 0.7983 0.6165 0.1497 0.5298 19-05-08 F Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
0.4169 0.6473 0.6744 0.4696 0.7138 0.4907 0.4137 0.4632 0.7740 0.6234 0.1616 0.5317 19-04-24 F Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
0.4090 0.6438 0.7726 0.4702 0.6945 0.5106 0.4612 0.4477 0.8030 0.6376 0.1794 0.5481 19-04-24 F Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
0.3292 0.5935 0.7109 0.4380 0.7093 0.4667 0.4000 0.4192 0.7682 0.5832 0.1778 0.5087 19-04-24 F Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
0.5711 0.8059 0.8677 0.7937 0.8609 0.6222 0.6156 0.7760 0.8970 0.8567 0.2190 0.7169 19-05-29 F/M Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of Yi et al. CVPR2018. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
0.2714 0.5530 0.4768 0.3859 0.6433 0.3181 0.2797 0.4216 0.6002 0.4979 0.1131 0.4146 19-04-24 F Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
0.3304 0.5315 0.6238 0.4131 0.6819 0.4022 0.3557 0.3929 0.6857 0.5503 0.1402 0.4643 19-04-24 F Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
0.1458 0.4113 0.4038 0.1747 0.2606 0.1601 0.0645 0.1915 0.4469 0.3590 0.0652 0.2439 19-05-17 F Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
Superpoint (nn matcher)
kp:2048, match:nn
0.3344 0.5417 0.7106 0.5541 0.6105 0.3891 0.2737 0.3591 0.7074 0.6131 0.1618 0.4778 19-06-07 F Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
Superpoint (1:1 matcher)
kp:2048, match:nn1to1
0.4071 0.6746 0.7953 0.6602 0.6507 0.4309 0.3033 0.4084 0.7685 0.7030 0.1814 0.5440 19-06-07 F Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
0.3763 0.3927 0.6357 0.5936 0.5046 0.3569 0.3163 0.3722 0.4853 0.5128 0.1470 0.4267 19-04-24 F Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
0.3752 0.3704 0.6258 0.6135 0.4946 0.3670 0.2659 0.3370 0.4623 0.5174 0.1302 0.4145 19-04-26 F Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
0.3557 0.3989 0.5727 0.5711 0.5072 0.3859 0.3269 0.3598 0.4956 0.5199 0.1501 0.4222 19-04-26 F Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
0.1089 0.0256 0.2964 0.3699 0.0160 0.1125 0.0002 0.1210 0.2019 0.1638 0.0016 0.1289 19-04-26 F Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
0.3242 0.1806 0.5745 0.5669 0.3108 0.2704 0.0990 0.2769 0.3634 0.4048 0.0585 0.3118 19-04-26 F Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
0.3283 0.3907 0.5753 0.4620 0.4401 0.3203 0.3695 0.3651 0.4828 0.4707 0.1208 0.3932 19-04-26 F Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
0.3459 0.6743 0.7612 0.3762 0.7104 0.4867 0.3722 0.3781 0.8166 0.6483 0.1591 0.5208 19-07-29 F Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use Cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
0.4057 0.5237 0.6628 0.5740 0.5633 0.3562 0.3349 0.3850 0.6564 0.5875 0.1590 0.4735 19-05-30 F Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
0.7052 0.6671 0.6918 0.8131 0.7382 0.5033 0.3152 0.6750 0.7949 0.7887 0.2596 0.6320 19-05-30 F/M Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
0.6989 0.6873 0.6997 0.8126 0.7504 0.5183 0.3557 0.7210 0.7942 0.8112 0.2540 0.6458 19-05-28 F/M Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
0.1458 0.3515 0.3862 0.2207 0.4503 0.2126 0.2072 0.3438 0.4977 0.4159 0.0760 0.3007 19-04-24 F Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
0.3075 0.7850 0.8314 0.6763 0.7492 0.5393 0.5014 0.4667 0.8463 0.7185 0.1965 0.6017 19-06-07 F Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
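The match column above distinguishes plain nearest-neighbour matching ('nn') from cross-checked one-to-one matching ('nn1to1'), where a match is kept only if the two keypoints are mutual nearest neighbours. A minimal NumPy sketch of both strategies (illustrative helper names, not the organizers' code):

```python
import numpy as np

def match_nn(desc_a, desc_b):
    """Plain 'nn': for every descriptor in A, index of its nearest neighbour in B."""
    # Pairwise squared Euclidean distances, shape (len(A), len(B)).
    d = ((desc_a[:, None, :] - desc_b[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

def match_nn_1to1(desc_a, desc_b):
    """'nn1to1' / cross-check: keep (i, j) only when j is i's nearest neighbour
    in B AND i is j's nearest neighbour in A (mutual nearest neighbours)."""
    d = ((desc_a[:, None, :] - desc_b[None, :, :]) ** 2).sum(-1)
    a_to_b = d.argmin(axis=1)  # best match in B for each descriptor of A
    b_to_a = d.argmin(axis=0)  # best match in A for each descriptor of B
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]
```

The cross-check discards one-sided matches (e.g. from repeated texture), which typically trades a little recall for noticeably higher precision, consistent with the 'nn1to1' entries scoring above their 'nn' counterparts.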


Results for individual sequences:


MVS — sequence 'british_museum'
Method Date Type Ims (%) #Pts SR TL mAP5o mAP10o mAP15o mAP20o mAP25o ATE By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 99.1 4960.4 99.5 3.76 0.1442 0.2674 0.3576 0.4264 0.4881 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 99.4 3516.7 99.2 3.30 0.1580 0.2773 0.3772 0.4609 0.5263 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 98.9 2095.0 98.0 3.06 0.0102 0.0410 0.0844 0.1457 0.2142 Anonymous We use OpenCV's implementation of the BRISK detector with default settings; each image has at most 8K keypoints. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 95.8 3457.5 94.5 2.89 0.0700 0.1689 0.2518 0.3336 0.4026 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 96.3 3622.8 95.5 2.85 0.0386 0.1086 0.1904 0.2725 0.3439 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 95.8 3445.4 94.8 2.88 0.0663 0.1633 0.2522 0.3313 0.4039 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 95.9 3611.2 95.0 2.83 0.0336 0.1068 0.1962 0.2832 0.3553 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 91.7 804.0 89.5 2.33 0.0149 0.0464 0.0858 0.1385 0.1972 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 94.2 1517.5 92.2 2.29 0.0165 0.0519 0.1014 0.1625 0.2240 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 25.9 55.0 7.0 1.76 0.0000 0.0001 0.0001 0.0002 0.0004 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 79.2 351.5 70.0 2.30 0.0044 0.0191 0.0409 0.0704 0.1062 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 82.1 277.2 66.8 2.73 0.0055 0.0210 0.0449 0.0785 0.1183 Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are computed with the interpolation of the VGG pool3 feature map on the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 81.4 267.9 73.8 2.82 0.0077 0.0258 0.0499 0.0899 0.1269 Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are computed with the interpolation of the VGG pool3 feature map on the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 87.6 294.0 70.0 2.90 0.0169 0.0540 0.0933 0.1398 0.1870 Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 99.0 1768.9 99.5 3.31 0.0849 0.1626 0.2347 0.3106 0.3816 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 99.1 2019.6 99.8 3.19 0.0778 0.1531 0.2313 0.3113 0.3792 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 99.4 1230.2 99.0 3.31 0.1306 0.2388 0.3285 0.4143 0.4816 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (as in RootSIFT) sGLOH2 descriptors using the sGOr2f* matching strategy. Distance-table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 99.5 5718.4 100.0 3.60 0.1898 0.3316 0.4287 0.5047 0.5667 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework, but without view synthesis and matching: https://github.com/ducha-aiki/mods-light-zmq. Max. number of keypoints to detect: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 97.5 6175.5 96.0 3.63 0.2179 0.3460 0.4294 0.4954 0.5531 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. Maximum number of keypoints: 8000; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 97.5 3255.1 94.0 3.39 0.0334 0.0898 0.1587 0.2268 0.2955 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 99.5 7420.1 99.2 3.35 0.2044 0.3255 0.4094 0.4839 0.5486 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 99.6 7694.9 99.8 3.32 0.1978 0.3176 0.4028 0.4793 0.5407 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 99.4 7787.7 99.2 3.31 0.1911 0.3166 0.3993 0.4709 0.5348 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 97.6 1898.7 92.0 3.02 0.0354 0.0877 0.1532 0.2208 0.2907 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. The matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 96.4 1816.7 92.8 2.95 0.0275 0.0754 0.1351 0.1993 0.2716 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. The matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 99.3 6043.3 99.8 2.92 0.1531 0.2511 0.3339 0.4077 0.4683 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 98.6 5381.4 97.2 2.97 0.1763 0.2915 0.3768 0.4434 0.5078 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are densely extracted from full images instead of image patches, while all other settings remain the same as in the original ContextDesc. We find that Dense-ContextDesc performs better in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 99.3 6167.1 98.0 3.12 0.3613 0.5157 0.6071 0.6787 0.7271 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 99.4 4407.3 99.2 3.14 0.1621 0.2620 0.3375 0.4073 0.4742 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 99.5 5722.5 100.0 3.45 0.2066 0.3316 0.4169 0.4933 0.5506 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 99.6 6837.1 100.0 3.38 0.2024 0.3305 0.4090 0.4957 0.5559 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 99.6 6314.7 100.0 3.22 0.1492 0.2527 0.3292 0.4030 0.4699 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 99.5 6170.7 99.8 3.08 0.3315 0.4759 0.5711 0.6451 0.7017 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 98.1 4336.2 97.2 3.25 0.1122 0.2007 0.2714 0.3424 0.4135 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 99.1 5718.0 99.8 3.20 0.1452 0.2462 0.3304 0.4033 0.4724 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 97.9 1573.7 95.5 3.48 0.0411 0.0924 0.1458 0.2164 0.2877 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 99.2 1563.9 97.5 3.36 0.1320 0.2481 0.3344 0.4009 0.4741 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 99.4 1479.1 98.5 3.50 0.1942 0.3205 0.4071 0.4807 0.5482 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 95.6 1003.0 93.5 3.33 0.1745 0.2897 0.3763 0.4470 0.5024 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 95.9 915.3 93.5 3.28 0.1755 0.2912 0.3752 0.4456 0.4965 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 95.2 1634.1 94.8 3.26 0.1567 0.2703 0.3557 0.4228 0.4766 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 75.9 173.6 47.5 2.92 0.0402 0.0793 0.1089 0.1323 0.1505 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 93.2 457.5 87.5 3.14 0.1349 0.2415 0.3242 0.3830 0.4355 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 96.2 5766.0 94.5 3.25 0.1269 0.2422 0.3283 0.3987 0.4625 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 99.4 7869.6 99.8 3.33 0.1557 0.2578 0.3459 0.4145 0.4826 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline that uses Cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on COCO + phototourism training set)
kp:2048, match:nn
19-05-30 F 98.8 1100.9 97.2 3.50 0.1885 0.3207 0.4057 0.4768 0.5428 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 99.3 1092.9 98.5 3.81 0.4603 0.6270 0.7052 0.7533 0.7989 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 99.3 1258.2 99.0 3.89 0.4530 0.6140 0.6989 0.7556 0.8001 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 97.8 3199.3 94.2 2.89 0.0386 0.0920 0.1458 0.2112 0.2830 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 99.5 6693.3 100.0 2.97 0.1253 0.2196 0.3075 0.3843 0.4521 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
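Many of the entries above differ only in the matcher: 'match:nn' is plain brute-force nearest-neighbour search, while 'match:nn1to1' additionally enforces cross-match consistency, keeping a pair only when the two descriptors are mutually nearest. The sketch below illustrates both strategies in NumPy on synthetic descriptors; the function names and test data are illustrative, not taken from any submission.

```python
import numpy as np

def match_nn(desc1, desc2):
    """Brute-force nearest-neighbour matching ('match:nn'):
    each descriptor in desc1 is paired with its closest descriptor in desc2."""
    # Pairwise L2 distances, shape (n1, n2).
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    return [(i, int(np.argmin(d[i]))) for i in range(len(desc1))]

def match_nn_1to1(desc1, desc2):
    """Nearest neighbour with cross-match consistency ('match:nn1to1'):
    keep (i, j) only when i and j are each other's nearest neighbour."""
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    nn12 = np.argmin(d, axis=1)  # best j for each i
    nn21 = np.argmin(d, axis=0)  # best i for each j
    return [(i, int(j)) for i, j in enumerate(nn12) if nn21[j] == i]

# Synthetic data: desc1 holds noisy copies of the first 80 rows of desc2.
rng = np.random.default_rng(0)
desc2 = rng.standard_normal((100, 128)).astype(np.float32)
desc1 = desc2[:80] + 0.01 * rng.standard_normal((80, 128)).astype(np.float32)

nn = match_nn(desc1, desc2)
one_to_one = match_nn_1to1(desc1, desc2)
assert len(one_to_one) <= len(nn)          # cross-checking only discards pairs
assert all(i == j for i, j in one_to_one)  # noisy copies find their source
```

Cross-checking can only remove candidate pairs, which is why the 'nn1to1' entries trade match count for reliability relative to their plain 'nn' counterparts.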

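The SIFT-AID entries describe matching 6272-bit binary descriptors by Hamming distance with a fixed decision threshold (4000). Below is a minimal sketch of that decision rule on packed random bit-strings; the CNN that produces the real descriptors is not reproduced, and the data is synthetic.

```python
import numpy as np

def hamming_match(query, database, threshold):
    """For each query descriptor, find the database descriptor at minimum
    Hamming distance and keep the pair if the distance falls below the
    decision threshold (4000 in the SIFT-AID submissions)."""
    matches = []
    for i in range(len(query)):
        # XOR then popcount gives the Hamming distance to every candidate.
        dists = np.unpackbits(query[i] ^ database, axis=1).sum(axis=1)
        j = int(np.argmin(dists))
        if dists[j] < threshold:
            matches.append((i, j, int(dists[j])))
    return matches

rng = np.random.default_rng(1)
n_bytes = 6272 // 8  # 6272-bit descriptors, packed into uint8
database = rng.integers(0, 256, size=(50, n_bytes), dtype=np.uint8)
# Queries: the first 30 database descriptors with ~1% of bytes corrupted.
flip = ((rng.random((30, n_bytes)) < 0.01) * 0xFF).astype(np.uint8)
query = database[:30] ^ flip

matches = hamming_match(query, database, threshold=4000)
assert all(i == j for i, j, _ in matches)  # each query finds its source
```

With real descriptors the threshold matters because non-matching pairs are not uniformly random bits; here it only mirrors the decision rule the entries describe.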
MVS — sequence 'florence_cathedral_side'
Method Date Type Ims (%) #Pts SR TL mAP5o mAP10o mAP15o mAP20o mAP25o ATE By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 92.3 4821.8 81.7 3.06 0.3638 0.4103 0.4385 0.4615 0.4841 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 97.8 7748.9 95.2 3.16 0.4741 0.5381 0.5865 0.6261 0.6657 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 93.7 3959.1 87.2 2.60 0.2233 0.2980 0.3523 0.4039 0.4457 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings; each image has at most 8000 keypoints. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 92.5 4421.4 91.5 3.06 0.3081 0.4217 0.4914 0.5396 0.5713 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 93.0 6092.1 93.2 2.89 0.2586 0.3721 0.4486 0.5162 0.5684 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 91.7 4463.2 92.8 3.06 0.3054 0.4182 0.4901 0.5333 0.5709 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 92.3 5775.1 92.2 2.88 0.2732 0.3889 0.4584 0.5096 0.5530 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 85.3 588.5 74.0 2.42 0.0332 0.1237 0.2017 0.2653 0.3064 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 89.3 1200.4 81.2 2.41 0.0868 0.2067 0.2859 0.3434 0.3916 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 55.7 76.4 22.8 2.26 0.0000 0.0005 0.0019 0.0044 0.0074 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 72.9 239.8 57.2 2.36 0.0047 0.0225 0.0558 0.0939 0.1246 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 37.4 179.8 32.8 1.87 0.0302 0.0436 0.0499 0.0542 0.0581 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 64.3 194.4 40.2 2.28 0.0324 0.0521 0.0654 0.0736 0.0813 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 75.3 309.9 47.2 2.82 0.1363 0.1706 0.1873 0.1945 0.2006 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 95.3 1940.1 87.2 3.14 0.4212 0.5014 0.5479 0.5849 0.6129 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 95.6 2123.3 88.5 3.12 0.4293 0.5089 0.5550 0.5946 0.6318 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 97.5 2091.2 91.5 3.20 0.4790 0.5510 0.6044 0.6469 0.6828 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and Root squared (like RootSIFT) sGLOH2 descriptors using sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded and then matching pairs are computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 97.7 6265.0 94.8 3.44 0.5418 0.5993 0.6316 0.6680 0.7056 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights: https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. Maximum number of keypoints: 8000; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 94.6 6460.8 92.2 3.53 0.4991 0.5524 0.5902 0.6203 0.6526 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. Maximum number of keypoints: 8000; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 87.7 3858.0 71.0 2.64 0.2375 0.2792 0.3061 0.3266 0.3509 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 98.4 7558.2 95.5 3.37 0.4973 0.5718 0.6191 0.6604 0.6953 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 98.0 7589.5 94.8 3.34 0.4923 0.5526 0.5943 0.6297 0.6652 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 98.8 7655.2 94.8 3.33 0.5094 0.5654 0.6064 0.6459 0.6754 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 89.4 4132.4 78.0 2.76 0.2820 0.3401 0.3674 0.3938 0.4214 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. The matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 87.3 3781.1 74.8 2.66 0.2627 0.3147 0.3428 0.3638 0.3850 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. The matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 98.5 7259.7 96.0 3.26 0.5283 0.5921 0.6322 0.6655 0.6918 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 96.3 5673.6 93.0 3.27 0.4892 0.5437 0.5872 0.6208 0.6492 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are densely extracted from full images instead of image patches, while all other settings remain the same as in the original ContextDesc. We find that Dense-ContextDesc performs better in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 99.7 7276.0 96.8 3.56 0.7187 0.7936 0.8317 0.8501 0.8687 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 98.0 6651.5 95.2 3.32 0.5159 0.5732 0.6042 0.6374 0.6741 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 98.3 7094.6 95.8 3.50 0.5467 0.6084 0.6473 0.6791 0.7146 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 98.8 7774.4 96.8 3.46 0.5471 0.6084 0.6438 0.6724 0.7028 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 98.0 7250.3 96.5 3.33 0.5012 0.5587 0.5935 0.6245 0.6627 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 99.1 7224.6 97.2 3.53 0.7153 0.7766 0.8059 0.8325 0.8581 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 96.7 6040.3 92.5 3.31 0.4774 0.5243 0.5530 0.5861 0.6184 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 97.6 6488.0 95.0 3.26 0.4359 0.4882 0.5315 0.5723 0.6088 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 89.0 1666.1 74.5 2.97 0.3189 0.3770 0.4113 0.4356 0.4592 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 96.0 1862.0 87.0 3.53 0.4306 0.5019 0.5417 0.5752 0.6014 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
Superpoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 96.8 1880.9 90.8 3.70 0.5507 0.6319 0.6746 0.7049 0.7246 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 88.7 1320.5 80.5 3.28 0.3088 0.3606 0.3927 0.4212 0.4471 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 85.5 850.7 74.5 3.13 0.2989 0.3428 0.3704 0.3905 0.4057 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 89.5 1528.8 82.8 3.29 0.3072 0.3622 0.3989 0.4326 0.4630 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 17.9 81.3 20.5 1.31 0.0209 0.0239 0.0256 0.0267 0.0280 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 73.0 355.0 55.2 2.75 0.1514 0.1694 0.1806 0.1922 0.2005 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 91.0 5043.3 89.2 3.10 0.2632 0.3433 0.3907 0.4282 0.4653 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 98.9 8281.7 97.0 3.49 0.5546 0.6281 0.6743 0.7062 0.7327 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use Cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 94.7 1537.1 84.5 3.38 0.4163 0.4815 0.5237 0.5539 0.5838 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 97.5 1587.8 90.2 3.73 0.5526 0.6232 0.6670 0.6939 0.7185 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 97.4 1657.0 92.8 3.80 0.5767 0.6521 0.6873 0.7179 0.7447 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 88.6 3721.2 72.2 2.72 0.2854 0.3243 0.3515 0.3740 0.3960 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 98.8 7951.4 95.8 3.48 0.6800 0.7519 0.7850 0.8047 0.8217 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
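Most entries in the table above match descriptors either by plain brute-force nearest-neighbour search (match:nn) or by additionally enforcing cross-match consistency (match:nn1to1), as in the ContextDesc entry just above. A minimal NumPy sketch of the difference — the descriptor arrays here are random placeholders, not real features:

```python
import numpy as np

def nn_indices(d1, d2):
    """For each row of d1, the index of its L2-nearest row in d2."""
    dists = np.linalg.norm(d1[:, None, :] - d2[None, :, :], axis=-1)
    return dists.argmin(axis=1)

def match_nn(d1, d2):
    """match:nn — one-way brute-force nearest-neighbour search."""
    return list(enumerate(nn_indices(d1, d2)))

def match_nn1to1(d1, d2):
    """match:nn1to1 — keep a pair only if the two descriptors are
    mutual nearest neighbours (cross-match consistency)."""
    fwd = nn_indices(d1, d2)
    bwd = nn_indices(d2, d1)
    return [(i, j) for i, j in enumerate(fwd) if bwd[j] == i]

rng = np.random.default_rng(0)
desc1 = rng.random((6, 128)).astype(np.float32)  # stand-in for image 1 descriptors
desc2 = rng.random((8, 128)).astype(np.float32)  # stand-in for image 2 descriptors

one_way = match_nn(desc1, desc2)
mutual = match_nn1to1(desc1, desc2)
# Cross-checking can only discard one-way matches, never add new ones.
assert len(mutual) <= len(one_way)
```

In practice OpenCV's `cv2.BFMatcher` implements both modes (its `crossCheck` flag corresponds to the mutual test sketched here); the pure-NumPy version is shown only to make the nn vs. nn1to1 distinction explicit.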

MVS — sequence 'lincoln_memorial_statue'
Method Date Type Ims (%) #Pts SR TL mAP5o mAP10o mAP15o mAP20o mAP25o ATE By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 95.9 5196.5 87.5 3.21 0.4596 0.5232 0.5551 0.5818 0.6020 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 97.6 1076.8 93.2 4.25 0.5305 0.6167 0.6595 0.6920 0.7132 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 90.9 793.1 87.2 3.12 0.0365 0.0796 0.1150 0.1471 0.1801 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings, extracting at most 8K keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 94.3 4540.1 94.8 3.05 0.4675 0.5678 0.6187 0.6494 0.6767 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 95.0 6023.8 95.5 2.94 0.4699 0.5761 0.6308 0.6671 0.7006 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 94.2 4643.3 93.5 3.05 0.4602 0.5636 0.6258 0.6527 0.6798 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 95.2 6163.4 95.2 2.94 0.4443 0.5866 0.6617 0.6864 0.7140 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 80.2 534.1 66.0 2.38 0.0837 0.1753 0.2292 0.2669 0.2874 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 84.3 1020.8 71.5 2.35 0.1008 0.2074 0.2730 0.3139 0.3425 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 32.0 63.5 31.2 1.73 0.0016 0.0077 0.0148 0.0205 0.0239 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 69.3 201.4 53.5 2.34 0.0196 0.0570 0.0850 0.1064 0.1212 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 69.3 160.9 46.0 2.87 0.0744 0.1020 0.1188 0.1301 0.1364 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 73.1 195.6 53.2 2.93 0.1044 0.1440 0.1666 0.1778 0.1867 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 79.6 329.7 66.5 3.09 0.1974 0.2482 0.2702 0.2898 0.3026 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 97.8 1258.5 93.2 3.95 0.5709 0.6464 0.6881 0.7169 0.7356 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 97.1 1362.8 92.2 3.85 0.5274 0.6036 0.6536 0.6790 0.7021 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 98.9 1277.6 97.5 3.83 0.6315 0.7058 0.7392 0.7719 0.7970 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (like RootSIFT) sGLOH2 descriptors using the sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed with the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered by rank (best first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 99.0 6822.6 99.0 4.00 0.6910 0.7537 0.7908 0.8168 0.8375 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS matching framework, but without view synthesis: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 96.9 6726.0 95.8 4.04 0.6736 0.7241 0.7407 0.7595 0.7736 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian, assuming gravity-vector orientation. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS matching framework, but without view synthesis: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 95.6 2357.2 91.8 3.47 0.2133 0.2653 0.3016 0.3404 0.3774 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 98.1 6861.5 97.2 3.47 0.6357 0.6994 0.7261 0.7485 0.7656 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 98.1 7181.7 96.5 3.44 0.6370 0.7004 0.7361 0.7596 0.7767 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 98.4 7138.7 96.2 3.44 0.6257 0.6971 0.7317 0.7544 0.7747 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 88.9 820.2 83.2 3.42 0.1204 0.1771 0.2183 0.2457 0.2696 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a 6272-bit binary descriptor. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 86.8 725.6 80.2 3.21 0.0821 0.1254 0.1588 0.1892 0.2132 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a 6272-bit binary descriptor. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 98.8 5106.3 97.0 3.43 0.6233 0.6881 0.7204 0.7408 0.7649 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 97.7 4447.9 94.0 3.41 0.6325 0.6999 0.7283 0.7484 0.7625 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are densely extracted from full images instead of image patches, with all other settings unchanged from the original ContextDesc. We find Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 99.6 5099.5 98.8 3.54 0.7674 0.8225 0.8542 0.8722 0.8874 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 98.5 4395.5 96.8 3.52 0.5948 0.6508 0.6834 0.7094 0.7323 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 98.1 5615.2 96.8 3.55 0.6031 0.6511 0.6744 0.6970 0.7121 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 98.6 6905.6 96.8 3.56 0.6800 0.7454 0.7726 0.7886 0.8055 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 98.5 6278.0 94.2 3.47 0.6097 0.6791 0.7109 0.7307 0.7504 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 99.5 5009.4 99.2 3.56 0.7708 0.8430 0.8677 0.8873 0.8984 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 94.9 3769.6 86.8 3.27 0.3983 0.4470 0.4768 0.5010 0.5233 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 98.0 5567.0 92.2 3.37 0.5373 0.5954 0.6238 0.6453 0.6680 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 92.4 1593.7 85.2 3.36 0.3010 0.3615 0.4038 0.4352 0.4573 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
Superpoint (nn matcher)
kp:2048, match:nn
19-06-07 F 96.4 1535.2 90.0 4.02 0.6261 0.6877 0.7106 0.7276 0.7416 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
Superpoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 98.1 1490.6 91.2 4.17 0.7057 0.7695 0.7953 0.8085 0.8189 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 93.6 791.2 87.2 4.40 0.5501 0.6104 0.6357 0.6530 0.6638 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 93.5 967.9 87.5 4.26 0.5339 0.5940 0.6258 0.6436 0.6547 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 93.1 1772.5 88.5 3.71 0.4758 0.5381 0.5727 0.5973 0.6115 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 84.0 238.9 79.5 4.24 0.2156 0.2626 0.2964 0.3136 0.3343 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 92.7 519.7 85.0 4.58 0.4984 0.5515 0.5745 0.5930 0.6051 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 94.4 6401.4 90.5 3.23 0.4430 0.5314 0.5753 0.6083 0.6304 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 99.1 6734.6 97.2 3.64 0.6515 0.7284 0.7612 0.7863 0.8062 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use Cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 97.1 1106.8 90.0 4.02 0.5655 0.6321 0.6628 0.6907 0.7063 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 97.5 798.1 93.2 4.59 0.5836 0.6523 0.6918 0.7197 0.7406 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 96.8 1237.2 91.8 4.20 0.6141 0.6727 0.6997 0.7208 0.7390 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 94.5 3245.4 85.0 2.98 0.3038 0.3571 0.3862 0.4119 0.4384 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 99.4 5063.3 99.8 3.62 0.7325 0.7950 0.8314 0.8568 0.8713 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
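The 'root squared (like RootSIFT)' remapping mentioned in the HarrisZ/RsGLOH2 entry above can be sketched in a few lines. Assuming non-negative histogram-style descriptors (as in SIFT), it L1-normalizes each descriptor and takes the element-wise square root, so that L2 distance between the remapped vectors approximates the Hellinger kernel on the originals:

```python
import numpy as np

def root_descriptor(desc, eps=1e-12):
    """RootSIFT-style remapping for non-negative histogram descriptors:
    L1-normalize each row, then take the element-wise square root."""
    desc = np.asarray(desc, dtype=np.float64)
    desc = desc / (desc.sum(axis=1, keepdims=True) + eps)
    return np.sqrt(desc)

rng = np.random.default_rng(1)
hist = rng.random((4, 128))      # stand-in for non-negative descriptors
rooted = root_descriptor(hist)

# After the remapping every descriptor is (approximately) L2-unit-norm,
# since sum(sqrt(x)_i^2) = sum(x_i) = 1.
assert np.allclose(np.linalg.norm(rooted, axis=1), 1.0)
```

The remapped descriptors can then be fed to any of the nearest-neighbour matchers used elsewhere in these tables without further changes.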

MVS — sequence 'london_bridge'
Method Date Type Ims (%) #Pts SR TL mAP5o mAP10o mAP15o mAP20o mAP25o ATE By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 94.1 3666.5 89.2 3.27 0.2433 0.3206 0.3696 0.4181 0.4534 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 96.8 2633.4 94.2 3.19 0.2211 0.3162 0.3789 0.4291 0.4686 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 94.6 2154.4 89.8 2.83 0.1277 0.2320 0.3040 0.3615 0.4028 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings, extracting at most 8K keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 96.4 3939.5 97.0 3.31 0.2558 0.4064 0.4915 0.5478 0.5956 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 96.4 5022.2 97.0 3.05 0.2057 0.3689 0.4622 0.5308 0.5852 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 96.6 4016.4 96.5 3.30 0.2528 0.4043 0.4932 0.5578 0.6048 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 96.6 4915.3 97.0 3.05 0.1935 0.3499 0.4545 0.5304 0.5837 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 92.2 922.8 91.5 2.50 0.0184 0.0747 0.1254 0.1797 0.2241 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 94.0 1667.0 94.8 2.48 0.0207 0.0800 0.1300 0.1810 0.2237 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 60.8 109.8 39.0 2.29 0.0027 0.0104 0.0195 0.0277 0.0360 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 83.5 432.5 77.5 2.48 0.0154 0.0543 0.0988 0.1371 0.1746 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 73.9 159.8 46.8 2.97 0.0904 0.1372 0.1584 0.1737 0.1847 Anonymous ELF detector: Keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 77.7 169.0 55.8 3.10 0.0979 0.1649 0.2027 0.2218 0.2363 Anonymous ELF detector: Keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 84.8 328.1 65.5 3.12 0.1809 0.2550 0.2906 0.3187 0.3396 Anonymous ELF detector: Keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 91.6 1260.9 87.5 3.05 0.1135 0.1838 0.2262 0.2627 0.2948 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 92.0 1448.8 87.2 3.03 0.1051 0.1759 0.2170 0.2485 0.2815 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 96.4 1242.2 93.8 3.35 0.3447 0.4644 0.5208 0.5691 0.6083 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and Root squared (like RootSIFT) sGLOH2 descriptors using sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded and then matching pairs are computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 96.2 4976.8 94.5 3.36 0.2634 0.3549 0.4149 0.4597 0.4984 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS matching framework, but without view synthesis: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 95.7 5314.1 94.5 3.38 0.2466 0.3423 0.3967 0.4400 0.4687 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity-vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS matching framework, but without view synthesis: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 88.4 2595.0 78.5 2.82 0.0736 0.1280 0.1642 0.1981 0.2300 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 97.8 6327.6 98.2 3.32 0.3391 0.4378 0.5028 0.5517 0.5945 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 97.9 6574.8 98.2 3.32 0.3616 0.4671 0.5324 0.5772 0.6103 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 97.5 6681.3 98.2 3.31 0.3863 0.4826 0.5410 0.5843 0.6152 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 94.2 2102.2 89.5 3.02 0.1235 0.2106 0.2667 0.3153 0.3640 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 93.6 2021.3 88.8 2.93 0.1019 0.1855 0.2417 0.2926 0.3357 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 98.2 5949.2 98.0 3.28 0.4002 0.5159 0.5712 0.6138 0.6520 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 98.1 5188.2 97.8 3.33 0.3329 0.4564 0.5217 0.5654 0.6039 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are densely extracted from full images instead of image patches, with all other settings unchanged from the original ContextDesc. We find Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 98.7 5984.1 99.2 3.40 0.6285 0.7553 0.8174 0.8442 0.8620 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 98.1 4628.8 97.5 3.39 0.3420 0.4375 0.4911 0.5313 0.5713 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 97.3 4744.3 97.0 3.39 0.3242 0.4196 0.4696 0.5165 0.5491 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 98.3 6062.4 97.5 3.33 0.2956 0.4020 0.4702 0.5168 0.5550 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 97.5 5746.2 98.2 3.28 0.2718 0.3646 0.4380 0.4885 0.5317 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 98.4 6129.5 99.2 3.35 0.6083 0.7361 0.7937 0.8251 0.8564 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 94.7 3614.1 87.8 3.27 0.2479 0.3303 0.3859 0.4305 0.4718 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 97.0 5010.7 97.0 3.27 0.2514 0.3525 0.4131 0.4563 0.4968 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 82.3 888.5 71.5 2.87 0.0827 0.1396 0.1747 0.2032 0.2290 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 97.5 1398.6 95.2 3.91 0.3730 0.4976 0.5541 0.6002 0.6341 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 97.8 1336.9 95.2 4.03 0.4823 0.6033 0.6602 0.7031 0.7303 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 96.0 922.0 92.8 3.88 0.4148 0.5400 0.5936 0.6320 0.6642 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 95.9 874.8 94.5 3.93 0.4395 0.5600 0.6135 0.6477 0.6756 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 96.3 1484.5 94.8 3.76 0.4041 0.5096 0.5711 0.6172 0.6479 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 87.0 225.8 63.5 3.37 0.2607 0.3387 0.3699 0.3857 0.3984 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 93.8 481.0 88.2 3.78 0.4028 0.5143 0.5669 0.5948 0.6117 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 96.7 4626.6 96.2 3.46 0.2205 0.3698 0.4620 0.5172 0.5649 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 98.2 6228.3 97.5 3.34 0.2030 0.3112 0.3762 0.4243 0.4616 Patrick Ebel Baseline for our scale-invariant log-polar descriptors in which we use Cartesian patches instead of the log-polar transformation. Keypoints are DoG, with a scaling factor of lambda/12 over the chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 97.4 1092.8 95.2 3.88 0.4075 0.5158 0.5740 0.6199 0.6544 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 98.2 1071.6 98.0 4.10 0.6167 0.7569 0.8131 0.8474 0.8664 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 98.2 1316.8 97.0 4.10 0.6218 0.7594 0.8126 0.8456 0.8671 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 91.7 2428.3 84.5 2.89 0.0975 0.1677 0.2207 0.2657 0.3093 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 98.5 5803.5 99.2 3.45 0.4695 0.6102 0.6763 0.7138 0.7400 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
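Most entries in this table differ only in the matching strategy: plain brute-force nearest-neighbour search ('nn') versus nearest-neighbour search that enforces cross-match consistency ('nn1to1'), as in the SIFT + ContextDesc entry directly above. As a rough illustration only (not the challenge code; it assumes float descriptors compared with Euclidean distance), a minimal NumPy sketch of both strategies:

```python
import numpy as np

def match_nn(desc_a, desc_b):
    """Brute-force NN matching: each descriptor in A is matched to its
    closest descriptor in B (the 'nn' strategy)."""
    # Pairwise squared Euclidean distances, shape (len(A), len(B)).
    d2 = ((desc_a[:, None, :] - desc_b[None, :, :]) ** 2).sum(-1)
    return [(i, int(j)) for i, j in enumerate(d2.argmin(axis=1))]

def match_nn_1to1(desc_a, desc_b):
    """NN matching with cross-match consistency: (i, j) is kept only if
    j is the NN of i in B *and* i is the NN of j in A ('nn1to1')."""
    d2 = ((desc_a[:, None, :] - desc_b[None, :, :]) ** 2).sum(-1)
    nn_ab = d2.argmin(axis=1)  # best match in B for each descriptor in A
    nn_ba = d2.argmin(axis=0)  # best match in A for each descriptor in B
    mutual = nn_ba[nn_ab] == np.arange(len(desc_a))
    return [(int(i), int(nn_ab[i])) for i in np.flatnonzero(mutual)]
```

With 'nn', several descriptors in A may claim the same descriptor in B; 'nn1to1' discards such one-sided matches, which typically trades match count for precision.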

MVS — sequence 'milan_cathedral'
Method Date Type Ims (%) #Pts SR TL mAP5o mAP10o mAP15o mAP20o mAP25o By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 98.0 4605.7 90.2 3.54 0.3742 0.5242 0.5935 0.6420 0.6702 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 98.7 4561.8 96.0 3.04 0.3799 0.5325 0.6131 0.6570 0.7016 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 95.7 2854.1 88.2 2.77 0.1503 0.2682 0.3429 0.3976 0.4477 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings, keeping at most 8K keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 96.7 4810.8 94.5 3.20 0.2055 0.3503 0.4482 0.5225 0.5721 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 97.2 6061.8 96.0 3.00 0.1761 0.3279 0.4263 0.5026 0.5647 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 96.7 4781.9 94.0 3.18 0.1998 0.3591 0.4486 0.5188 0.5728 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 97.5 5649.4 96.5 2.98 0.1823 0.3418 0.4435 0.5222 0.5855 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 91.1 744.9 83.0 2.49 0.0203 0.0951 0.1748 0.2460 0.3072 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 93.5 1402.3 87.5 2.48 0.0304 0.1251 0.2152 0.2912 0.3442 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 35.7 86.7 22.5 1.79 0.0003 0.0028 0.0075 0.0123 0.0165 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 81.4 320.6 62.5 2.45 0.0054 0.0336 0.0773 0.1190 0.1548 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 62.4 166.5 32.5 2.55 0.0169 0.0303 0.0372 0.0412 0.0455 Anonymous ELF detector: Keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 66.8 191.2 41.0 2.69 0.0256 0.0489 0.0645 0.0749 0.0830 Anonymous ELF detector: Keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 84.0 321.6 60.5 3.18 0.1424 0.2233 0.2627 0.2884 0.3109 Anonymous ELF detector: Keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 94.0 1382.2 86.8 3.04 0.2484 0.3696 0.4348 0.4782 0.5125 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 94.1 1605.5 86.0 2.99 0.2640 0.3892 0.4446 0.4968 0.5305 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 99.3 1977.7 97.2 3.44 0.4228 0.5685 0.6511 0.7022 0.7335 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and Root squared (like RootSIFT) sGLOH2 descriptors using sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded and then matching pairs are computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 98.8 5426.8 96.8 3.45 0.4157 0.5569 0.6392 0.6904 0.7250 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS matching framework, but without view synthesis: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 97.8 5934.5 95.0 3.45 0.4255 0.5604 0.6356 0.6788 0.7094 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity-vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS matching framework, but without view synthesis: https://github.com/ducha-aiki/mods-light-zmq. Maximum number of keypoints: 8k; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 83.7 3000.3 67.2 2.65 0.1012 0.1613 0.2027 0.2277 0.2493 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 99.9 7184.5 97.8 3.52 0.4876 0.6357 0.7047 0.7474 0.7866 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 99.8 7388.7 98.2 3.50 0.4924 0.6350 0.7128 0.7541 0.7900 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 99.7 7437.9 98.2 3.50 0.4985 0.6414 0.7159 0.7607 0.7938 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 92.2 3106.0 82.5 2.95 0.2665 0.3724 0.4313 0.4724 0.5107 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 91.5 2962.0 82.0 2.89 0.2635 0.3814 0.4441 0.4868 0.5202 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 99.9 7159.1 99.0 3.49 0.4967 0.6466 0.7115 0.7636 0.7925 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 99.5 5971.8 97.8 3.59 0.4545 0.6070 0.6756 0.7158 0.7507 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc in which descriptors are densely extracted from full images instead of image patches, while all other settings remain unchanged from the original ContextDesc. We find Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 100.0 6725.5 100.0 3.67 0.6659 0.8305 0.8809 0.9102 0.9293 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 99.7 5863.3 98.5 3.66 0.5024 0.6494 0.7246 0.7711 0.8054 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 99.4 5803.2 98.0 3.65 0.4853 0.6450 0.7138 0.7569 0.7808 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 99.6 6992.4 98.5 3.53 0.4737 0.6270 0.6945 0.7446 0.7752 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 99.5 6584.7 97.8 3.51 0.4818 0.6342 0.7093 0.7595 0.7870 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 99.9 6656.3 99.8 3.67 0.6488 0.8004 0.8609 0.8883 0.9076 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 97.8 5042.7 92.0 3.47 0.4324 0.5683 0.6433 0.6835 0.7085 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 99.4 5916.9 96.5 3.47 0.4612 0.6155 0.6819 0.7305 0.7654 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 84.8 1020.4 71.5 2.93 0.1483 0.2179 0.2606 0.2910 0.3163 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 98.1 1738.9 91.5 3.66 0.3882 0.5408 0.6105 0.6517 0.6871 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 98.5 1621.8 93.8 3.73 0.4257 0.5707 0.6507 0.7016 0.7358 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 95.2 1172.9 85.2 3.39 0.3032 0.4336 0.5046 0.5534 0.5797 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
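The SuperPoint entries downsample images so that the largest dimension is at most 1024 pixels. A minimal sketch of that size computation (the function name is illustrative; the resulting size would then be passed to a resize call such as cv2.resize):

```python
def downsample_size(height, width, max_dim=1024):
    """Return the image size after downsampling so that the largest
    dimension is at most `max_dim`, preserving the aspect ratio.
    Images already within the limit pass through unchanged."""
    scale = max_dim / max(height, width)
    if scale >= 1.0:
        return height, width
    return round(height * scale), round(width * scale)
```

For example, a 2048x1536 image would be resized to 1024x768, while an 800x600 image is left as-is.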
SuperPoint
kp:1024, match:nn
19-04-26 F 94.2 841.4 83.2 3.32 0.3051 0.4243 0.4946 0.5351 0.5616 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 95.7 1524.3 88.2 3.44 0.3035 0.4387 0.5072 0.5523 0.5886 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 33.8 96.8 12.8 1.96 0.0081 0.0133 0.0160 0.0176 0.0188 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 85.7 388.5 66.2 3.01 0.1857 0.2701 0.3108 0.3430 0.3634 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 96.5 4767.1 92.5 3.42 0.1902 0.3431 0.4401 0.5021 0.5568 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 99.8 7483.0 99.2 3.52 0.4681 0.6285 0.7104 0.7622 0.8020 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on COCO + phototourism training set)
kp:2048, match:nn
19-05-30 F 97.0 1367.4 90.5 3.57 0.3466 0.4874 0.5633 0.6130 0.6465 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 98.9 1291.3 94.5 3.80 0.5055 0.6663 0.7382 0.7775 0.8023 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 99.1 1457.4 95.0 4.01 0.5125 0.6774 0.7504 0.7938 0.8180 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 93.2 3014.7 82.5 3.10 0.2704 0.3890 0.4503 0.4908 0.5190 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 99.7 7245.2 99.8 3.57 0.5255 0.6810 0.7492 0.7906 0.8229 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
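The nn1to1 entries enforce cross-match consistency: a pair is kept only when the two keypoints are mutual nearest neighbours. A minimal numpy sketch of that rule (names are illustrative; the behaviour matches OpenCV's BFMatcher with crossCheck=True):

```python
import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    """Nearest-neighbour matching with cross-match consistency:
    keep (i, j) only when j is i's nearest neighbour in B AND
    i is j's nearest neighbour in A."""
    # Pairwise Euclidean distances between the two descriptor sets.
    d = np.linalg.norm(desc_a[:, None] - desc_b[None, :], axis=2)
    nn_ab = d.argmin(axis=1)   # best match in B for each row of A
    nn_ba = d.argmin(axis=0)   # best match in A for each row of B
    return [(i, int(j)) for i, j in enumerate(nn_ab) if nn_ba[j] == i]
```

Compared with plain one-way nearest-neighbour search, this discards one-sided matches, which typically trades recall for precision.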

MVS — sequence 'mount_rushmore'
Method Date Type Ims (%) #Pts SR TL mAP5o mAP10o mAP15o mAP20o mAP25o By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 88.0 2560.1 84.5 3.50 0.1900 0.2628 0.3131 0.3593 0.3974 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 92.7 3241.5 89.0 3.66 0.1738 0.2667 0.3458 0.4051 0.4399 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 86.1 2007.6 77.8 2.77 0.0279 0.0683 0.1207 0.1664 0.2017 Anonymous We use OpenCV's implementation of the BRISK detector with default settings; each image has at most 8K keypoints. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 94.0 4259.6 90.2 3.55 0.1301 0.2326 0.3186 0.3834 0.4361 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 95.2 6731.8 94.0 3.36 0.1456 0.2653 0.3543 0.4342 0.4860 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 94.0 4065.3 93.0 3.56 0.1257 0.2435 0.3217 0.3834 0.4381 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 95.0 5694.8 93.2 3.29 0.1185 0.2274 0.3182 0.3912 0.4499 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 81.6 454.4 78.8 2.50 0.0010 0.0092 0.0330 0.0776 0.1192 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 86.3 853.8 87.8 2.52 0.0024 0.0192 0.0592 0.1117 0.1638 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 32.2 54.0 27.5 1.80 0.0000 0.0002 0.0010 0.0038 0.0062 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 70.7 191.1 60.2 2.42 0.0003 0.0040 0.0140 0.0316 0.0504 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 59.2 117.2 42.2 2.65 0.0043 0.0122 0.0190 0.0266 0.0313 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 60.0 120.6 45.0 2.73 0.0046 0.0122 0.0207 0.0285 0.0342 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 68.0 164.0 57.0 2.87 0.0293 0.0585 0.0799 0.0972 0.1116 Anonymous ELF detector: keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 83.7 814.9 81.0 3.23 0.1240 0.1972 0.2438 0.2877 0.3239 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 86.0 956.1 78.8 3.22 0.1350 0.2085 0.2620 0.3051 0.3352 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 93.0 1511.3 91.0 3.58 0.2318 0.3353 0.4007 0.4460 0.4868 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (as in RootSIFT) sGLOH2 descriptors with the sGOr2f* matching strategy. Distance-table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed with the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered by rank (best first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 90.0 3140.7 91.2 3.66 0.2129 0.3063 0.3792 0.4237 0.4648 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq) for matching, but without view synthesis. Max. number of keypoints to detect: 8k; detection is done on 2x-upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 92.5 3961.1 94.5 3.84 0.2539 0.3624 0.4294 0.4835 0.5198 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity-vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq) for matching, but without view synthesis. Max. number of keypoints to detect: 8k; detection is done on 2x-upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
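The Hessian-based entries detect on 2x-upsampled images, so keypoint positions and scales must be mapped back to the original resolution before matching or evaluation. A minimal sketch of that bookkeeping (the function name and the (x, y, scale) tuple format are illustrative):

```python
def rescale_keypoints(keypoints, factor=2.0):
    """Map (x, y, scale) keypoints detected on an image upsampled
    by `factor` back to the original image's coordinate frame:
    positions and scales all divide by the upsampling factor."""
    return [(x / factor, y / factor, s / factor)
            for x, y, s in keypoints]
```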
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 78.1 1604.7 69.8 2.96 0.0640 0.1070 0.1402 0.1691 0.1905 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 94.5 4923.5 92.8 3.59 0.3455 0.4482 0.5206 0.5666 0.5975 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 93.5 5179.7 92.5 3.56 0.3277 0.4360 0.4911 0.5325 0.5606 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 94.4 5162.8 94.2 3.58 0.3358 0.4460 0.5201 0.5683 0.5981 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
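The log-polar entries sample each patch on a log-polar grid, so a scale change of the keypoint becomes a shift along the radial axis rather than a resampling of the whole patch. A minimal numpy sketch of such a sampling grid (grid sizes and names are illustrative, not the authors' implementation; per the entries, the support radius is the keypoint scale times lambda/12):

```python
import numpy as np

def logpolar_grid(radius, n_rho=16, n_theta=16):
    """Sampling offsets for a log-polar patch around a keypoint:
    radii are spaced geometrically from 1 px to `radius`, angles
    uniformly over [0, 2*pi). Returns (x, y) offset arrays of
    shape (n_rho, n_theta)."""
    rho = np.geomspace(1.0, radius, n_rho)
    theta = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    r, t = np.meshgrid(rho, theta, indexing="ij")
    return r * np.cos(t), r * np.sin(t)
```

Because the radii are geometric, scaling the support radius multiplies every ring by a constant, which shifts the sampled rings along the rho axis.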
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 78.3 1244.1 66.0 2.86 0.0693 0.1117 0.1428 0.1707 0.1927 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 75.7 1066.4 65.0 2.79 0.0656 0.1071 0.1367 0.1638 0.1835 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 94.9 4393.5 95.0 3.66 0.2958 0.4132 0.4788 0.5322 0.5700 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 93.0 3832.8 92.5 3.77 0.2789 0.3830 0.4511 0.5060 0.5436 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc in which descriptors are densely extracted from full images instead of image patches, while all other settings remain unchanged from the original ContextDesc. We find Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 95.2 5279.1 95.5 3.63 0.4208 0.5494 0.6305 0.6803 0.7085 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 93.9 4283.5 91.5 3.64 0.3019 0.4028 0.4662 0.5188 0.5631 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 94.2 4258.4 91.2 3.66 0.3333 0.4335 0.4907 0.5437 0.5759 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 94.8 5031.3 91.8 3.61 0.3225 0.4429 0.5106 0.5660 0.6041 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 92.9 4193.4 89.5 3.57 0.2987 0.3976 0.4667 0.5171 0.5530 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 94.9 5005.6 95.8 3.68 0.4124 0.5522 0.6222 0.6646 0.6979 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 85.6 2470.7 83.0 3.29 0.1986 0.2710 0.3181 0.3563 0.3825 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 91.1 3276.1 86.8 3.50 0.2535 0.3399 0.4022 0.4485 0.4811 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 76.7 689.6 71.5 2.97 0.0798 0.1236 0.1601 0.1927 0.2159 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 92.0 1005.4 89.2 4.21 0.2201 0.3210 0.3891 0.4421 0.4823 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 91.9 990.3 87.2 4.23 0.2608 0.3599 0.4309 0.4813 0.5167 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 89.9 494.1 87.8 4.14 0.1925 0.2929 0.3569 0.4057 0.4463 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 91.2 580.9 87.0 4.24 0.2113 0.3033 0.3670 0.4165 0.4602 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 91.5 1048.4 89.8 4.08 0.2214 0.3216 0.3859 0.4300 0.4717 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 75.2 113.6 64.2 3.42 0.0501 0.0837 0.1125 0.1370 0.1558 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 83.0 246.6 83.2 3.84 0.1380 0.2119 0.2704 0.3132 0.3438 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 94.4 3764.8 92.0 3.85 0.1489 0.2427 0.3203 0.3796 0.4292 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 94.3 5187.3 92.2 3.56 0.3011 0.4134 0.4867 0.5406 0.5767 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on COCO + phototourism training set)
kp:2048, match:nn
19-05-30 F 91.7 701.3 87.2 4.19 0.2019 0.2927 0.3562 0.4068 0.4445 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 94.9 686.9 90.2 4.37 0.3028 0.4263 0.5033 0.5579 0.5958 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: - retrained on the phototourism training set; - keypoints refinement; - better descriptor sampling; - adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 93.4 960.1 88.2 4.36 0.3139 0.4466 0.5183 0.5656 0.5954 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: - retrained on the phototourism training set; - keypoints refinement; - better descriptor sampling; - adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 83.3 1508.4 76.5 3.23 0.1146 0.1709 0.2126 0.2475 0.2775 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 95.5 5563.0 96.2 3.63 0.3288 0.4579 0.5393 0.5927 0.6348 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA

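Many of the entries above differ only in how descriptors are matched: 'match:nn' denotes plain brute-force nearest-neighbour search, while 'match:nn1to1' additionally enforces cross-match consistency, keeping a match only if the two descriptors are mutual nearest neighbours. A minimal NumPy sketch of both modes follows; the function and argument names are illustrative, not from any submission:

```python
import numpy as np

def match_nn(desc1, desc2, one_to_one=False):
    """Brute-force nearest-neighbour descriptor matching.

    one_to_one=False corresponds to the 'match:nn' entries;
    one_to_one=True enforces cross-match consistency as in 'match:nn1to1'.
    Returns a list of (index into desc1, index into desc2) pairs.
    """
    # Pairwise squared L2 distances between the two descriptor sets.
    d = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(-1)
    nn12 = d.argmin(axis=1)  # nearest descriptor in desc2 for each in desc1
    if not one_to_one:
        return [(i, int(j)) for i, j in enumerate(nn12)]
    nn21 = d.argmin(axis=0)  # nearest descriptor in desc1 for each in desc2
    # Keep only mutual nearest neighbours.
    return [(i, int(j)) for i, j in enumerate(nn12) if nn21[j] == i]
```

With `one_to_one=True`, a descriptor that is the nearest neighbour of several descriptors in the other image yields at most one surviving match, which typically trades recall for precision.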
MVS — sequence 'piazza_san_marco'
Method Date Type Ims (%) #Pts SR TL mAP5o mAP10o mAP15o mAP20o mAP25o ATE By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 90.2 5212.9 84.8 2.37 0.1090 0.1960 0.2543 0.3059 0.3479 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 96.6 5929.9 96.0 2.47 0.1927 0.3246 0.4122 0.4794 0.5291 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 92.3 4004.4 89.5 2.29 0.0765 0.1823 0.2678 0.3353 0.3876 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings, keeping at most 8000 keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 95.9 5493.2 93.5 2.41 0.1385 0.2715 0.3673 0.4293 0.4868 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 96.8 6287.8 96.5 2.42 0.1290 0.2761 0.3746 0.4409 0.4992 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 96.1 5554.5 94.2 2.41 0.1407 0.2800 0.3760 0.4440 0.5006 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 97.1 5970.1 95.5 2.42 0.1452 0.2975 0.3976 0.4676 0.5287 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 65.6 526.3 56.5 2.09 0.0058 0.0262 0.0476 0.0674 0.0848 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 81.8 1718.9 78.0 2.08 0.0178 0.0764 0.1378 0.1946 0.2381 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 0.0 0.0 0.0 0.00 0.0000 0.0000 0.0000 0.0000 0.0000 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 51.8 112.2 22.2 2.13 0.0001 0.0006 0.0010 0.0026 0.0034 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 11.6 63.5 21.8 1.08 0.0001 0.0005 0.0010 0.0014 0.0017 Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are computed with the interpolation of the VGG pool3 feature map on the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 27.6 95.0 29.8 1.63 0.0003 0.0011 0.0024 0.0040 0.0048 Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are computed with the interpolation of the VGG pool3 feature map on the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 55.6 141.1 34.2 2.27 0.0036 0.0089 0.0148 0.0176 0.0215 Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 88.4 2023.8 85.5 2.28 0.0373 0.1222 0.2070 0.2820 0.3337 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 87.8 2147.9 84.5 2.25 0.0328 0.1092 0.1933 0.2619 0.3165 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 93.9 2525.5 92.0 2.37 0.1127 0.2203 0.3091 0.3771 0.4310 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and Root squared (like RootSIFT) sGLOH2 descriptors using sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded and then matching pairs are computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 97.2 7092.2 95.8 2.63 0.2507 0.3783 0.4449 0.5045 0.5562 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks are plugged into the MODS framework for matching (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. At most 8000 keypoints per image; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 97.5 7443.0 97.2 2.71 0.2714 0.3850 0.4477 0.5046 0.5518 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks are plugged into the MODS framework for matching (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. At most 8000 keypoints per image; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 87.5 4718.8 86.8 2.18 0.0192 0.0666 0.1302 0.1867 0.2426 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 98.1 8339.5 98.0 2.52 0.2242 0.3458 0.4309 0.4866 0.5436 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 98.3 8545.4 98.5 2.53 0.2424 0.3622 0.4473 0.5034 0.5488 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 98.6 8696.8 98.5 2.56 0.2526 0.3750 0.4580 0.5077 0.5571 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 83.0 2717.4 77.2 2.24 0.0687 0.1341 0.1834 0.2265 0.2614 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matches are determined by the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 82.4 2633.9 74.2 2.24 0.0704 0.1389 0.1903 0.2296 0.2672 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matches are determined by the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 98.9 8508.5 98.0 2.65 0.2755 0.4066 0.4852 0.5414 0.5906 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 98.5 6919.9 97.0 2.70 0.2535 0.3866 0.4682 0.5198 0.5635 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc in which descriptors are densely extracted from full images instead of image patches, with all other settings unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 99.6 8324.3 99.5 2.85 0.5046 0.6215 0.6854 0.7333 0.7671 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 98.1 7387.0 95.0 2.70 0.2740 0.4013 0.4772 0.5292 0.5787 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 97.5 7295.3 94.8 2.56 0.2476 0.3512 0.4137 0.4679 0.5185 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 98.4 8333.2 99.2 2.57 0.2565 0.3783 0.4612 0.5187 0.5603 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 97.3 7765.5 97.0 2.47 0.2093 0.3237 0.4000 0.4572 0.5014 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 99.5 8211.0 98.8 2.84 0.4407 0.5491 0.6156 0.6625 0.7047 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 91.2 5351.7 87.2 2.35 0.1394 0.2188 0.2797 0.3339 0.3677 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 95.8 6975.9 94.8 2.41 0.1664 0.2763 0.3557 0.4210 0.4733 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 79.3 1396.0 77.8 2.12 0.0083 0.0303 0.0645 0.0962 0.1245 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 89.6 2269.5 82.8 2.32 0.0670 0.1863 0.2737 0.3329 0.3877 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 90.5 2160.2 82.2 2.37 0.0919 0.2143 0.3033 0.3675 0.4159 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 87.3 1317.6 80.2 2.46 0.1420 0.2472 0.3163 0.3597 0.3947 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 83.0 857.1 71.5 2.45 0.1234 0.2109 0.2659 0.3044 0.3314 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 89.2 1695.3 82.5 2.47 0.1420 0.2534 0.3269 0.3727 0.4123 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 3.2 26.7 4.0 0.58 0.0000 0.0001 0.0002 0.0002 0.0003 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 41.1 246.5 45.5 1.79 0.0461 0.0804 0.0990 0.1101 0.1176 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 95.8 6173.1 93.8 2.45 0.1470 0.2804 0.3695 0.4303 0.4829 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 98.5 9920.9 99.0 2.37 0.1257 0.2710 0.3722 0.4436 0.5105 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 88.9 1603.9 79.5 2.49 0.1575 0.2699 0.3349 0.3854 0.4214 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 88.9 1295.3 79.5 2.60 0.1524 0.2508 0.3152 0.3562 0.3950 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: - retrained on the phototourism training set; - keypoints refinement; - better descriptor sampling; - adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 89.9 1560.5 80.5 2.63 0.2112 0.3038 0.3557 0.3998 0.4307 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: - retrained on the phototourism training set; - keypoints refinement; - better descriptor sampling; - adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 83.4 3696.9 74.5 2.25 0.0784 0.1534 0.2072 0.2466 0.2803 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 99.2 9477.8 98.5 2.62 0.2472 0.4044 0.5014 0.5657 0.6243 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA

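Several SuperPoint entries note that images are downsampled so that the largest dimension is at most 1024 pixels before feature extraction. A dependency-free sketch of that preprocessing step is below; the function name and the nearest-neighbour resampling are illustrative assumptions, since the submissions do not specify the interpolation used:

```python
import numpy as np

def downsample_max_dim(image, max_dim=1024):
    """Resize so the largest image dimension is at most max_dim,
    preserving the aspect ratio. Images already small enough are
    returned unchanged."""
    h, w = image.shape[:2]
    scale = max_dim / max(h, w)
    if scale >= 1.0:
        return image
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    # Nearest-neighbour resampling via index selection; in practice
    # something like cv2.resize with area interpolation would be used.
    rows = (np.arange(new_h) / scale).astype(int)
    cols = (np.arange(new_w) / scale).astype(int)
    return image[rows][:, cols]
```

A 2048x1000 image, for example, comes out at 1024x500, so the detector always sees inputs within the intended resolution budget.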
MVS — sequence 'reichstag'
Method Date Type Ims (%) #Pts SR TL mAP5o mAP10o mAP15o mAP20o mAP25o ATE By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 95.7 4128.4 90.5 3.48 0.3067 0.4024 0.4708 0.5249 0.5651 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 96.4 2565.4 97.0 2.93 0.2516 0.3680 0.4310 0.4832 0.5263 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 94.7 1743.9 93.0 2.79 0.1505 0.2669 0.3511 0.4131 0.4629 Anonymous We use OpenCV's implementation of the BRISK detector with the default settings, keeping at most 8000 keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 94.4 3590.9 97.5 3.20 0.1229 0.2517 0.3340 0.3847 0.4260 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 95.6 4299.7 97.0 3.04 0.1147 0.2860 0.3690 0.4264 0.4724 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 94.8 3675.1 96.2 3.21 0.1162 0.2502 0.3284 0.3836 0.4200 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 94.9 4309.6 97.2 3.05 0.1096 0.2550 0.3394 0.3964 0.4485 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 86.8 863.4 91.0 2.36 0.0117 0.0724 0.1275 0.1759 0.2156 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 90.8 1653.1 94.0 2.36 0.0143 0.0818 0.1566 0.2076 0.2548 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 62.6 121.4 45.8 2.30 0.0016 0.0113 0.0223 0.0319 0.0417 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 78.7 369.2 77.8 2.36 0.0089 0.0459 0.0895 0.1200 0.1482 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 73.2 162.5 53.0 2.91 0.0547 0.0973 0.1209 0.1389 0.1532 Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are computed with the interpolation of the VGG pool3 feature map on the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 75.6 171.2 61.7 3.01 0.0498 0.1074 0.1424 0.1635 0.1826 Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are computed with the interpolation of the VGG pool3 feature map on the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 82.3 341.6 79.0 3.14 0.0984 0.1749 0.2254 0.2604 0.2864 Anonymous ELF detector: Keypoints are local maxima of a saliency map generated by the gradient of a feature map with respect to the image of a pre-trained CNN. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 95.3 1699.5 95.0 2.97 0.1524 0.2545 0.3115 0.3624 0.4017 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 95.1 1892.9 93.2 2.91 0.1425 0.2266 0.3006 0.3497 0.3919 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 96.6 1813.1 97.0 3.26 0.2109 0.3279 0.4042 0.4626 0.5019 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and Root squared (like RootSIFT) sGLOH2 descriptors using sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded and then matching pairs are computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 97.8 5933.7 96.5 3.33 0.3222 0.4285 0.4937 0.5483 0.5888 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks are plugged into the MODS framework for matching (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis. At most 8000 keypoints per image; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 96.5 6522.3 97.5 3.41 0.3363 0.4431 0.5004 0.5460 0.5798 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis and matching. Maximum number of keypoints to detect: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 88.6 2551.1 83.0 2.76 0.1663 0.2417 0.3006 0.3463 0.3810 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 97.2 6984.9 98.5 3.28 0.2495 0.3605 0.4283 0.4785 0.5177 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 97.9 7051.2 98.2 3.31 0.2399 0.3521 0.4141 0.4673 0.5101 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 97.1 7002.9 99.5 3.28 0.2307 0.3494 0.4270 0.4742 0.5106 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 93.3 2182.3 92.0 2.83 0.1669 0.2666 0.3325 0.3850 0.4194 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 92.4 2068.1 91.2 2.80 0.1584 0.2514 0.3177 0.3605 0.4051 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 98.0 6330.9 98.5 3.28 0.2695 0.3867 0.4540 0.4986 0.5341 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 97.0 5371.2 97.8 3.39 0.2258 0.3288 0.4026 0.4630 0.5033 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are densely extracted from full images instead of image patches, while all other settings stay unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 99.2 6703.2 90.2 3.64 0.6642 0.7632 0.8043 0.8240 0.8371 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 97.4 5624.5 95.0 3.38 0.3033 0.4097 0.4788 0.5296 0.5633 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 97.3 5817.2 97.0 3.41 0.3031 0.3966 0.4632 0.5150 0.5497 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 97.8 6797.9 98.2 3.33 0.2767 0.3813 0.4477 0.4950 0.5404 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 97.6 6504.8 98.8 3.25 0.2300 0.3503 0.4192 0.4664 0.5079 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 98.7 6598.4 92.5 3.61 0.6263 0.7309 0.7760 0.8065 0.8274 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 96.2 4660.9 92.2 3.28 0.2605 0.3551 0.4216 0.4699 0.5110 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 96.8 5811.1 97.0 3.19 0.2201 0.3241 0.3929 0.4449 0.4812 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 87.1 1175.4 84.7 2.80 0.0755 0.1394 0.1915 0.2308 0.2651 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 93.8 1601.8 93.5 3.47 0.2030 0.3008 0.3591 0.4018 0.4331 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 94.3 1567.3 93.2 3.64 0.2446 0.3533 0.4084 0.4488 0.4822 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 92.9 1088.8 93.0 3.54 0.2075 0.3153 0.3722 0.4203 0.4525 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 91.7 914.4 92.0 3.62 0.1855 0.2772 0.3370 0.3825 0.4207 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 93.1 1585.5 94.5 3.50 0.1955 0.2930 0.3598 0.4067 0.4375 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 76.8 189.6 66.5 3.05 0.0376 0.0915 0.1210 0.1440 0.1620 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 87.6 463.8 88.5 3.39 0.1346 0.2248 0.2769 0.3134 0.3449 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 95.1 4993.1 95.8 3.43 0.1759 0.2915 0.3651 0.4184 0.4571 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 97.3 7456.5 97.8 3.22 0.1949 0.3117 0.3781 0.4246 0.4590 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 93.7 1202.0 93.0 3.52 0.2204 0.3234 0.3850 0.4317 0.4649 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 97.4 1249.3 92.2 4.05 0.5040 0.6062 0.6750 0.7181 0.7449 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 97.4 1476.1 89.5 4.10 0.5705 0.6767 0.7210 0.7541 0.7708 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 91.3 2316.1 87.2 3.08 0.1986 0.2845 0.3438 0.3844 0.4171 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 97.4 6743.8 98.5 3.34 0.2794 0.3979 0.4667 0.5176 0.5543 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
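Most entries above label their matcher as 'match:nn' (brute-force nearest-neighbour search on descriptors) or 'match:nn1to1' (the same search with cross-match consistency enforced, i.e. keeping only mutual nearest neighbours). A minimal NumPy sketch of that mutual check is below; the function name and toy descriptors are illustrative, not taken from any submission's code.

```python
import numpy as np

def mutual_nn_match(desc1, desc2):
    """Brute-force nearest-neighbour matching ('nn') with the
    cross-match consistency check ('nn1to1'): keep pair (i, j)
    only if j is i's nearest neighbour AND i is j's."""
    # Pairwise squared L2 distances via ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2.
    d2 = ((desc1 ** 2).sum(1)[:, None]
          - 2.0 * desc1 @ desc2.T
          + (desc2 ** 2).sum(1)[None, :])
    nn12 = d2.argmin(1)  # best match in image 2 for each descriptor of image 1
    nn21 = d2.argmin(0)  # best match in image 1 for each descriptor of image 2
    idx1 = np.arange(len(desc1))
    mutual = nn21[nn12] == idx1  # mutual-consistency mask
    return np.stack([idx1[mutual], nn12[mutual]], axis=1)

# Toy check: desc2 is a permutation of desc1, so matching recovers the permutation.
desc1 = np.eye(4, dtype=np.float32)
desc2 = desc1[[2, 0, 3, 1]]
print(mutual_nn_match(desc1, desc2).tolist())  # [[0, 1], [1, 3], [2, 0], [3, 2]]
```

Cross-checking can only discard matches (ambiguous, one-way ones), never add them, which is why the 'nn1to1' variants trade fewer correspondences for a higher inlier ratio.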

MVS — sequence 'sagrada_familia'
Method Date Type Ims (%) #Pts SR TL mAP5o mAP10o mAP15o mAP20o mAP25o ATE By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 94.8 5940.4 86.0 3.34 0.4935 0.5487 0.5709 0.5878 0.6045 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 98.3 6872.1 94.0 3.48 0.6073 0.7024 0.7381 0.7566 0.7772 Anonymous We use OpenCV's implementation of AKAZE detector, and for each keypoint we extract a descriptor via CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 96.7 4768.0 91.2 2.86 0.3090 0.3994 0.4466 0.4838 0.5194 Anonymous We use OpenCV's implementation of the BRISK detector with default settings, extracting at most 8K keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 89.5 7264.4 87.5 3.15 0.3794 0.4680 0.5189 0.5427 0.5621 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 89.9 10021.2 90.5 2.98 0.3677 0.4655 0.5083 0.5418 0.5670 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 89.6 7110.8 87.8 3.11 0.3851 0.4757 0.5127 0.5392 0.5611 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 89.3 7456.1 90.0 2.85 0.2961 0.4099 0.4576 0.4874 0.5115 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 73.3 380.0 50.5 2.30 0.0129 0.0379 0.0621 0.0849 0.1032 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 79.6 938.9 66.2 2.32 0.0322 0.0897 0.1365 0.1714 0.2008 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 3.0 17.2 0.3 0.60 0.0000 0.0000 0.0000 0.0000 0.0000 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 29.8 103.7 18.2 1.65 0.0001 0.0009 0.0018 0.0031 0.0044 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 67.9 205.1 42.8 2.82 0.0704 0.0888 0.0985 0.1048 0.1102 Anonymous ELF detector: keypoints are local maxima of a saliency map computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 67.4 194.6 44.8 2.73 0.0593 0.0786 0.0889 0.0958 0.1012 Anonymous ELF detector: keypoints are local maxima of a saliency map computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 76.8 407.4 62.7 2.95 0.1847 0.2118 0.2236 0.2345 0.2443 Anonymous ELF detector: keypoints are local maxima of a saliency map computed as the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 96.2 1808.8 88.0 3.56 0.5144 0.6012 0.6402 0.6617 0.6751 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 96.3 1982.9 89.8 3.52 0.5206 0.6096 0.6428 0.6636 0.6857 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 98.5 3099.9 93.8 3.67 0.6417 0.7238 0.7566 0.7788 0.7897 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and Root squared (RootSIFT-like) sGLOH2 descriptors using the sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 99.0 6513.1 97.0 3.83 0.6514 0.7379 0.7711 0.7984 0.8137 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis and matching. Maximum number of keypoints to detect: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 92.6 6240.6 90.8 3.74 0.5166 0.5766 0.5997 0.6147 0.6272 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis and matching. Maximum number of keypoints to detect: 8k; detection done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 90.3 4267.3 77.8 2.90 0.2972 0.3519 0.3802 0.3998 0.4188 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 98.9 7755.0 94.2 3.60 0.6742 0.7390 0.7671 0.7859 0.7980 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 98.4 7764.4 94.2 3.55 0.6458 0.7195 0.7515 0.7726 0.7866 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 98.1 7788.2 95.2 3.54 0.6345 0.7037 0.7335 0.7563 0.7683 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 91.3 5160.2 77.5 2.96 0.3819 0.4405 0.4663 0.4851 0.5012 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 90.3 4920.5 74.8 2.90 0.3649 0.4322 0.4562 0.4720 0.4852 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matching is computed as the Hamming distance between the descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 99.2 7398.7 97.8 3.82 0.6873 0.7552 0.7826 0.7969 0.8147 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 97.7 6186.7 93.0 3.79 0.6268 0.6967 0.7268 0.7460 0.7563 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are densely extracted from full images instead of image patches, while all other settings stay unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 99.9 7109.3 99.2 3.99 0.7882 0.8712 0.9044 0.9231 0.9325 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 98.7 6953.4 96.0 3.80 0.6939 0.7666 0.7983 0.8096 0.8279 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 98.7 7011.0 95.5 3.81 0.6903 0.7544 0.7740 0.7939 0.8091 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 99.4 7983.0 98.5 3.76 0.7048 0.7742 0.8030 0.8179 0.8344 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 98.8 7596.6 96.2 3.67 0.6693 0.7417 0.7682 0.7852 0.8004 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 99.6 7004.5 98.5 3.98 0.7868 0.8671 0.8970 0.9121 0.9276 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 95.5 6397.0 85.0 3.40 0.5238 0.5770 0.6002 0.6150 0.6292 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 97.5 7119.6 92.2 3.55 0.5976 0.6585 0.6857 0.7047 0.7228 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 88.8 1581.5 77.5 3.22 0.3584 0.4222 0.4469 0.4630 0.4728 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 97.1 2094.6 88.2 4.08 0.6137 0.6817 0.7074 0.7189 0.7322 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
Superpoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 97.3 2014.2 91.0 4.14 0.6567 0.7397 0.7685 0.7831 0.7912 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 88.0 1373.6 81.0 3.54 0.4117 0.4641 0.4853 0.4977 0.5074 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 87.2 981.2 79.2 3.54 0.3896 0.4424 0.4623 0.4737 0.4858 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 88.5 1838.3 82.2 3.56 0.4220 0.4732 0.4956 0.5074 0.5159 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 76.2 231.9 55.8 3.10 0.1653 0.1907 0.2019 0.2115 0.2179 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 83.8 472.3 71.8 3.38 0.3075 0.3490 0.3634 0.3772 0.3851 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 89.5 6354.4 87.5 3.34 0.3630 0.4459 0.4828 0.5083 0.5269 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 99.5 7966.5 98.5 3.81 0.6945 0.7823 0.8166 0.8363 0.8447 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over their chosen scales. (This is a baseline where we use Cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 95.8 1868.7 85.8 3.83 0.5707 0.6319 0.6564 0.6721 0.6820 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 98.0 1599.9 91.2 4.09 0.6830 0.7700 0.7949 0.8121 0.8225 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 98.1 1818.0 92.8 4.22 0.6883 0.7644 0.7942 0.8133 0.8285 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 91.6 4425.8 79.0 3.08 0.4146 0.4687 0.4977 0.5097 0.5208 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 99.5 7543.2 99.2 3.93 0.7400 0.8114 0.8463 0.8602 0.8750 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
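Many entries in this table differ only in their matching strategy: 'match:nn' keeps, for each descriptor, its nearest neighbour in the other image, while 'match:nn1to1' additionally enforces cross-match consistency, keeping a pair only when the two descriptors are mutual nearest neighbours. A minimal pure-Python sketch of the two rules (toy 2-D descriptors for illustration; the actual submissions use brute-force matchers such as OpenCV's on 128- or 256-dimensional descriptors):

```python
def l2(a, b):
    """Euclidean distance between two descriptors (sequences of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nearest(query, refs):
    """Index of the descriptor in `refs` closest to `query`."""
    return min(range(len(refs)), key=lambda j: l2(query, refs[j]))

def match_nn(desc_a, desc_b):
    """'match:nn': one match per descriptor in A, its nearest neighbour in B."""
    return [(i, nearest(d, desc_b)) for i, d in enumerate(desc_a)]

def match_nn1to1(desc_a, desc_b):
    """'match:nn1to1': keep only mutual nearest neighbours (cross-check)."""
    fwd = match_nn(desc_a, desc_b)
    back = {j: nearest(d, desc_a) for j, d in enumerate(desc_b)}
    return [(i, j) for i, j in fwd if back[j] == i]
```

The 1:1 variant returns fewer but more reliable matches, consistent with the tables, where nn1to1 rows (e.g. SIFT + ContextDesc) tend to score above their plain-nn counterparts.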

MVS — sequence 'st_pauls_cathedral'
Method Date Type Ims (%) #Pts SR TL mAP5o mAP10o mAP15o mAP20o mAP25o ATE By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 98.1 5029.6 93.0 3.34 0.4401 0.5381 0.6054 0.6487 0.6816 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 99.4 3838.2 98.2 3.34 0.4444 0.5737 0.6481 0.6951 0.7375 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 97.9 2951.3 94.0 2.82 0.2455 0.3941 0.4849 0.5456 0.6024 Anonymous We use OpenCV's implementation of the BRISK detector with default settings, extracting at most 8000 keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 94.5 3953.4 92.5 3.12 0.2786 0.4211 0.5070 0.5661 0.6065 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 95.3 5052.5 94.0 2.92 0.2677 0.4203 0.5044 0.5542 0.6053 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 94.7 4027.0 92.5 3.13 0.2925 0.4395 0.5209 0.5768 0.6201 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 94.9 4921.9 93.5 2.91 0.2615 0.4158 0.5162 0.5794 0.6231 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
DELF
kp:1024, match:nn
19-05-05 F 86.6 764.3 72.5 2.36 0.0304 0.1220 0.2125 0.2769 0.3233 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 90.5 1439.9 84.2 2.34 0.0295 0.1255 0.2347 0.3221 0.3912 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 11.1 32.4 10.8 1.30 0.0000 0.0002 0.0005 0.0007 0.0009 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 75.8 338.4 53.0 2.39 0.0205 0.0828 0.1425 0.1791 0.2028 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 38.7 175.0 32.2 1.97 0.0421 0.0638 0.0742 0.0815 0.0863 Anonymous ELF detector: Keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 44.5 209.0 43.5 2.06 0.0651 0.1088 0.1297 0.1422 0.1516 Anonymous ELF detector: Keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 75.3 285.9 44.8 2.81 0.1219 0.1727 0.1925 0.2039 0.2102 Anonymous ELF detector: Keypoints are local maxima of a saliency map, computed as the gradient of a pre-trained CNN's feature map with respect to the input image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 97.4 1541.3 89.2 3.21 0.3546 0.4782 0.5461 0.6014 0.6410 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 96.5 1732.2 88.5 3.18 0.3593 0.4710 0.5339 0.5833 0.6193 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 99.2 1582.2 95.8 3.41 0.4207 0.5421 0.6161 0.6674 0.7073 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (like RootSIFT) sGLOH2 descriptors using the sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed using the greedy nearest neighbour as in the WISW@CAIP2019 contest. Keypoints and matches are ordered according to their ranks (best ones first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 99.3 5409.4 97.2 3.60 0.4946 0.5948 0.6565 0.7065 0.7519 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis and matching. Maximum number of keypoints: 8000; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 97.1 5559.4 96.0 3.69 0.4895 0.5809 0.6336 0.6708 0.6942 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian, assuming a gravity-aligned orientation. Descriptor: HardNet trained on AMOS plus a mix of other datasets; training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf. The networks were plugged into the MODS framework (https://github.com/ducha-aiki/mods-light-zmq), but without view synthesis and matching. Maximum number of keypoints: 8000; detection is done on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 92.1 3463.2 76.8 2.80 0.2696 0.3522 0.3913 0.4239 0.4531 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 99.4 6959.9 97.0 3.48 0.4576 0.5670 0.6324 0.6850 0.7258 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over their chosen scales. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 99.2 7072.3 97.2 3.50 0.4674 0.5807 0.6505 0.6959 0.7393 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over their chosen scales. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 98.9 7100.3 98.0 3.52 0.4724 0.5723 0.6406 0.6897 0.7307 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over their chosen scales. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 89.8 2453.7 79.8 2.72 0.2058 0.2932 0.3444 0.3862 0.4224 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matches are determined by the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 89.4 2358.0 78.0 2.67 0.1967 0.2852 0.3357 0.3700 0.4064 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract the keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a binary descriptor of 6272 bits. Matches are determined by the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 99.3 6204.4 99.0 3.26 0.4145 0.5301 0.5961 0.6511 0.6984 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 98.2 5378.9 96.5 3.30 0.3933 0.5027 0.5668 0.6189 0.6594 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc where descriptors are densely extracted from full images instead of image patches, while all other settings remain the same as in the original ContextDesc. We find that Dense-ContextDesc performs better in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 99.9 6463.6 99.2 3.51 0.7123 0.8234 0.8776 0.9129 0.9317 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 99.2 5465.5 96.2 3.31 0.4321 0.5493 0.6165 0.6710 0.7125 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 99.2 5972.0 97.2 3.50 0.4401 0.5554 0.6234 0.6820 0.7212 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 99.7 6868.1 98.2 3.52 0.4578 0.5680 0.6376 0.6851 0.7191 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 99.3 6480.5 98.2 3.33 0.3764 0.4962 0.5832 0.6330 0.6822 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 100.0 6347.3 98.8 3.51 0.6772 0.8028 0.8567 0.8880 0.9082 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 96.4 4750.2 89.8 3.11 0.3349 0.4361 0.4979 0.5367 0.5821 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 98.8 5944.1 96.5 3.23 0.3548 0.4763 0.5503 0.6128 0.6506 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 87.5 1255.7 73.0 2.97 0.2381 0.3189 0.3590 0.3897 0.4106 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
Superpoint (nn matcher)
kp:2048, match:nn
19-06-07 F 98.0 1626.5 92.5 3.69 0.4208 0.5450 0.6131 0.6646 0.7078 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
Superpoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 98.8 1587.8 95.0 3.79 0.5113 0.6443 0.7030 0.7471 0.7763 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 93.5 1027.1 88.5 3.59 0.3638 0.4593 0.5128 0.5560 0.5902 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 92.9 830.4 85.2 3.58 0.3786 0.4686 0.5174 0.5500 0.5801 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 93.6 1468.0 90.5 3.51 0.3513 0.4497 0.5199 0.5611 0.5891 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 49.7 184.0 35.2 2.32 0.1264 0.1528 0.1638 0.1699 0.1742 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 89.4 434.7 69.5 3.34 0.3006 0.3722 0.4048 0.4272 0.4503 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 94.8 4882.2 94.5 3.30 0.2568 0.3925 0.4707 0.5367 0.5743 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to obtain the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 99.8 7262.0 98.0 3.54 0.4289 0.5625 0.6483 0.7029 0.7403 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over their chosen scales. (This is a baseline where we use Cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 97.3 1298.0 89.0 3.59 0.4018 0.5150 0.5875 0.6367 0.6728 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 99.5 1230.0 95.0 3.96 0.6269 0.7390 0.7887 0.8344 0.8547 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 99.5 1464.3 96.0 4.00 0.6377 0.7520 0.8112 0.8422 0.8664 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set; keypoint refinement; better descriptor sampling; adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 92.4 2982.1 80.2 2.85 0.2626 0.3615 0.4159 0.4536 0.4859 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 99.8 6758.6 98.2 3.39 0.5165 0.6437 0.7185 0.7677 0.8067 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA
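The SIFT-AID rows above match binary descriptors by Hamming distance against a fixed decision threshold (4000 for their 6272-bit descriptors). A minimal sketch of one plausible reading of that rule, assuming descriptors are packed into Python integers; the tiny 8-bit values in the usage are illustrative only, not real AID descriptors:

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit strings packed as ints."""
    return bin(a ^ b).count("1")

def match_binary(desc_a, desc_b, threshold):
    """Accept every pair (i, j) whose Hamming distance is below `threshold`,
    mirroring the thresholded decision rule described for SIFT-AID."""
    return [(i, j)
            for i, da in enumerate(desc_a)
            for j, db in enumerate(desc_b)
            if hamming(da, db) < threshold]
```

For example, with 8-bit toys, `match_binary([0b11110000, 0b00001111], [0b11110001, 0b10101010], 2)` accepts only the pair differing in a single bit.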

MVS — sequence 'united_states_capitol'
Method Date Type Ims (%) #Pts SR TL mAP5o mAP10o mAP15o mAP20o mAP25o ATE By Details Link Contact Updated Descriptor size
AKAZE (OpenCV)
kp:8000, match:nn
19-04-24 F 91.0 1813.4 86.5 2.91 0.0720 0.1280 0.1847 0.2412 0.2880 Challenge organizers AKAZE, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SphereDesc
kp:8000, match:nn
19-04-26 F 89.0 928.2 82.5 2.76 0.0510 0.1023 0.1606 0.2114 0.2634 Anonymous We use OpenCV's implementation of the AKAZE detector, and for each keypoint we extract a descriptor via a CNN. N/A Anonymous N/A 256 float32
Brisk + SSS
kp:8000, match:nn
19-05-14 F 80.4 701.1 71.0 2.48 0.0194 0.0518 0.0936 0.1299 0.1688 Anonymous We use OpenCV's implementation of the BRISK detector with default settings, extracting at most 8000 keypoints per image. For each keypoint, we extract a descriptor via a CNN model. TBA Anonymous N/A 128 float32
D2-Net (single scale)
kp:8000, match:nn
19-05-07 F 88.8 1326.9 86.8 2.69 0.0220 0.0745 0.1321 0.1910 0.2469 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multiscale)
kp:8000, match:nn
19-05-07 F 89.8 1647.0 89.5 2.76 0.0207 0.0729 0.1323 0.1987 0.2639 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Trained on sequences overlapping with our test set: see 'no PT' for eligible results (this entry is provided only for reference). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (single scale, no PT dataset)
kp:8000, match:nn
19-06-01 F 89.2 1400.0 86.0 2.71 0.0274 0.0861 0.1431 0.1964 0.2513 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Single-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
D2-Net (multi-scale, no PT dataset)
kp:8000, match:nn
19-06-05 F 90.4 1616.3 89.0 2.82 0.0203 0.0651 0.1208 0.1831 0.2547 Challenge organizers D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Multi-scale features, brute-force nearest neighbour matching. Re-trained removing conflicting sequences (models: d2_tf_no_phototourism.pth). Paper: https://dsmn.ml/files/d2-net/d2-net.pdf https://github.com/mihaidusmanu/d2-net imagematching@uvic.ca N/A 512 float32
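Most entries above match float descriptors with brute-force nearest-neighbour search. A minimal numpy sketch of that step (array names and sizes are illustrative, not taken from any submission):

```python
import numpy as np

def match_nn(desc_a, desc_b):
    """Brute-force nearest-neighbour matching: for each descriptor in
    desc_a, return the index of its closest descriptor in desc_b by
    Euclidean distance."""
    # Pairwise squared L2 distances via ||a-b||^2 = ||a||^2 + ||b||^2 - 2ab.
    d2 = (
        (desc_a ** 2).sum(1)[:, None]
        + (desc_b ** 2).sum(1)[None, :]
        - 2.0 * desc_a @ desc_b.T
    )
    return d2.argmin(axis=1)

# Toy example: 3 query descriptors against 4 database descriptors.
rng = np.random.default_rng(0)
db = rng.standard_normal((4, 128)).astype(np.float32)
query = db[[2, 0, 3]] + 0.01  # near-copies of known rows
print(match_nn(query, db))  # -> [2 0 3]
```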
DELF
kp:1024, match:nn
19-05-05 F 81.0 423.7 81.0 2.35 0.0025 0.0256 0.0602 0.0995 0.1363 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:2048, match:nn
19-05-05 F 83.0 653.1 83.8 2.30 0.0044 0.0264 0.0615 0.1043 0.1467 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:256, match:nn
19-05-05 F 61.2 88.2 37.3 2.31 0.0007 0.0043 0.0112 0.0208 0.0286 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
DELF
kp:512, match:nn
19-05-05 F 72.0 207.2 68.2 2.35 0.0014 0.0171 0.0375 0.0634 0.0849 Challenge organizers DELF features for object retrieval, trained on the Google Landmarks dataset. Paper: https://arxiv.org/abs/1612.06321 https://github.com/tensorflow/models/tree/master/research/delf imagematching@uvic.ca N/A 40 float32
ELF-256D
kp:512, match:nn
19-05-07 F 3.7 19.0 5.2 0.60 0.0001 0.0002 0.0002 0.0002 0.0003 Anonymous ELF detector: keypoints are local maxima of a saliency map generated by the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-512D
kp:512, match:nn
19-05-09 F 25.9 62.5 12.5 1.74 0.0000 0.0004 0.0009 0.0013 0.0017 Anonymous ELF detector: keypoints are local maxima of a saliency map generated by the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are computed by interpolating the VGG pool3 feature map at the detected keypoints. TBA Anonymous N/A 256 float32
ELF-SIFT
kp:512, match:nn
19-04-26 F 28.2 79.4 34.2 1.77 0.0017 0.0038 0.0055 0.0067 0.0080 Anonymous ELF detector: keypoints are local maxima of a saliency map generated by the gradient of a pre-trained CNN's feature map with respect to the image. Descriptors are HOG (as in SIFT). N/A Anonymous N/A 128 uint8
SIFT + GeoDesc
kp:2048, match:nn
19-05-19 F 82.5 676.1 81.5 2.69 0.0247 0.0590 0.1021 0.1516 0.1885 Challenge organizers GeoDesc extracted on SIFT keypoints. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:2048, match:nn
19-05-19 F 82.3 715.0 80.8 2.74 0.0359 0.0787 0.1218 0.1618 0.1988 Challenge organizers HardNet extracted on SIFT keypoints. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
HarrisZ/RsGLOH2
kp:8000, match:sGOr2f
19-05-23 F 89.8 653.5 85.8 2.94 0.0999 0.1562 0.2137 0.2636 0.3083 Fabio Bellavia and Carlo Colombo HarrisZ keypoints and root-squared (as in RootSIFT) sGLOH2 descriptors with the sGOr2f* matching strategy. Distance table entries lower than their respective row-wise and column-wise averages are discarded, and matching pairs are then computed using greedy nearest neighbour, as in the WISW@CAIP2019 contest. Keypoints and matches are ordered by rank (best first). http://cvg.dsi.unifi.it/cvg/index.php?id=research#descriptor bellavia.fabio@gmail.com N/A 256 float32
HesAffNet - HardNet2
kp:8000, match:nn
19-05-29 F 90.6 2309.4 90.0 2.84 0.0540 0.1056 0.1619 0.2106 0.2607 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian + AffNet (affine shape) + OriNet (orientation); code and weights from https://github.com/ducha-aiki/affnet, paper: https://arxiv.org/abs/1711.06704. Descriptor: HardNet trained on AMOS plus a mix of other datasets (training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf). The networks were plugged into the MODS framework, but without view synthesis and matching (https://github.com/ducha-aiki/mods-light-zmq). Maximum number of keypoints: 8000; detection performed on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
Hessian - HardNet2
kp:8000, match:nn
19-05-30 F 86.7 2442.2 86.0 2.97 0.0555 0.1024 0.1572 0.2004 0.2431 Milan Pultar, Dmytro Mishkin, Jiří Matas Detector: Hessian; the gravity vector orientation is assumed. Descriptor: HardNet trained on AMOS plus a mix of other datasets (training code: https://github.com/pultarmi/HardNet_MultiDataset, paper: https://arxiv.org/pdf/1901.09780.pdf). The networks were plugged into the MODS framework, but without view synthesis and matching (https://github.com/ducha-aiki/mods-light-zmq). Maximum number of keypoints: 8000; detection performed on 2x upsampled images. TBA ducha.aiki@gmail.com N/A 128 uint8
ORB (OpenCV)
kp:8000, match:nn
19-04-24 F 74.6 1198.0 60.0 2.39 0.0159 0.0394 0.0613 0.0846 0.1068 Challenge organizers ORB, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
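Several entries cap detection at 8000 keypoints per image. A common way to enforce such a budget (an assumption here, not something the table specifies) is to keep the detections with the strongest responses:

```python
import numpy as np

def top_k_keypoints(responses, k=8000):
    """Return indices of the k keypoints with the largest response,
    strongest first; returns everything if fewer than k are detected."""
    order = np.argsort(-responses)  # sort descending by response
    return order[:k]

# Toy example with 5 detections and a budget of 3.
responses = np.array([0.2, 0.9, 0.1, 0.7, 0.5])
print(top_k_keypoints(responses, k=3))  # -> [1 3 4]
```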
Scale-invariant desc. (Log-Polar, lambda=32)
kp:8000, match:nn
19-06-25 F 92.6 2591.7 92.2 2.88 0.0767 0.1290 0.1863 0.2457 0.2978 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=64)
kp:8000, match:nn
19-06-24 F 92.4 2676.2 91.8 2.90 0.0788 0.1378 0.2032 0.2611 0.3144 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
Scale-invariant desc. (Log-Polar, lambda=96)
kp:8000, match:nn
19-06-20 F 92.7 2692.5 93.5 2.88 0.0727 0.1337 0.1957 0.2622 0.3221 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
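The log-polar entries above resample the patch around each keypoint on a log-polar grid before computing the descriptor. A hedged sketch of such a sampling grid (ring and angle counts and the radii are illustrative; the actual method is described in the ICCV 2019 paper cited in the entries):

```python
import numpy as np

def log_polar_grid(radius, n_rings=8, n_angles=16, r_min=1.0):
    """Sampling offsets (dx, dy) for a log-polar grid: ring radii are
    geometrically spaced between r_min and radius, angles are uniform."""
    radii = np.geomspace(r_min, radius, n_rings)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    dx = radii[:, None] * np.cos(angles)[None, :]
    dy = radii[:, None] * np.sin(angles)[None, :]
    return dx, dy

dx, dy = log_polar_grid(radius=32.0)
print(dx.shape)  # -> (8, 16)
```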
SIFT-AID (NN matcher)
kp:8000, match:nn
19-05-10 F 78.1 550.6 70.2 2.51 0.0088 0.0263 0.0502 0.0817 0.1114 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a 6272-bit binary descriptor. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
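The SIFT-AID entries match 6272-bit binary descriptors by Hamming distance, accepting pairs below a decision threshold of 4000. A numpy sketch on bit-packed descriptors (the packing is an assumption; only the distance and threshold come from the entry):

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two bit-packed uint8 descriptors."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

def is_match(a, b, threshold=4000):
    """Accept the pair if the Hamming distance is below the decision
    threshold (4000, as quoted in the SIFT-AID entries)."""
    return hamming(a, b) < threshold

# 6272 bits = 784 bytes per descriptor.
d1 = np.zeros(784, dtype=np.uint8)
d2 = np.full(784, 255, dtype=np.uint8)  # all 6272 bits differ
print(hamming(d1, d2))  # -> 6272
print(is_match(d1, d1))  # -> True
```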
SIFT-AID (custom matcher)
kp:8000, match:sift-aid
19-04-29 F/M 77.4 521.8 68.5 2.49 0.0060 0.0208 0.0491 0.0757 0.1008 Mariano Rodríguez, Gabriele Facciolo, Rafael Grompone Von Gioi, Pablo Musé, Jean-Michel Morel, Julie Delon We extract keypoints using OpenCV's implementation of SIFT. The AID descriptors are computed with a CNN from patches extracted at each keypoint location; the result is a 6272-bit binary descriptor. Matching is computed as the Hamming distance between descriptors, with the decision threshold set at 4000. Preprint: https://hal.archives-ouvertes.fr/hal-02016010. Code: https://github.com/rdguez-mariano/sift-aid https://hal.archives-ouvertes.fr/hal-02016010 facciolo@cmla.ens-cachan.fr N/A 6272 bits
SIFT + ContextDesc
kp:8000, match:nn
19-05-09 F 91.9 1870.7 90.8 2.77 0.0663 0.1178 0.1735 0.2268 0.2712 Zixin Luo ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 uint8
SIFT-Dense-ContextDesc
kp:8000, match:nn
19-05-28 F 88.5 1579.1 86.0 2.69 0.0515 0.0934 0.1371 0.1923 0.2406 Zixin Luo, Jiahui Zhang Dense-ContextDesc is a variant of ContextDesc in which descriptors are densely extracted from full images instead of image patches, while all other settings remain unchanged from the original ContextDesc. We find that Dense-ContextDesc performs better, in particular under illumination changes. Dense-ContextDesc is extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features are quantized to uint8 and extracted from the code provided by the authors. The model will be available on the authors' GitHub page. TBA zluoag@cse.ust.hk N/A TBA
SIFT + ContextDesc + Inlier Classification V2
kp:8000, match:custom
19-05-28 F/M 93.6 2254.2 95.8 2.89 0.0986 0.1665 0.2345 0.2961 0.3552 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an improved inlier classification and fundamental matrix estimation network based on [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT-GeoDesc-GitHub
kp:8000, match:nn
19-05-08 F 90.8 1685.8 89.0 2.80 0.0634 0.1011 0.1497 0.2003 0.2523 Zixin Luo GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. Features extracted from the code provided by the authors. https://arxiv.org/abs/1807.06294 zluoag@cse.ust.hk N/A 128 float32
SIFT + GeoDesc
kp:8000, match:nn
19-04-24 F 91.3 2088.3 90.2 2.86 0.0640 0.1096 0.1616 0.2239 0.2710 Challenge organizers GeoDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://github.com/lzx551402/geodesc imagematching@uvic.ca N/A 128 float32
SIFT + HardNet
kp:8000, match:nn
19-04-24 F 92.5 2496.1 93.5 2.87 0.0659 0.1175 0.1794 0.2339 0.2888 Challenge organizers HardNet extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/DagnyT/hardnet imagematching@uvic.ca N/A 128 float32
SIFT + L2-Net
kp:8000, match:nn
19-04-24 F 91.1 2193.0 93.0 2.87 0.0671 0.1217 0.1778 0.2377 0.2895 Challenge organizers L2-Net extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/yuruntian/L2-Net imagematching@uvic.ca N/A 128 float32
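Several of the patch-descriptor entries enlarge the measurement region relative to the SIFT convention by a scale multiplying factor of 16/12. A sketch of that scaling (the base radius per unit of keypoint scale is an assumption for illustration; only the 16/12 factor comes from the entries):

```python
def patch_radius(keypoint_scale, mult=16.0 / 12.0, base=6.0):
    """Support radius for patch extraction around a SIFT keypoint.
    `base` is a nominal radius per unit of scale (an assumption);
    `mult` enlarges it by the 16/12 factor quoted in the entries."""
    return base * keypoint_scale * mult

print(patch_radius(3.0))  # -> 24.0
```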
SIFT + ContextDesc + Inlier Classification V1
kp:8000, match:custom
19-05-29 F/M 93.8 2146.5 96.8 2.88 0.0897 0.1619 0.2190 0.2876 0.3456 Dawei Sun, Zixin Luo, Jiahui Zhang We use the SIFT detector and ContextDesc descriptor, and then we train an inlier classification and fundamental matrix estimation network using the architecture of [Yi et al. CVPR2018] (https://arxiv.org/pdf/1711.05971.pdf). https://github.com/lzx551402/contextdesc zluoag@cse.ust.hk N/A 128 float32
SIFT (OpenCV)
kp:8000, match:nn
19-04-24 F 82.8 1322.7 79.8 2.65 0.0441 0.0756 0.1131 0.1496 0.1827 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SIFT + TFeat
kp:8000, match:nn
19-04-24 F 90.0 1949.4 87.8 2.74 0.0448 0.0903 0.1402 0.1946 0.2422 Challenge organizers T-Feat extracted on SIFT keypoints. Number of keypoints: 8000 per image. Models trained on the Liberty sequence of the Brown dataset. We use slightly larger patches than specified for SIFT (scale multiplying factor 16/12). Feature matching with brute-force nearest-neighbour search. https://github.com/vbalnt/tfeat imagematching@uvic.ca N/A 128 float32
SIFT (OpenCV)
kp:2048, match:nn
19-05-17 F 72.9 516.2 63.2 2.54 0.0216 0.0431 0.0652 0.0850 0.1032 Challenge organizers SIFT, as implemented in OpenCV. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A 128 float32
SuperPoint (nn matcher)
kp:2048, match:nn
19-06-07 F 86.9 669.9 86.2 2.85 0.0493 0.1073 0.1618 0.2121 0.2538 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force nearest-neighbour matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
SuperPoint (1:1 matcher)
kp:2048, match:nn1to1
19-06-07 F 88.3 656.0 87.2 2.98 0.0628 0.1220 0.1814 0.2381 0.2909 Challenge organizers SuperPoint features from the submission 'SuperPoint + Custom Matcher (v2)' with a brute-force 1:1 matcher instead of the custom matcher. For reference. TBA imagematching@uvic.ca N/A 256 float32
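The 1:1 (nn1to1) entries enforce cross-match consistency: a pair is kept only when each descriptor is the other's nearest neighbour. A numpy sketch (array names and sizes are illustrative):

```python
import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    """Keep only pairs (i, j) where j is the nearest neighbour of i in
    desc_b AND i is the nearest neighbour of j in desc_a."""
    # Pairwise squared L2 distances via ||a-b||^2 = ||a||^2 + ||b||^2 - 2ab.
    d2 = (
        (desc_a ** 2).sum(1)[:, None]
        + (desc_b ** 2).sum(1)[None, :]
        - 2.0 * desc_a @ desc_b.T
    )
    a_to_b = d2.argmin(axis=1)
    b_to_a = d2.argmin(axis=0)
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]

# Toy example: two queries that are near-copies of database rows 3 and 1.
rng = np.random.default_rng(1)
db = rng.standard_normal((5, 32)).astype(np.float32)
query = db[[3, 1]] + 0.01
print(mutual_nn_matches(query, db))  # -> [(0, 3), (1, 1)]
```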
SuperPoint (default)
kp:2048, match:nn
19-04-24 F 82.5 376.4 75.0 2.92 0.0608 0.1068 0.1470 0.1855 0.2230 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned (about 1200 on average). Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:1024, match:nn
19-04-26 F 79.4 335.0 74.2 2.87 0.0524 0.0956 0.1302 0.1667 0.1983 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:2048, match:nn
19-04-26 F 84.4 537.5 77.5 2.87 0.0563 0.1054 0.1501 0.1934 0.2285 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:256, match:nn
19-04-26 F 26.1 75.3 18.0 1.75 0.0007 0.0011 0.0016 0.0017 0.0023 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:512, match:nn
19-04-26 F 66.6 152.6 57.0 2.59 0.0231 0.0409 0.0585 0.0746 0.0871 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
SuperPoint
kp:8000, match:nn
19-04-26 F 86.7 1606.2 84.8 2.63 0.0276 0.0716 0.1208 0.1717 0.2196 Challenge organizers SuperPoint features. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We lower the default detection threshold to take the number of features indicated in the label. Feature matching done with brute-force nearest-neighbour search. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork imagematching@uvic.ca N/A 256 float32
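The SuperPoint entries downsample images so that the largest dimension is at most 1024 pixels. A sketch of the size computation (the rounding behaviour is an assumption):

```python
def target_size(width, height, max_dim=1024):
    """New (width, height) with the largest side clamped to max_dim,
    preserving aspect ratio; images already small enough are untouched."""
    scale = max_dim / max(width, height)
    if scale >= 1.0:
        return width, height
    return round(width * scale), round(height * scale)

print(target_size(2048, 1536))  # -> (1024, 768)
print(target_size(800, 600))   # -> (800, 600)
```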
Scale-invariant desc. (Cartesian, lambda=16)
kp:8000, match:nn
19-07-29 F 91.8 2760.2 94.5 2.75 0.0476 0.1005 0.1591 0.2132 0.2671 Patrick Ebel We compute scale-invariant descriptors with a log-polar transformation of the patch. Keypoints are DoG, with a scaling factor of lambda/12 over its chosen scale. (This is a baseline where we use Cartesian patches instead.) Reference: 'Beyond Cartesian Representations for Local Descriptors', Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls, ICCV 2019. TBA patrick.ebel@epfl.ch N/A 128 float32
SuperPoint (trained on coco + phototourism training set)
kp:2048, match:nn
19-05-30 F 85.0 464.9 82.5 2.94 0.0617 0.1122 0.1590 0.2069 0.2538 Daniel DeTone, Paul Sarlin, Tomasz Malisiewicz, Andrew Rabinovich SuperPoint V1 model trained on COCO homographic warps at VGA resolution, plus pairs from the phototourism training set using the GT poses and depths for correspondence. If necessary, we downsample the images so that the largest dimension is at most 1024 pixels. We extract features with the default parameters and use however many are returned. https://github.com/MagicLeapResearch/SuperPointPretrainedNetwork ddetone@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v1)
kp:2048, match:custom
19-05-30 F/M 91.1 525.3 88.8 3.23 0.1240 0.1938 0.2596 0.3222 0.3717 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SuperPoint + Custom Matcher (v2)
kp:2048, match:custom
19-05-28 F/M 91.8 640.3 90.5 3.26 0.1085 0.1803 0.2540 0.3161 0.3740 Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich Features are extracted by a modified SuperPoint: retrained on the phototourism training set, with keypoint refinement, better descriptor sampling, and adjusted thresholds. A custom matcher estimates the image rotation and rejects outlier matches using the correspondence classifier introduced in 'Learning To Find Good Correspondences' (Yi et al., 2018). TBA pesarlin@magicleap.com N/A 256 float32
SURF (OpenCV)
kp:8000, match:nn
19-04-24 F 82.8 967.7 73.0 2.52 0.0190 0.0419 0.0760 0.1135 0.1451 Challenge organizers SURF, as implemented in OpenCV. Number of keypoints: 8000 per image. Feature matching with brute-force nearest-neighbour search. https://opencv.org imagematching@uvic.ca N/A TBA
SIFT + ContextDesc
kp:8000, match:nn1to1
19-06-07 F 91.9 2349.6 93.2 2.78 0.0712 0.1350 0.1965 0.2561 0.3058 Challenge organizers ContextDesc extracted on SIFT keypoints. Number of keypoints: 8000 per image. Feature matching with nearest-neighbour search, enforcing cross-match consistency. Features are quantized to uint8 and extracted from the code provided by the authors. https://github.com/lzx551402/contextdesc imagematching@uvic.ca N/A TBA