A common problem in wide-baseline matching is the sparse and non-uniform distribution of correspondences when using conventional detectors such as SIFT, SURF, FAST, A-KAZE and MSER. In this paper we introduce a novel segmentation-based feature detector (SFD) that produces an increased number of accurate features for wide-baseline matching. A multi-scale SFD is proposed using bilateral image decomposition to produce a large number of scale-invariant features for wide-baseline reconstruction. All input images are over-segmented into regions using any existing segmentation technique, such as Watershed, Mean-shift or SLIC. Feature points are then detected at the intersection of the boundaries of three or more regions. The detected feature points are local maxima of the image function. The key advantage of feature detection based on segmentation is that it does not require global threshold setting and can therefore detect features throughout the image. A comprehensive evaluation demonstrates that SFD gives an increased number of features which are accurately localised and matched between wide-baseline camera views; the number of features for a given matching error increases by a factor of 3-5 compared to SIFT; feature detection and matching performance is maintained with increasing baseline between views; and multi-scale SFD improves matching performance at varying scales. Application of SFD to sparse multi-view wide-baseline reconstruction demonstrates a factor-of-ten increase in the number of reconstructed points, with improved scene coverage compared to SIFT/MSER/A-KAZE. Evaluation against ground truth shows that SFD produces an increased number of wide-baseline matches with reduced error.
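The core detection rule described above (candidate feature points where the boundaries of three or more segmented regions meet) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the 2x2-window junction test, and the toy label map are all assumptions, and the refinement of each point as a local maximum of the image function is omitted.

```python
import numpy as np

def sfd_keypoints(labels):
    """Candidate SFD points: scan every 2x2 window of a segmentation
    label image and keep windows containing >= 3 distinct region labels,
    i.e. junctions where three or more region boundaries intersect."""
    h, w = labels.shape
    points = []
    for y in range(h - 1):
        for x in range(w - 1):
            window = labels[y:y + 2, x:x + 2]
            if len(np.unique(window)) >= 3:
                points.append((y, x))
    return points

# Toy over-segmentation: four square regions meeting at the centre,
# so exactly one junction of >= 3 region boundaries exists.
labels = np.zeros((6, 6), dtype=int)
labels[:3, 3:] = 1
labels[3:, :3] = 2
labels[3:, 3:] = 3
print(sfd_keypoints(labels))  # → [(2, 2)]
```

In practice the label image would come from any over-segmentation method (e.g. Watershed, Mean-shift or SLIC, as named in the abstract), and the multi-scale variant repeats this detection on each level of a bilateral image decomposition.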


MSFD: Multi-scale segmentation based feature detection for wide-baseline scene reconstruction
Armin Mustafa, Hansung Kim and Adrian Hilton


Data used in this work can be found in the CVSSP 3D Data Repository.


		@article{mustafa2019msfd,
			author = {Mustafa, A. and Kim, H. and Hilton, A.},
			title = {MSFD: Multi-scale segmentation based feature detection for wide-baseline scene reconstruction},
			journal = {IEEE Transactions on Image Processing},
			year = {2019},
			pages = {1118--1132}
		}


This research was supported by the Royal Academy of Engineering Research Fellowship RF-201718-17177, and the European Commission and EPSRC Platform Grant on Audio-Visual Media Research EP/P022529.