A keypoint is a small patch of an image, while a descriptor is a mathematical structure, typically a vector of floating-point numbers, that summarizes that patch. How best to abstract image information into a descriptor is a central problem. A good descriptor should also provide a degree of invariance, for example to rotation, so that it remains useful across different scenes.
The three main applications of keypoints and descriptors are tracking, object recognition, and stereo reconstruction. Whatever the use case, the underlying processing logic is the same: first find the keypoints in the image, then compute a descriptor for each of them, and finally find matches based on those descriptors: detect, describe, match.
To represent keypoints, OpenCV defines the cv::KeyPoint class as follows:
class cv::KeyPoint {
public:
    cv::Point2f pt;  // coordinates of the keypoint
    float size;      // diameter of the meaningful keypoint neighborhood
    float angle;     // computed orientation of the keypoint (-1 if none)
    float response;  // response for which the keypoint was selected
    int octave;      // octave (pyramid layer) keypoint was extracted from
    int class_id;    // object id, can be used to cluster keypoints by object

    cv::KeyPoint(
        cv::Point2f _pt,
        float _size,
        float _angle = -1,
        float _response = 0,
        int _octave = 0,
        int _class_id = -1
    );
    cv::KeyPoint(
        float x, float y,
        float _size,
        float _angle = -1,
        float _response = 0,
        int _octave = 0,
        int _class_id = -1
    );
    ...
};
Member notes:

- pt: coordinates of the keypoint in the image.
- size: diameter of the meaningful neighborhood around the keypoint.
- angle: computed orientation of the keypoint, or -1 if not applicable.
- response: the detector response for which the keypoint was selected; useful for ranking or filtering keypoints.
- octave: the pyramid layer (octave) from which the keypoint was extracted.
- class_id: object id, which can be used to cluster keypoints by object.
To detect keypoints and compute descriptors, OpenCV defines the following abstract class:
class cv::Feature2D : public cv::Algorithm {
public:
    virtual void detect(
        cv::InputArray image,               // Image on which to detect
        vector< cv::KeyPoint >& keypoints,  // Array of found keypoints
        cv::InputArray mask = cv::noArray()
    ) const;
    virtual void detect(
        cv::InputArrayOfArrays images,               // Images on which to detect
        vector<vector< cv::KeyPoint > >& keypoints,  // keypoints for each image
        cv::InputArrayOfArrays masks = cv::noArray()
    ) const;
    virtual void compute(
        cv::InputArray image,                  // Image where keypoints are located
        std::vector<cv::KeyPoint>& keypoints,  // input/output vector of keypoints
        cv::OutputArray descriptors            // computed descriptors, M x N matrix,
    );                                         // where M is the number of keypoints
                                               // and N is the descriptor size
    virtual void compute(
        cv::InputArrayOfArrays image,                        // Images where keypoints are located
        std::vector<std::vector<cv::KeyPoint> >& keypoints,  // I/O vec of keypoints
        cv::OutputArrayOfArrays descriptors                  // computed descriptors,
    );                                                       // vector of (Mi x N) matrices, where
                                                             // Mi is the number of keypoints in
                                                             // the i-th image and N is the
                                                             // descriptor size
    virtual void detectAndCompute(
        cv::InputArray image,                  // Image on which to detect
        cv::InputArray mask,                   // Optional region of interest mask
        std::vector<cv::KeyPoint>& keypoints,  // found or provided keypoints
        cv::OutputArray descriptors,           // computed descriptors
        bool useProvidedKeypoints = false      // if true, the provided keypoints
    );                                         // are used, otherwise they are detected

    virtual int descriptorSize() const;  // size of each descriptor in elements
    virtual int descriptorType() const;  // type of descriptor elements
    virtual int defaultNorm() const;     // the recommended norm to be used
                                         // for comparing descriptors. Usually,
                                         // it's NORM_HAMMING for binary
                                         // descriptors and NORM_L2 for all others.

    virtual void read(const cv::FileNode&);
    virtual void write(cv::FileStorage&) const;
    ...
};
Function notes:

- detect(): finds keypoints in one image, or in each image of a list; an optional mask restricts the search region.
- compute(): computes a descriptor for each provided keypoint; the result is an M x N matrix, where M is the number of keypoints and N is the descriptor size.
- detectAndCompute(): detects keypoints and computes their descriptors in one call; if useProvidedKeypoints is true, descriptors are computed for the supplied keypoints instead of freshly detected ones.
- descriptorSize() / descriptorType(): the size and element type of each descriptor.
- defaultNorm(): the recommended norm for comparing descriptors, usually NORM_HAMMING for binary descriptors and NORM_L2 for all others.
A concrete algorithm need not override every method; for example, a pure detector such as FAST implements detect() but not compute().
A matcher tries to match the keypoints of one image against another image or a set of images; successful matches are returned as a list of cv::DMatch objects.
class cv::DMatch {
public:
    DMatch();  // sets this->distance to std::numeric_limits<float>::max()
    DMatch(int _queryIdx, int _trainIdx, float _distance);
    DMatch(int _queryIdx, int _trainIdx, int _imgIdx, float _distance);

    int queryIdx;    // query descriptor index
    int trainIdx;    // train descriptor index
    int imgIdx;      // train image index
    float distance;

    bool operator<(const DMatch &m) const;  // Comparison operator based on 'distance'
};
Member notes:

- queryIdx, trainIdx: indices of the matched descriptors in the query list and the train list, respectively.
- imgIdx: index of the train image the match came from, used when matching against multiple images.
- distance: distance between the two descriptors; smaller means a better match.
- operator&lt;: compares matches by distance, so sorting orders matches from best to worst.
Matchers are typically used in two scenarios: object recognition and tracking. Object recognition requires training the matcher, that is, supplying descriptors that best distinguish the known objects, and then, given query descriptors, asking which descriptors in this dictionary match them. Tracking instead takes two lists of descriptors and reports the matches between them.
cv::DescriptorMatcher provides three functions: match(), knnMatch(), and radiusMatch(). Each comes in two variants, one for recognition and one for tracking: the recognition variants take a single list of query descriptors and match it against the trained dictionary, while the tracking variants take two descriptor lists.
class cv::DescriptorMatcher {
public:
    virtual void add(InputArrayOfArrays descriptors);    // Add train descriptors
    virtual void clear();                                // Clear train descriptors
    virtual bool empty() const;                          // true if no descriptors
    void train();                                        // Train matcher
    virtual bool isMaskSupported() const = 0;            // true if supports masks
    const vector<cv::Mat>& getTrainDescriptors() const;  // Get train descriptors

    // methods to match descriptors from one list vs. "trained" set (recognition)
    //
    void match(
        InputArray queryDescriptors,
        vector<cv::DMatch>& matches,
        InputArrayOfArrays masks = noArray()
    );
    void knnMatch(
        InputArray queryDescriptors,
        vector< vector<cv::DMatch> >& matches,
        int k,
        InputArrayOfArrays masks = noArray(),
        bool compactResult = false
    );
    void radiusMatch(
        InputArray queryDescriptors,
        vector< vector<cv::DMatch> >& matches,
        float maxDistance,
        InputArrayOfArrays masks = noArray(),
        bool compactResult = false
    );

    // methods to match descriptors from two lists (tracking)
    //
    // Find one best match for each query descriptor
    void match(
        InputArray queryDescriptors,
        InputArray trainDescriptors,
        vector<cv::DMatch>& matches,
        InputArray mask = noArray()
    ) const;
    // Find k best matches for each query descriptor (in increasing
    // order of distances)
    void knnMatch(
        InputArray queryDescriptors,
        InputArray trainDescriptors,
        vector< vector<cv::DMatch> >& matches,
        int k,
        InputArray mask = noArray(),
        bool compactResult = false
    ) const;
    // Find best matches for each query descriptor with distance less
    // than maxDistance
    void radiusMatch(
        InputArray queryDescriptors,
        InputArray trainDescriptors,
        vector< vector<cv::DMatch> >& matches,
        float maxDistance,
        InputArray mask = noArray(),
        bool compactResult = false
    ) const;

    virtual void read(const FileNode&);      // Reads matcher from a file node
    virtual void write(FileStorage&) const;  // Writes matcher to a file storage

    virtual cv::Ptr<cv::DescriptorMatcher> clone(
        bool emptyTrainData = false
    ) const = 0;

    static cv::Ptr<cv::DescriptorMatcher> create(
        const string& descriptorMatcherType
    );
    ...
};
Function notes:

- add() / clear() / empty() / getTrainDescriptors(): manage the internal set of train (dictionary) descriptors.
- train(): builds the internal index over the train descriptors; for brute-force matchers this is a no-op, while FLANN-based matchers build their search structures here.
- match(): finds the single best match for each query descriptor.
- knnMatch(): finds the k best matches for each query descriptor, in increasing order of distance.
- radiusMatch(): finds all matches within maxDistance of each query descriptor; despite the name, this is a threshold in descriptor space, not a pixel radius.
- clone() / create(): copy an existing matcher, or create one by name (e.g. "BruteForce" or "FlannBased").
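To make the semantics of the three matching functions concrete, here is a pure-Python sketch (not OpenCV code) over toy 2-D descriptors compared with the L2 distance. Matches are represented as (queryIdx, trainIdx, distance) tuples, mirroring cv::DMatch.

```python
import math

def l2(a, b):
    return math.dist(a, b)

def match(query, train):
    """One best match per query descriptor."""
    out = []
    for qi, q in enumerate(query):
        ti = min(range(len(train)), key=lambda t: l2(q, train[t]))
        out.append((qi, ti, l2(q, train[ti])))
    return out

def knn_match(query, train, k):
    """k best matches per query descriptor, in increasing order of distance."""
    return [sorted(((qi, ti, l2(q, t)) for ti, t in enumerate(train)),
                   key=lambda m: m[2])[:k]
            for qi, q in enumerate(query)]

def radius_match(query, train, max_distance):
    """All train descriptors within max_distance of each query descriptor."""
    return [[(qi, ti, l2(q, t)) for ti, t in enumerate(train)
             if l2(q, t) <= max_distance]
            for qi, q in enumerate(query)]

query = [(0.0, 0.0), (10.0, 0.0)]
train = [(0.0, 1.0), (9.0, 0.0), (100.0, 100.0)]
```

With these toy descriptors, match() pairs each query with its nearest train descriptor, knn_match() additionally returns the runner-up, and radius_match() excludes the distant outlier (100, 100) entirely.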
Keypoint filters are used to select the better keypoints from an existing set, or to remove duplicate keypoints.
class cv::KeyPointsFilter {
public:
    static void runByImageBorder(
        vector< cv::KeyPoint >& keypoints,  // in/out list of keypoints
        cv::Size imageSize,                 // Size of original image
        int borderSize                      // Size of border in pixels
    );
    static void runByKeypointSize(
        vector< cv::KeyPoint >& keypoints,  // in/out list of keypoints
        float minSize,                      // Smallest keypoint to keep
        float maxSize = FLT_MAX             // Largest one to keep
    );
    static void runByPixelsMask(
        vector< cv::KeyPoint >& keypoints,  // in/out list of keypoints
        const cv::Mat& mask                 // Keep where mask is nonzero
    );
    static void removeDuplicated(
        vector< cv::KeyPoint >& keypoints   // in/out list of keypoints
    );
    static void retainBest(
        vector< cv::KeyPoint >& keypoints,  // in/out list of keypoints
        int npoints                         // Keep this many
    );
};
Function notes:

- runByImageBorder(): removes keypoints within borderSize pixels of the image edge.
- runByKeypointSize(): keeps only keypoints whose size lies in [minSize, maxSize].
- runByPixelsMask(): keeps only keypoints located where the mask is nonzero.
- removeDuplicated(): removes duplicate keypoints.
- retainBest(): keeps the npoints keypoints with the highest response.
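The filtering semantics are simple enough to sketch in a few lines of pure Python (an illustration, not the OpenCV implementation), using a tuple with the same fields a cv::KeyPoint carries:

```python
from collections import namedtuple

KeyPoint = namedtuple("KeyPoint", "x y size response")

def run_by_image_border(keypoints, image_size, border_size):
    """Drop keypoints within border_size pixels of the image edge."""
    w, h = image_size
    return [k for k in keypoints
            if border_size <= k.x < w - border_size
            and border_size <= k.y < h - border_size]

def retain_best(keypoints, npoints):
    """Keep the npoints keypoints with the highest response."""
    return sorted(keypoints, key=lambda k: k.response, reverse=True)[:npoints]

kps = [KeyPoint(2, 2, 4.0, 0.9),   # strong response, but too close to the border
       KeyPoint(50, 40, 6.0, 0.5),
       KeyPoint(60, 30, 3.0, 0.8)]
```

Note that the two filters disagree here: the border filter discards the keypoint at (2, 2) even though retain_best would rank it first, which is why border filtering is usually applied before response-based selection.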
Brute force matching with cv::BFMatcher
class cv::BFMatcher : public cv::DescriptorMatcher {
public:
    BFMatcher(int normType, bool crossCheck = false);
    virtual ~BFMatcher() {}

    virtual bool isMaskSupported() const { return true; }
    virtual Ptr<DescriptorMatcher> clone(bool emptyTrainData = false) const;
    ...
};
Brute-force matching simply compares every query descriptor against every train descriptor; the only thing that must be specified is the distance metric (normType).
If crossCheck is set to true, a pair is reported only when each descriptor is the other's nearest neighbor. This effectively reduces false matches, at the cost of extra computation time.
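The cross-check rule can be illustrated with a pure-Python sketch over toy one-dimensional descriptors (an illustration of the idea, not OpenCV's implementation):

```python
def nearest(desc, candidates):
    """Index of the candidate with the smallest absolute difference."""
    return min(range(len(candidates)), key=lambda i: abs(desc - candidates[i]))

def cross_checked_matches(query, train):
    """Keep (qi, ti) only if query[qi] and train[ti] are mutual nearest neighbors."""
    out = []
    for qi, q in enumerate(query):
        ti = nearest(q, train)
        if nearest(train[ti], query) == qi:  # does the match hold in both directions?
            out.append((qi, ti))
    return out

# Toy 1-D descriptors: query[1] and query[2] both prefer train[1],
# but train[1]'s nearest query is query[2], so the pair (1, 1) is rejected.
query = [0.0, 4.0, 5.0]
train = [0.1, 5.2]
```

Without the reverse check, query[1] would produce a spurious match to train[1]; the mutual-nearest-neighbor condition filters it out.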
Fast approximate nearest neighbors and cv::FlannBasedMatcher
class cv::FlannBasedMatcher : public cv::DescriptorMatcher {
public:
    FlannBasedMatcher(
        const cv::Ptr< cv::flann::IndexParams >& indexParams
            = new cv::flann::KDTreeIndexParams(),
        const cv::Ptr< cv::flann::SearchParams >& searchParams
            = new cv::flann::SearchParams()
    );

    virtual void add(const vector<Mat>& descriptors);
    virtual void clear();
    virtual void train();
    virtual bool isMaskSupported() const;

    virtual void read(const FileNode&);      // Read from file node
    virtual void write(FileStorage&) const;  // Write to file storage

    virtual cv::Ptr<DescriptorMatcher> clone(bool emptyTrainData = false) const;
    ...
};
Parameter notes:

- indexParams: selects and configures the index structure FLANN builds over the train descriptors (linear scan, kd-trees, k-means tree, LSH, and so on).
- searchParams: controls how that index is searched.
SearchParams:
struct cv::flann::SearchParams : public cv::flann::IndexParams {
    SearchParams(
        int checks = 32,    // Limit on NN candidates to check
        float eps = 0,      // (Not used right now)
        bool sorted = true  // Sort multiple returns if 'true'
    );
};
indexParams:
// Equivalent to cv::BFMatcher
cv::FlannBasedMatcher matcher(
    new cv::flann::LinearIndexParams(),  // Default index parameters
    new cv::flann::SearchParams()        // Default search parameters
);
cv::FlannBasedMatcher matcher(
    new cv::flann::KDTreeIndexParams(16),  // Index using 16 kd-trees
    new cv::flann::SearchParams()          // Default search parameters
);
struct cv::flann::KMeansIndexParams : public cv::flann::IndexParams {
    KMeansIndexParams(
        int branching = 32,    // Branching factor for tree
        int iterations = 11,   // Max for k-means stage
        float cb_index = 0.2,  // Probably don't mess with
        cv::flann::flann_centers_init_t centers_init = cv::flann::CENTERS_RANDOM
    );
};
struct cv::flann::CompositeIndexParams : public cv::flann::IndexParams {
    CompositeIndexParams(
        int trees = 4,         // Number of trees
        int branching = 32,    // Branching factor for tree
        int iterations = 11,   // Max for k-means stage
        float cb_index = 0.2,  // Usually leave as-is
        cv::flann::flann_centers_init_t centers_init = cv::flann::CENTERS_RANDOM
    );
};
struct cv::flann::LshIndexParams : public cv::flann::IndexParams {
    LshIndexParams(
        unsigned int table_number,      // Number of hash tables to use, usually '10' to '30'
        unsigned int key_size,          // key bits, usually '10' to '20'
        unsigned int multi_probe_level  // Best to just set this to '2'
    );
};
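The LSH index is meant for binary descriptors (such as ORB or BRIEF), which are compared with the Hamming distance rather than L2; this is also what defaultNorm() returning NORM_HAMMING refers to. A pure-Python sketch of that metric, counting differing bits between two byte strings:

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    assert len(a) == len(b)
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Two hypothetical 4-byte binary descriptors differing in exactly two bits
d1 = bytes([0b10110010, 0xFF, 0x00, 0x0F])
d2 = bytes([0b10110011, 0xFF, 0x01, 0x0F])
```

Because this reduces to XOR plus a population count, Hamming comparisons are far cheaper than floating-point L2, which is why binary descriptors pair naturally with LSH and cv::BFMatcher(cv::NORM_HAMMING).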
struct cv::flann::AutotunedIndexParams : public cv::flann::IndexParams {
    AutotunedIndexParams(
        float target_precision = 0.9,  // Percentage of searches required
                                       // to return an exact result
        float build_weight = 0.01,     // Priority for building fast
        float memory_weight = 0.0,     // Priority for saving memory
        float sample_fraction = 0.1    // Fraction of training data to use
    );
};
Displaying keypoints with cv::drawKeypoints
void cv::drawKeypoints(
    const cv::Mat& image,                       // Image to draw keypoints on
    const vector< cv::KeyPoint >& keypoints,    // List of keypoints to draw
    cv::Mat& outImg,                            // Image with keypoints drawn
    const Scalar& color = cv::Scalar::all(-1),  // Use different colors
    int flags = cv::DrawMatchesFlags::DEFAULT
);
Parameter notes:

- image / keypoints: the input image and the keypoints to draw.
- outImg: output image with the keypoints drawn.
- color: color of the keypoints; cv::Scalar::all(-1) draws each keypoint in a random color.
- flags: drawing options; for example, cv::DrawMatchesFlags::DRAW_RICH_KEYPOINTS also draws each keypoint's size and orientation.
Displaying keypoint matches with cv::drawMatches
void cv::drawMatches(
    const cv::Mat& img1,                       // "Left" image
    const vector< cv::KeyPoint >& keypoints1,  // Keypoints (lt. img)
    const cv::Mat& img2,                       // "Right" image
    const vector< cv::KeyPoint >& keypoints2,  // Keypoints (rt. img)
    const vector< cv::DMatch >& matches1to2,   // List of matches
    cv::Mat& outImg,                           // Result image
    const cv::Scalar& matchColor = cv::Scalar::all(-1),
    const cv::Scalar& singlePointColor = cv::Scalar::all(-1),
    const vector<char>& matchesMask = vector<char>(),
    int flags = cv::DrawMatchesFlags::DEFAULT
);

void cv::drawMatches(
    const cv::Mat& img1,                              // "Left" image
    const vector< cv::KeyPoint >& keypoints1,         // Keypoints (lt. img)
    const cv::Mat& img2,                              // "Right" image
    const vector< cv::KeyPoint >& keypoints2,         // Keypoints (rt. img)
    const vector< vector<cv::DMatch> >& matches1to2,  // List of lists of matches
    cv::Mat& outImg,                                  // Result image
    const cv::Scalar& matchColor                      // matches and connecting lines
        = cv::Scalar::all(-1),
    const cv::Scalar& singlePointColor                // unmatched keypoints
        = cv::Scalar::all(-1),
    const vector< vector<char> >& matchesMask         // only draw for nonzero
        = vector< vector<char> >(),
    int flags = cv::DrawMatchesFlags::DEFAULT
);
Parameter notes:

- img1 / keypoints1, img2 / keypoints2: the two images and their keypoints.
- matches1to2: matches from the first image to the second, either a flat list or a list of lists as returned by knnMatch().
- outImg: output image showing both inputs side by side, with matches drawn as connecting lines.
- matchColor / singlePointColor: colors for matched pairs with their connecting lines, and for unmatched keypoints; cv::Scalar::all(-1) picks random colors.
- matchesMask: only matches with a nonzero mask entry are drawn.
- flags: drawing options, e.g. cv::DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS to omit unmatched keypoints.
Finally, a concrete example in Python:
import cv2

# In OpenCV >= 4.4 SIFT lives in the main module: cv2.SIFT_create()
sift = cv2.xfeatures2d.SIFT_create()

target = cv2.imread("target.png")
target_keypoints, target_descriptors = sift.detectAndCompute(target, None)
# cv2.drawKeypoints(target, target_keypoints, target)
# cv2.imshow("target", target)
# cv2.waitKey(0)

query = cv2.imread("query.png")
query_keypoints, query_descriptors = sift.detectAndCompute(query, None)

# matcher = cv2.BFMatcher()
matcher = cv2.FlannBasedMatcher()
matches = matcher.knnMatch(query_descriptors, target_descriptors, k=2)

# Lowe's ratio test: keep a match only if it is clearly better than the runner-up
good = []
good_without_list = []
for m, n in matches:
    if m.distance < 0.75 * n.distance:
        good.append([m])
        good_without_list.append(m)

result_knn = cv2.drawMatchesKnn(query, query_keypoints,
                                target, target_keypoints, good, None)
cv2.imshow("result_knn", result_knn)

result = cv2.drawMatches(query, query_keypoints,
                         target, target_keypoints, good_without_list, None)
cv2.imshow("result", result)
cv2.waitKey(0)