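As a convenient, fast, and powerful development tool, MATLAB has a large body of third-party resources. For image processing and computer vision, MATLAB a few years ago was less capable than it is today, so the File Exchange on the MathWorks website hosts a great deal of code uploaded by individual users, though most of it is neither comprehensive nor computationally efficient. VLFeat changed this situation. You can download VLFeat from its official website; it is best to get the precompiled binary package, because with the source package alone you have to compile it yourself. The official binaries are not entirely complete, however: some functions have to be compiled by hand before they can be used on 64-bit Windows 10 or Windows 7. You can also compile a single function with the mex command. I have compiled it with both Visual Studio 2010 and Visual Studio 2015.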
VLFeat covers a lot of ground: feature extraction (SIFT, DSIFT, Quick shift, PHOW, HOG, MSER, SLIC, Fisher, LBP), local feature matching (UBC match), classification (SVM, GMM), clustering (IKM, HKM, and AIB, the Agglomerative Information Bottleneck), retrieval (randomized kd-trees), segmentation, evaluation metrics (true positives and false positives), plotting (precision-recall curve, ROC curve), special functions (second derivative of the Gaussian density function, derivative of the Gaussian density function, derivative of the sigmoid function, standard Gaussian density function, sigmoid function), and feature encoding (VLAD). See the help documentation or the demo programs for details.
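Note: before running the demos, run vl_setup.m in the toolbox directory to add the necessary paths (vl_root.m, in the same directory, returns the VLFeat root path).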
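The setup step can be sketched as follows. This is only a minimal sketch: the folder name `vlfeat` under the current directory is a hypothetical install location, so replace it with wherever the package was actually unpacked.

```matlab
% Minimal sketch: add VLFeat to the MATLAB path before running any demo.
% ASSUMPTION: VLFeat was unpacked into a folder named 'vlfeat' under the
% current directory. This location is hypothetical; adjust it as needed.
vlfeatRoot  = fullfile(pwd, 'vlfeat');
setupScript = fullfile(vlfeatRoot, 'toolbox', 'vl_setup.m');

if exist(setupScript, 'file') == 2
    run(setupScript);                        % adds the toolbox folders to the path
    fprintf('VLFeat root: %s\n', vl_root);   % vl_root returns the root path
    hasVlfeat = true;
else
    fprintf('vl_setup.m not found under %s\n', vlfeatRoot);
    hasVlfeat = false;
end
```

The path change made by vl_setup lasts only for the current session unless it is also added to a startup script.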
What follows is a walkthrough of the Caltech-101 classification demo, phow_caltech101.m.
function phow_caltech101()
% PHOW_CALTECH101 Image classification in the Caltech-101 dataset
%   This program demonstrates how to use VLFeat to construct an image
%   classifier on the Caltech-101 data. The classifier uses PHOW
%   features (dense SIFT), spatial histograms of visual words, and a
%   Chi2 SVM. To speedup computation it uses VLFeat fast dense SIFT,
%   kd-trees, and homogeneous kernel map. The program also
%   demonstrates VLFeat PEGASOS SVM solver, although for this small
%   dataset other solvers such as LIBLINEAR can be more efficient.
%
% Author: Andrea Vedaldi

% Copyright (C) 2011-2013 Andrea Vedaldi
% All rights reserved.

conf.calDir = 'data/caltech-101' ;
conf.dataDir = 'data/' ;
conf.autoDownloadData = true ;
conf.numTrain = 15 ;
conf.numTest = 15 ;
conf.numClasses = 102 ;
conf.numWords = 600 ;
conf.numSpatialX = [2 4] ;
conf.numSpatialY = [2 4] ;
conf.quantizer = 'kdtree' ;
conf.svm.C = 10 ;
% 'liblinear' requires the LIBLINEAR MATLAB interface (train) on the path
conf.svm.solver = 'liblinear' ;
conf.svm.biasMultiplier = 1 ;
conf.phowOpts = {'Step', 3} ;
conf.clobber = false ;
conf.tinyProblem = true ;
conf.prefix = 'baseline' ;
conf.randSeed = 1 ;

% To speed things up, process only 5 classes with a 300-word vocabulary
if conf.tinyProblem
  conf.prefix = 'tiny' ;
  conf.numClasses = 5 ;
  conf.numSpatialX = 2 ;
  conf.numSpatialY = 2 ;
  conf.numWords = 300 ;
  conf.phowOpts = {'Verbose', 2, 'Sizes', 7, 'Step', 5} ;
end

% Paths for the vocabulary, histograms, model, results, and features
conf.vocabPath = fullfile(conf.dataDir, [conf.prefix '-vocab.mat']) ;
conf.histPath = fullfile(conf.dataDir, [conf.prefix '-hists.mat']) ;
conf.modelPath = fullfile(conf.dataDir, [conf.prefix '-model.mat']) ;
conf.resultPath = fullfile(conf.dataDir, [conf.prefix '-result']) ;
conf.featPath = fullfile(conf.dataDir, [conf.prefix '-feat.mat']) ;

% Seed the random number generators
randn('state',conf.randSeed) ;
vl_twister('state',conf.randSeed) ;

% The data is downloaded on the first run. If the download is slow, fetch
% the archive manually from the URL below (for example with Xunlei).
if ~exist(conf.calDir, 'dir') || ...
   (~exist(fullfile(conf.calDir, 'airplanes'),'dir') && ...
    ~exist(fullfile(conf.calDir, '101_ObjectCategories', 'airplanes')))
  if ~conf.autoDownloadData
    error(...
      ['Caltech-101 data not found. ' ...
       'Set conf.autoDownloadData=true to download the required data.']) ;
  end
  vl_xmkdir(conf.calDir) ;
  calUrl = ['http://www.vision.caltech.edu/Image_Datasets/' ...
    'Caltech101/101_ObjectCategories.tar.gz'] ;
  fprintf('Downloading Caltech-101 data to ''%s''. This will take a while.', conf.calDir) ;
  untar(calUrl, conf.calDir) ;
end

% Set the image set path
if ~exist(fullfile(conf.calDir, 'airplanes'),'dir')
  conf.calDir = fullfile(conf.calDir, '101_ObjectCategories') ;
end

% Get the class list; each folder name is a class name
classes = dir(conf.calDir) ;
classes = classes([classes.isdir]) ;
classes = {classes(3:conf.numClasses+2).name} ;

images = {} ;
imageClass = {} ;
% Walk the class folders and collect image names and class labels
for ci = 1:length(classes)
  ims = dir(fullfile(conf.calDir, classes{ci}, '*.jpg'))' ;
  ims = vl_colsubset(ims, conf.numTrain + conf.numTest) ;
  ims = cellfun(@(x)fullfile(classes{ci},x),{ims.name},'UniformOutput',false) ;
  images = {images{:}, ims{:}} ;
  imageClass{end+1} = ci * ones(1,length(ims)) ;
end
% Training set: the first numTrain images of each class
selTrain = find(mod(0:length(images)-1, conf.numTrain+conf.numTest) < conf.numTrain) ;
% Test set: the remaining images
selTest = setdiff(1:length(images), selTrain) ;
imageClass = cat(2, imageClass{:}) ;

model.classes = classes ;
model.phowOpts = conf.phowOpts ;
model.numSpatialX = conf.numSpatialX ;
model.numSpatialY = conf.numSpatialY ;
model.quantizer = conf.quantizer ;
model.vocab = [] ;
model.w = [] ;
model.b = [] ;
model.classify = @classify ;

% Extract PHOW features and train the vocabulary
if ~exist(conf.vocabPath) || conf.clobber

  % Use 30 training images to build the vocabulary
  selTrainFeats = vl_colsubset(selTrain, 30) ;
  descrs = {} ;
  for ii = 1:length(selTrainFeats)
    im = imread(fullfile(conf.calDir, images{selTrainFeats(ii)})) ;
    % Standardize the image
    im = standarizeImage(im) ;
    % Extract features
    [drop, descrs{ii}] = vl_phow(im, model.phowOpts{:}) ;
  end

  descrs = vl_colsubset(cat(2, descrs{:}), 10e4) ;
  descrs = single(descrs) ;
  save(conf.featPath, 'descrs') ;

  % Cluster the descriptors to build the vocabulary
  vocab = vl_kmeans(descrs, conf.numWords, 'verbose', 'algorithm', 'elkan', 'MaxNumIterations', 50) ;
  save(conf.vocabPath, 'vocab') ;
else
  load(conf.vocabPath) ;
end

model.vocab = vocab ;

if strcmp(model.quantizer, 'kdtree')
  % Build a kd-tree index over the vocabulary
  model.kdtree = vl_kdtreebuild(vocab) ;
end

% Compute the spatial histogram of every image
if ~exist(conf.histPath) || conf.clobber
  hists = {} ;
  parfor ii = 1:length(images)
    fprintf('Processing %s (%.2f %%)\n', images{ii}, 100 * ii / length(images)) ;
    im = imread(fullfile(conf.calDir, images{ii})) ;
    hists{ii} = getImageDescriptor(model, im);
  end

  hists = cat(2, hists{:}) ;
  save(conf.histPath, 'hists') ;
else
  load(conf.histPath) ;
end

% Apply the homogeneous kernel map to the histograms
psix = vl_homkermap(hists, 1, 'kchi2', 'gamma', .5) ;

% Train the SVM
if ~exist(conf.modelPath) || conf.clobber
  switch conf.svm.solver
    case {'sgd', 'sdca'}
      lambda = 1 / (conf.svm.C * length(selTrain)) ;
      w = [] ;
      parfor ci = 1:length(classes)
        perm = randperm(length(selTrain)) ;
        fprintf('Training model for class %s\n', classes{ci}) ;
        y = 2 * (imageClass(selTrain) == ci) - 1 ;
        [w(:,ci) b(ci) info] = vl_svmtrain(psix(:, selTrain(perm)), y(perm), lambda, ...
          'Solver', conf.svm.solver, ...
          'MaxNumIterations', 50/lambda, ...
          'BiasMultiplier', conf.svm.biasMultiplier, ...
          'Epsilon', 1e-3);
      end

    case 'liblinear'
      svm = train(imageClass(selTrain)', ...
                  sparse(double(psix(:,selTrain))), ...
                  sprintf(' -s 3 -B %f -c %f', ...
                          conf.svm.biasMultiplier, conf.svm.C), ...
                  'col') ;
      w = svm.w(:,1:end-1)' ;
      b = svm.w(:,end)' ;
  end

  model.b = conf.svm.biasMultiplier * b ;
  model.w = w ;

  save(conf.modelPath, 'model') ;
else
  load(conf.modelPath) ;
end

% Compute classification scores for all images
scores = model.w' * psix + model.b' * ones(1,size(psix,2)) ;
[drop, imageEstClass] = max(scores, [], 1) ;

% Compute the confusion matrix on the test set
idx = sub2ind([length(classes), length(classes)], ...
              imageClass(selTest), imageEstClass(selTest)) ;
confus = zeros(length(classes)) ;
confus = vl_binsum(confus, ones(size(idx)), idx) ;

% Plots
figure(1) ; clf;
subplot(1,2,1) ;
imagesc(scores(:,[selTrain selTest])) ; title('Scores') ;
set(gca, 'ytick', 1:length(classes), 'yticklabel', classes) ;
subplot(1,2,2) ;
imagesc(confus) ;
title(sprintf('Confusion matrix (%.2f %% accuracy)', ...
              100 * mean(diag(confus)/conf.numTest) )) ;
print('-depsc2', [conf.resultPath '.ps']) ;
save([conf.resultPath '.mat'], 'confus', 'conf') ;

% -------------------------------------------------------------------------
function im = standarizeImage(im)
% -------------------------------------------------------------------------

im = im2single(im) ;
if size(im,1) > 480, im = imresize(im, [480 NaN]) ; end

% -------------------------------------------------------------------------
function hist = getImageDescriptor(model, im)
% -------------------------------------------------------------------------

im = standarizeImage(im) ;
width = size(im,2) ;
height = size(im,1) ;
numWords = size(model.vocab, 2) ;

% Extract PHOW features
[frames, descrs] = vl_phow(im, model.phowOpts{:}) ;

% Quantize the descriptors into visual words
switch model.quantizer
  case 'vq'
    [drop, binsa] = min(vl_alldist(model.vocab, single(descrs)), [], 1) ;
  case 'kdtree'
    binsa = double(vl_kdtreequery(model.kdtree, model.vocab, ...
                                  single(descrs), ...
                                  'MaxComparisons', 50)) ;
end

% Accumulate one spatial histogram per grid resolution
for i = 1:length(model.numSpatialX)
  binsx = vl_binsearch(linspace(1,width,model.numSpatialX(i)+1), frames(1,:)) ;
  binsy = vl_binsearch(linspace(1,height,model.numSpatialY(i)+1), frames(2,:)) ;

  bins = sub2ind([model.numSpatialY(i), model.numSpatialX(i), numWords], ...
                 binsy,binsx,binsa) ;
  hist = zeros(model.numSpatialY(i) * model.numSpatialX(i) * numWords, 1) ;
  hist = vl_binsum(hist, ones(size(bins)), bins) ;
  hists{i} = single(hist / sum(hist)) ;
end
hist = cat(1,hists{:}) ;
hist = hist / sum(hist) ;

% -------------------------------------------------------------------------
function [className, score] = classify(model, im)
% -------------------------------------------------------------------------

hist = getImageDescriptor(model, im) ;
psix = vl_homkermap(hist, 1, 'kchi2', 'gamma', .5) ;
scores = model.w' * psix + model.b' ;
[score, best] = max(scores) ;
className = model.classes{best} ;
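To make the demo's train/test split and accuracy figure easier to follow, here is a small toy illustration. It is a sketch under stated assumptions, not part of the demo: the class counts and the predictions are invented, and the built-in accumarray stands in for vl_binsum so that it runs without VLFeat.

```matlab
% Toy illustration of the demo's split and confusion-matrix accuracy.
% ASSUMPTIONS: sizes and predictions are made up for illustration only;
% accumarray replaces vl_binsum so the sketch runs without VLFeat.
numTrain = 3; numTest = 2; numClasses = 2;
perClass = numTrain + numTest;

% Images are stored class by class, so the labels look like 1 1 1 1 1 2 2 2 2 2
imageClass = kron(1:numClasses, ones(1, perClass));

% Same rule as the demo: the first numTrain images of each class are training data
selTrain = find(mod(0:numel(imageClass)-1, perClass) < numTrain);
selTest  = setdiff(1:numel(imageClass), selTrain);

% Hypothetical predictions: all correct except the last test image
imageEstClass = imageClass;
imageEstClass(end) = 1;

% Rows are true classes, columns are predicted classes
confus = accumarray([imageClass(selTest)' imageEstClass(selTest)'], 1, ...
                    [numClasses numClasses]);

% Mean per-class accuracy, as in the title of the demo's plot
accuracy = 100 * mean(diag(confus) / numTest);
fprintf('Accuracy: %.2f %%\n', accuracy);
```

Because each class contributes exactly numTest test images, dividing the diagonal by numTest gives the per-class recall, and the reported accuracy is the mean of those values.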
The functions of VLFeat's MATLAB interface are documented on the MATLAB API page of the official site. A brief list follows:
vl_compile Compile VLFeat MEX files
vl_demo Run VLFeat demos
vl_harris Harris corner strength
vl_help VLFeat toolbox builtin help
vl_noprefix Create a prefix-less version of VLFeat commands
vl_root Obtain VLFeat root path
vl_setup Add VLFeat Toolbox to the path
- AIB
vl_aib Agglomerative Information Bottleneck
vl_aibcut Cut VL_AIB tree
vl_aibcuthist Compute a histogram by using an AIB compressed alphabet
vl_aibcutpush Quantize based on VL_AIB cut
vl_aibhist Compute histogram over VL_AIB tree
- FISHER
vl_fisher Fisher vector feature encoding
- GEOMETRY
vl_hat Hat operator
vl_ihat Inverse vl_hat operator
vl_irodr Inverse Rodrigues' formula
vl_rodr Rodrigues' formula
- GMM
vl_gmm Learn a Gaussian Mixture Model using EM
- IMOP
vl_dwaffine Derivative of an affine warp
vl_imarray Flattens image array
vl_imarraysc Scale and flatten image array
vl_imdisttf Image distance transform
vl_imdown Downsample an image by two
vl_imgrad Image gradient
vl_imintegral Compute integral image
vl_impattern Generate an image from a stock pattern
vl_imreadbw Reads an image as gray-scale
vl_imreadgray Reads an image as gray-scale
vl_imsc Scale image
vl_imsmooth Smooth image
vl_imup Upsample an image by two
vl_imwbackward Image backward warping
vl_imwhiten Whiten an image
vl_rgb2xyz Convert RGB color space to XYZ
vl_tps Compute the thin-plate spline basis
vl_tpsu Compute the U matrix of a thin-plate spline transformation
vl_waffine Apply affine transformation to points
vl_witps Inverse thin-plate spline warping
vl_wtps Thin-plate spline warping
vl_xyz2lab Convert XYZ color space to LAB
vl_xyz2luv Convert XYZ color space to LUV
vl_xyz2rgb Convert XYZ to RGB
- KMEANS
vl_hikmeans Hierarchical integer K-means
vl_hikmeanshist Compute histogram of quantized data
vl_hikmeanspush Push data down an integer K-means tree
vl_ikmeans Integer K-means
vl_ikmeanshist Compute histogram of quantized data
vl_ikmeanspush Project data on integer K-means partitions
vl_kmeans Cluster data using k-means
- MISC
vl_alldist2 Pairwise distances
vl_alphanum Sort strings using the Alphanum algorithm
vl_argparse Parse list of parameter-value pairs
vl_binsearch Maps data to bins
vl_binsum Binned summation
vl_colsubset Select a given number of columns
vl_cummax Cumulative maximum
vl_getpid Get MATLAB process ID
vl_grad Compute the gradient of an image
vl_histmarg Marginal of histogram
vl_hog Compute HOG features
vl_homkermap Homogeneous kernel map
vl_ihashfind Find labels in an integer hash table
vl_ihashsum Accumulate integer labels into a hash table
vl_inthist Calculate Integral Histogram
vl_isoctave Determines whether Octave is running
vl_kdtreebuild Build randomized kd-tree
vl_kdtreequery Query KD-tree
vl_lbp Local Binary Patterns
vl_lbpfliplr Flip LBP features left-right
vl_localmax Find local maximizers
vl_matlabversion Return MATLAB version as an integer
vl_numder Numerical derivative
vl_numder2 Numerical second derivative
vl_override Override structure subset
vl_pegasos [deprecated]
vl_sampleinthist Sample integral histogram
vl_simdctrl Toggle VLFeat SIMD optimizations
vl_svmdataset Construct advanced SVM dataset structure
vl_svmpegasos [deprecated]
vl_svmtrain Train a Support Vector Machine
vl_threads Control VLFeat computational threads
vl_twister Random number generator
vl_version Obtain VLFeat version information
vl_whistc Weighted histogram
vl_xmkdir Create a directory recursively.
- MSER
vl_erfill Fill extremal region
vl_ertr Transpose extremal regions frames
vl_mser Maximally Stable Extremal Regions
- PLOTOP
vl_cf Creates a copy of a figure
vl_click Click a point
vl_clickpoint Select a point by clicking
vl_clicksegment Select a segment by clicking
vl_det Compute DET curve
vl_figaspect Set figure aspect ratio
vl_linespec2prop Convert PLOT style line specs to line properties
vl_plotbox Plot boxes
vl_plotframe Plot a geometric frame
vl_plotgrid Plot a 2-D grid
vl_plotpoint Plot 2 or 3 dimensional points
vl_plotstyle Get a plot style
vl_pr Precision-recall curve.
vl_printsize Set the printing size of a figure
vl_roc ROC curve.
vl_tightsubplot Tiles axes without wasting space
vl_tpfp Compute true positives and false positives
- QUICKSHIFT
vl_flatmap Flatten a tree, assigning the label of the root to each node
vl_imseg Color an image based on the segmentation
vl_quickseg Produce a quickshift segmentation of a grayscale or color image
vl_quickshift Quick shift image segmentation
vl_quickvis Create an edge image from a Quickshift segmentation.
- SIFT
vl_covdet Covariant feature detectors and descriptors
vl_dsift Dense SIFT
vl_frame2oell Convert a geometric frame to an oriented ellipse
vl_liop Local Intensity Order Pattern descriptor
vl_phow Extract PHOW features
vl_plotsiftdescriptor Plot SIFT descriptor
vl_plotss Plot scale space
vl_sift Scale-Invariant Feature Transform
vl_siftdescriptor Raw SIFT descriptor
vl_ubcmatch Match SIFT features
vl_ubcread Read Lowe's SIFT implementation data files
- SLIC
vl_slic SLIC superpixels
- SPECIAL
vl_ddgaussian Second derivative of the Gaussian density function
vl_dgaussian Derivative of the Gaussian density function
vl_dsigmoid Derivative of the sigmoid function
vl_gaussian Standard Gaussian density function
vl_rcos RCOS function
vl_sigmoid Sigmoid function
- VLAD
vl_vlad VLAD feature encoding
For detailed usage, run the demo programs.
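As one concrete starting point alongside the bundled demos, the sketch below extracts SIFT features from two stock images and matches them with vl_ubcmatch. It assumes vl_setup has already been run and that rgb2gray is available in your MATLAB installation; the stock pattern names 'roofs1' and 'roofs2' come from the VLFeat tutorials, so treat them as an assumption if your version differs.

```matlab
% Minimal sketch: SIFT extraction and matching on two stock images.
% ASSUMPTIONS: VLFeat is already on the path (vl_setup has been run) and
% the stock patterns 'roofs1' and 'roofs2' ship with your VLFeat version.
hasVlfeat = exist('vl_sift') > 0 && exist('vl_ubcmatch') > 0;

if hasVlfeat
    Ia = vl_impattern('roofs1');
    Ib = vl_impattern('roofs2');

    % vl_sift expects a single-precision grayscale image
    [fa, da] = vl_sift(single(rgb2gray(Ia)));
    [fb, db] = vl_sift(single(rgb2gray(Ib)));

    % Each column of matches holds a pair of indices into da and db
    [matches, scores] = vl_ubcmatch(da, db);
    fprintf('%d tentative matches\n', size(matches, 2));
else
    disp('VLFeat is not on the path; run vl_setup first.');
end
```

The matches returned here are only tentative; a geometric check would normally follow before they are used for anything like image stitching.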