http://openaccess.thecvf.com/content_cvpr_2017/papers/Kodirov_Semantic_Autoencoder_for_CVPR_2017_paper.pdf
Semantic Autoencoder for Zero-Shot Learning, Elyor Kodirov, Tao Xiang, Shaogang Gong, Queen Mary University of London, UK, {e.kodirov, t.xiang, s.gong}@qmul.ac.uk
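Highlights
- Improves the performance of zero-shot learning systems through dual learning (similar to CycleGAN)
- The structure is very simple and can be solved directly, so it is very fast
- Applies effectively to another related task (supervised clustering), which demonstrates its generalisation ability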

Method
Linear autoencoder

Model Formulation
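
As a reconstruction of the formulation in the cited paper, the tied-weight linear autoencoder objective, its relaxation, and the resulting optimality condition can be written as below. The notation is an assumption of these notes: X is the d×N feature matrix, S is the k×N semantic matrix, W is the k×d encoder (with decoder W^T), and λ weights the encoder term.

```latex
\begin{aligned}
&\min_{W}\ \|X - W^{\top} W X\|_F^2 \quad \text{s.t. } WX = S \\
&\min_{W}\ \|X - W^{\top} S\|_F^2 + \lambda\,\|WX - S\|_F^2 \\
&-S\,(X^{\top} - S^{\top} W) + \lambda\,(WX - S)\,X^{\top} = 0 \\
&AW + WB = C, \qquad A = SS^{\top},\quad B = \lambda XX^{\top},\quad C = (1+\lambda)\,SX^{\top}
\end{aligned}
```

The second line substitutes the constraint WX = S into the decoder and relaxes the hard constraint into a penalty; the third line sets the gradient with respect to W to zero.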

Setting the gradient of the objective with respect to W to zero yields a well-known Sylvester equation, which can be solved efficiently by the Bartels-Stewart algorithm (MATLAB's sylvester function).
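
A minimal Python sketch of this closed-form solve, assuming the Sylvester form AW + WB = C with A = SS^T, B = λXX^T and C = (1+λ)SX^T from the paper. The function name fit_sae and the argument names are hypothetical; scipy.linalg.solve_sylvester plays the role of MATLAB's sylvester.

```python
import numpy as np
from scipy.linalg import solve_sylvester


def fit_sae(X, S, lam):
    """Solve A W + W B = C for the encoder W (hypothetical helper).

    X   : (d, N) feature matrix, one column per training sample
    S   : (k, N) semantic matrix (e.g. attributes), one column per sample
    lam : weight of the encoder term ||W X - S||_F^2
    """
    A = S @ S.T                  # (k, k)
    B = lam * (X @ X.T)          # (d, d)
    C = (1.0 + lam) * (S @ X.T)  # (k, d)
    # SciPy solves A W + W B = C with a Bartels-Stewart style method.
    return solve_sylvester(A, B, C)
```

The returned W has shape (k, d); the decoder is simply its transpose, so no second solve is needed.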
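Zero-shot learning: based on the algorithm above, there are two ways to classify a test sample:
- Project a feature sample xi from an unseen class into the semantic (attribute) space via W to obtain si, then compare distances in the semantic space to find the nearest class prototype (a class with no training samples); that class is its label
- Project the semantic prototypes S of all classes without training data into the feature space X via W^T, then compare the distance between an unseen-class sample xi and the projected class centres in the feature space; the nearest class is its label
- The two approaches give essentially the same accuracy.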
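
The two test-time procedures above can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function names are hypothetical, and cosine similarity is an assumed choice of distance, since the notes do not say which one is used.

```python
import numpy as np


def _nearest_by_cosine(queries, prototypes, eps=1e-12):
    """Index of the most cosine-similar prototype (rows) for each query (rows)."""
    q = queries / (np.linalg.norm(queries, axis=1, keepdims=True) + eps)
    p = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + eps)
    return np.argmax(q @ p.T, axis=1)


def predict_encoder(W, X_test, S_unseen):
    """Map features into the semantic space with W, then match prototypes.

    W: (k, d), X_test: (d, n), S_unseen: (k, c) with one column per unseen class.
    """
    S_hat = W @ X_test  # (k, n) predicted semantic vectors
    return _nearest_by_cosine(S_hat.T, S_unseen.T)


def predict_decoder(W, X_test, S_unseen):
    """Map class prototypes into the feature space with W^T, then match samples."""
    X_proto = W.T @ S_unseen  # (d, c) projected class centres
    return _nearest_by_cosine(X_test.T, X_proto.T)
```

Both functions return, for each test sample, the column index of the predicted class in S_unseen.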
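Supervised clustering: in this problem the semantic space is the class-label space (one-hot class labels). All test data are projected into the training class-label space and then clustered with k-means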
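
A rough sketch of this clustering pipeline, under stated assumptions: all function names are hypothetical, the projection is fitted with the Sylvester form from the paper, and the small Lloyd-style k-means below is a stand-in for whichever k-means implementation was actually used.

```python
import numpy as np
from scipy.linalg import solve_sylvester


def one_hot(labels, num_classes):
    """(num_classes, N) matrix with a single 1 per column."""
    S = np.zeros((num_classes, labels.size))
    S[labels, np.arange(labels.size)] = 1.0
    return S


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd iterations; returns a cluster index per row of points."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centres = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        d2 = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):  # leave empty clusters where they are
                centres[j] = points[assign == j].mean(axis=0)
    return assign


def cluster_with_sae(X_train, y_train, X_test, num_train_classes, k, lam=1.0):
    """Project test data into the training label space, then run k-means."""
    S = one_hot(y_train, num_train_classes)
    W = solve_sylvester(S @ S.T, lam * (X_train @ X_train.T),
                        (1.0 + lam) * (S @ X_train.T))
    Z = (W @ X_test).T  # (n_test, num_train_classes)
    return kmeans(Z, k)
```

Here k is the number of clusters expected in the test data, which need not equal the number of training classes.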
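Relation to existing models: existing zero-shot learning models typically learn a projection from the feature space to the semantic space by minimising a regularised regression objective.

Alternatively, [54] projects the attributes into the feature space, so the learning objective becomes a regression in the reverse direction.

The algorithm in the paper combines the two. Moreover, because W* = W^T, W cannot grow too large during dual learning (otherwise multiplying x by two matrices with very large norms could not recover the original value), so the regularisation term can be dropped.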
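
For reference, the three objectives being compared can be written side by side as below. This is a reconstruction from the cited paper rather than a copy of its equations; Ω denotes a generic regulariser, and X, S, W, W* follow the notation assumed earlier in these notes.

```latex
\begin{aligned}
\text{feature} \to \text{semantic:}\quad &\min_{W}\ \|WX - S\|_F^2 + \Omega(W) \\
\text{semantic} \to \text{feature [54]:}\quad &\min_{W^{*}}\ \|X - W^{*} S\|_F^2 + \Omega(W^{*}) \\
\text{SAE } (W^{*} = W^{\top}):\quad &\min_{W}\ \|X - W^{\top} S\|_F^2 + \lambda\,\|WX - S\|_F^2
\end{aligned}
```

With the weights tied, each term acts as a constraint on the other, which is the informal argument for leaving out Ω.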

Experiments
Zero-shot learning
Datasets: Semantic word vector representation is used for large-scale datasets (ImNet-1 and ImNet-2). We train a skip-gram text model on a corpus of 4.6M Wikipedia documents to obtain the word2vec [38, 37] word vectors.

Features: GoogLeNet features are used for all datasets except ImNet-1, which uses AlexNet features
Results:
- Our SAE model achieves the best results on all 6 datasets.
- On the small-scale datasets, the gap between our model's results and those of the strongest competitor ranges from 3.5% to 6.5%.
- On the large-scale datasets, the gaps are even bigger: on the largest, ImNet-2, our model improves over the state-of-the-art SS-Voc [22] by 8.8%.
- Both the encoder and decoder projection functions in our SAE model (SAE (W) and SAE (W^T) respectively) can be used for effective ZSL.
- The encoder projection function seems to be slightly better overall.
- Generalised zero-shot learning: the evaluation measures how well a zero-shot learning method can trade off between recognising data from seen classes and data from unseen classes.
- The test set is built by holding out 20% of the data samples from the seen classes and mixing them with the samples from the unseen classes.
- On AwA, our model is slightly worse than SynC^struct [13].
- However, on the more challenging CUB dataset, our method significantly outperforms the competitors.
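
The hold-out protocol for the generalised setting might be sketched like this. The function name is hypothetical, and the hold-out is drawn uniformly over all seen samples, which is an assumption: the notes do not say whether the 20% is taken per class.

```python
import numpy as np


def make_gzsl_test_set(X_seen, y_seen, X_unseen, y_unseen, holdout=0.2, seed=0):
    """Hold out a fraction of seen samples and mix them with unseen samples.

    X_seen: (d, n_seen), X_unseen: (d, n_unseen); labels are 1-D arrays.
    Returns (X_train, y_train, X_test, y_test).
    """
    rng = np.random.default_rng(seed)
    n = X_seen.shape[1]
    perm = rng.permutation(n)
    n_hold = int(round(holdout * n))
    held, kept = perm[:n_hold], perm[n_hold:]
    # Test set = held-out seen samples followed by all unseen samples.
    X_test = np.concatenate([X_seen[:, held], X_unseen], axis=1)
    y_test = np.concatenate([y_seen[held], y_unseen])
    return X_seen[:, kept], y_seen[kept], X_test, y_test
```

At test time the label space then covers both seen and unseen classes, which is what makes this setting harder than conventional zero-shot learning.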
Clustering
Datasets: a synthetic dataset and Oxford Flowers-17 (848 images)
Results:
- On computational cost, our model (93s) is more expensive than MLCA (39s) but much cheaper than all others (hours to days).
- Our model achieves the best clustering accuracy.