Reading Notes on "Supervised Multimodal Bitransformers for Classifying Images and Text"

Date: 2020-12-29


Contents
1 Why
2 What
3 How
3.1 Text features
3.2 Image features
4 Result
5 Idea
6 Relatives

1 Why

Content today is increasingly multimodal: text is often accompanied by images, audio, video, and signals from various sensors. However, much multimodal data is predominantly text