survey: VQA

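VQA: Given an image and a natural-language question, the task requires reasoning over the visual elements of the image and general knowledge to infer the correct answer.

Differences from object-detection-based tasks:
Object recognition: classifies the main object in an image.
Object detection: by...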