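How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation

Abstract

This paper investigates a range of metrics for NLG systems. Recent NLG metrics were adapted from machine translation (MT); the paper finds that these metrics, relative to humans on Twitt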