How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for

Date: 2020-12-24


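How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation

Abstract

This paper surveys the metrics used to evaluate NLG systems. Recent NLG metrics were adapted from machine translation (MT). The paper finds that these metrics, with respect to humans on Twitt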