[DL Summary 5] The Transformer Model and Self-Attention

Date: 2020-12-30

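1 Background

The attention model cannot be parallelized, and it ignores the relationships among the words of the input sentence and among the words of the target sentence. To address both problems, Google proposed the Transformer model in the 2017 paper "Attention Is All You Need". The Transformer's most distinctive feature is that it abandons the RNN and CNN architectures entirely. The model rests on two main concepts: 1. Self-attention (replaces the RNN): addresses, among the words of the input sentence and among the words of the target sentence, the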