Attention is all you need

Date: 2021-07-11


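Abstract. The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best-performing models also connect the encoder and decoder through an attention mechanism. We propose a new, simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task.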
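The abstract does not spell out the attention computation itself. As a rough illustration, the sketch below assumes the scaled dot-product form softmax(QK^T / sqrt(d_k)) V defined in the body of the paper, written in plain Python for a single head with no masking, batching, or learned projections. The function names `softmax` and `scaled_dot_product_attention` are illustrative choices, not taken from this page.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return one output vector per query.

    Assumed form: weights = softmax(q . k / sqrt(d_k)); output = sum(weights * values).
    queries, keys: lists of vectors of length d_k; values: list of vectors,
    one per key.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs
```

Every query is handled independently of the others, which is one way to see why the abstract can claim better parallelism than a recurrent model that must process positions in order.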

