Why the "Transformer" Is So Powerful: A Complete Walkthrough of Google's Tensor2Tensor System, from Model to Code

Date: 2021-01-12

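张金超_WXG_PRC

In this article:

Chapter 1: Overview
Chapter 2: Sequence-to-Sequence Tasks and the Transformer Model
    2.1 Sequence-to-Sequence Tasks and the Encoder-Decoder Framework
    2.2 Neural Network Models and Distance Dependencies in Language
    2.3 A Formal Description of the Self-Attention Mechanism
    2.4 "Attention is All You Need"
Chapter 3: An In-Depth Look at the Tensor2Tensor System Implementation
    3.1 Usage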