Paper Reading Notes: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Contents
Abstract
1. Introduction
2. Related Work
2.1 Feature-based Approaches
2.2 Fine-tuning Approaches
3. BERT
3.1 Model Architecture
3.2 Input Representation
3.3 Pre-training Tasks
3.3.1 Task #1: Masked LM
3.3.2 Task #2: Next Sentence Prediction