cs294-RL introduction

Date: 2021-01-16

Tags: cs294, reinforcement learning

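Types of reinforcement learning: model-based RL, value functions, policy gradients, and actor-critic (value function plus policy gradients).

Why are there so many RL algorithms? They make different trade-offs (sample efficiency, stability), rest on different assumptions (stochastic or deterministic, continuous or discrete, episodic or infinite horizon), and differ in which part is hard: in some settings the policy is easier to represent, in others the model. Sample efficiency, on-poli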
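To make "actor-critic: value function plus policy gradients" concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption made for illustration and does not come from the course material: the two-armed bandit with fixed rewards, the function name `run_actor_critic`, the step sizes, and the use of a single scalar value estimate as the critic. The actor is a softmax policy updated with the policy-gradient rule, and the critic supplies the baseline that turns the raw reward into an advantage.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into probabilities (shifted by the max for stability)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_actor_critic(steps=2000, seed=0, alpha_pi=0.1, alpha_v=0.1):
    """One-state actor-critic on a hypothetical two-armed bandit.

    Arm 0 always pays 0.0 and arm 1 always pays 1.0 (an assumption chosen
    so the learning direction is easy to check by hand).
    """
    rng = random.Random(seed)
    rewards = [0.0, 1.0]
    prefs = [0.0, 0.0]  # actor: policy parameters (softmax preferences)
    value = 0.0         # critic: estimated value of the single state

    for _ in range(steps):
        probs = softmax(prefs)
        action = 0 if rng.random() < probs[0] else 1
        reward = rewards[action]

        # Critic part: the advantage is the reward minus the value estimate.
        advantage = reward - value
        value += alpha_v * advantage

        # Actor part: policy-gradient step. For a softmax policy,
        # d log pi(action) / d prefs[k] = 1[k == action] - probs[k].
        for k in range(len(prefs)):
            indicator = 1.0 if k == action else 0.0
            prefs[k] += alpha_pi * advantage * (indicator - probs[k])

    return softmax(prefs), value


if __name__ == "__main__":
    final_probs, final_value = run_actor_critic()
    print("policy:", final_probs, "value estimate:", final_value)
```

Setting `value` to zero and never updating it leaves only the policy-gradient part, which is one way to see how the two families combine.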
