Machine Learning (8): Reinforcement Learning

Reinforcement learning; Problem abstraction; The Markov process; The Markov property; The policy; Value function; An example of the value function; Bellman's expectation equation; Optimal policy; Bellman’