AI - Reinforcement

MDP (Markov Decision Process)

An MDP consists of four components: a state space, an action space, a transition function, and a reward function.

State: S
Action: A
Transition Function: T(s, a, s′) = P(S_{t+1} = s′ | S_t = s, A_t = a)
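As a rough illustration, the transition function T(s, a, s′) can be stored as a lookup table giving the probability of each next state for a given state and action. The sketch below is a minimal toy example, not taken from the original article: the state names, action names, probabilities, and helper functions are all hypothetical.

```python
import random

# Hypothetical toy MDP: these states, actions, and probabilities are
# invented for illustration only.
STATES = ["s0", "s1"]
ACTIONS = ["stay", "move"]

# TRANSITIONS[(s, a)] maps each next state s' to the probability of
# landing in s' after taking action a in state s.
TRANSITIONS = {
    ("s0", "stay"): {"s0": 0.9, "s1": 0.1},
    ("s0", "move"): {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s0": 0.1, "s1": 0.9},
    ("s1", "move"): {"s0": 0.8, "s1": 0.2},
}


def T(s, a, s_next):
    """Return the transition probability T(s, a, s')."""
    return TRANSITIONS[(s, a)].get(s_next, 0.0)


def is_valid_distribution(s, a, tol=1e-9):
    """Check that the probabilities over all next states sum to 1."""
    return abs(sum(T(s, a, s2) for s2 in STATES) - 1.0) < tol


def sample_next_state(s, a, rng=random):
    """Draw a next state according to T(s, a, .)."""
    weights = [T(s, a, s2) for s2 in STATES]
    return rng.choices(STATES, weights=weights, k=1)[0]
```

For any fixed state and action, the values T(s, a, s′) over all s′ should form a probability distribution, which is what is_valid_distribution checks. Passing a seeded random.Random instance as rng makes the sampling reproducible.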