
Devil King's Blog



Q-learning

4 Feb, 2021

// content

The reward values can be arranged into a matrix.
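As a rough illustration of such a matrix, here is a minimal sketch that assumes a tiny 4-state environment. The states, the `EDGES` list, the goal reward of 100, and the `-1` marker for impossible moves are all hypothetical choices made for this example, not something the post specifies.

```python
# Hypothetical 4-state environment: states 0..3, state 3 is the goal.
# Each pair (s, t) means the agent can move from state s to state t.
EDGES = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 3)]
GOAL = 3
N_STATES = 4


def build_reward_matrix(edges, n_states, goal, goal_reward=100):
    """Return R where R[s][t] is the reward for moving from s to t.

    -1 marks an impossible move, 0 an ordinary move, and goal_reward
    a move that lands on the goal state (all assumed conventions).
    """
    R = [[-1] * n_states for _ in range(n_states)]
    for s, t in edges:
        R[s][t] = goal_reward if t == goal else 0
    return R


R = build_reward_matrix(EDGES, N_STATES, GOAL)
for row in R:
    print(row)
```

Rows index the current state and columns index the next state, so looking up `R[s][t]` answers both "is this move allowed?" and "what does it pay?".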

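Each exploration by the agent is called an episode: a run that starts from an arbitrary initial state and ends when the goal state is reached.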
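The episode idea can be sketched as a loop, under several assumptions: the reward matrix `R` below is a hypothetical 4-state example, the discount factor `GAMMA = 0.8` is an arbitrary choice, actions are picked uniformly at random, and the update is the simplified form with learning rate 1 for a deterministic environment. The names `run_episode` and `train` are made up for illustration.

```python
import random

# Hypothetical reward matrix: R[s][t] is the reward for moving s -> t,
# -1 means the move is not allowed, and state 3 is the goal.
R = [
    [-1, 0, -1, -1],
    [0, -1, 0, -1],
    [-1, 0, -1, 100],
    [-1, -1, -1, 100],
]
GOAL = 3
GAMMA = 0.8  # discount factor, an assumed value


def run_episode(Q, R, goal, gamma, rng):
    """Run one episode: start in a random state, act until the goal."""
    state = rng.randrange(len(R))
    steps = 0
    while state != goal:
        # Allowed moves are the columns with a non-negative reward.
        actions = [t for t, r in enumerate(R[state]) if r >= 0]
        nxt = rng.choice(actions)
        # Simplified Q-learning update (learning rate 1).
        Q[state][nxt] = R[state][nxt] + gamma * max(Q[nxt])
        state = nxt
        steps += 1
    return steps


def train(R, goal, gamma, episodes=500, seed=0):
    """Repeat episodes and return the learned Q table."""
    rng = random.Random(seed)
    Q = [[0.0] * len(R) for _ in R]
    for _ in range(episodes):
        run_episode(Q, R, goal, gamma, rng)
    return Q


Q = train(R, GOAL, GAMMA)
for row in Q:
    print([round(v, 1) for v in row])
```

After enough episodes, following the largest entry in each row of `Q` should lead from any state to the goal. An episode that happens to start in the goal state ends immediately without any update.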

// end


Long-term thinking, steady delivery, and code that lasts.

Maintained by gqlxj1987 https://gqlxj1987.github.io/

© 2026 Devil King's Blog. All rights reserved.
