Reinforcement Learning | Notion

Playing CartPole with the Actor-Critic method | TensorFlow Core

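[Introduction to Generative AI 2024] Lecture 8: The Training History of Large Language Models, Stage 3: Entering Real-World Practice and Honing Skills (Reinforcement Learning from Human Feedback, RLHF) - YouTube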