Lecture 7 - Dynamic Programming | Reinforcement Learning Phase | Reasoning LLMs from Scratch | Transcript