Large AI models are now part of everyday technology, yet the way they actually learn remains hidden behind layers of abstraction. This talk explores the training pipeline behind modern AI from a developer’s perspective, with an emphasis on building mechanical sympathy for how these systems behave under the hood.
The discussion begins with the fundamentals of model learning — how neural networks extract patterns from data and how training gradually shapes model behavior. With that foundation, the session surveys the major paradigms of machine learning (supervised, unsupervised, and reinforcement learning) and then focuses on reinforcement learning, a technique that has become central to shaping modern large language models (LLMs).
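To make "training gradually shapes model behavior" concrete, here is a minimal sketch (illustrative only, not material from the talk): a single-weight "network" learns the hidden pattern y = 2x by repeated gradient steps on mean squared error.

```python
import numpy as np

# Toy illustration: training as iterative weight updates.
# The data encodes the pattern y = 2x; gradient descent nudges the
# weight toward that pattern, one small step at a time.

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=64)
y = 2.0 * x                  # the pattern hidden in the data

w = 0.0                      # initial weight: the model knows nothing
lr = 0.5                     # learning rate

for step in range(100):
    pred = w * x                         # forward pass
    grad = np.mean(2 * (pred - y) * x)   # d(MSE)/dw
    w -= lr * grad                       # gradient step reshapes behavior

print(round(w, 3))  # converges near the true weight 2.0
```

Real neural networks do exactly this, just with millions or billions of weights updated together via backpropagation.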
Particular attention is given to Reinforcement Learning from Human Feedback (RLHF), which has often proven more effective than traditional supervised fine-tuning at steering model behavior toward human preferences. The talk also builds intuition for RL training: the policy as a decision layer, how policy updates reshape a language model's outputs, and how such systems are evaluated and refined before deployment.
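The idea of a policy updated by reward can be sketched in a few lines. This is a hypothetical toy (the setup and names are mine, not the talk's): a REINFORCE-style update on a two-action problem, where actions that earn reward become more probable — the same principle, at miniature scale, that RLHF applies to a language model's token choices.

```python
import math
import random

# Toy policy-gradient (REINFORCE) sketch: a policy is a decision layer,
# and reward-weighted updates make rewarded actions more likely.

random.seed(0)
theta = 0.0          # single logit: P(action=1) = sigmoid(theta)
lr = 0.2
rewards = {0: 0.0, 1: 1.0}   # action 1 is the "preferred" behavior

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

for _ in range(500):
    p = sigmoid(theta)
    action = 1 if random.random() < p else 0
    r = rewards[action]
    # gradient of log pi(action | theta): (1 - p) for action 1, -p for action 0
    grad_logp = (1 - p) if action == 1 else -p
    theta += lr * r * grad_logp          # reinforce rewarded actions

print(sigmoid(theta))  # the policy now strongly prefers action 1
```

In RLHF the reward is not hand-coded as it is here; it comes from a reward model trained on human preference comparisons, and the "actions" are the tokens a language model emits.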
The goal is to develop a clear, end-to-end understanding of how the field evolved — from training basic neural networks to building the large language models that power today’s AI systems — and how a deeper understanding of their internals can help developers use them more effectively.
