The Craft of Post-Training

A Practical Guide for AI Engineers and Developers

by Chris von Csefalvay

Available July 2026, 416 pp.

ISBN-13: 9781718505209

Download Chapter 3: Supervised Fine-Tuning: The Foundation Technique

A pre-trained model has read most of the internet—and can be trusted with almost none of it. Post-training is the work that changes that: where you take a raw, general model and shape it into something that behaves, follows instructions, refuses what it shouldn’t do, and handles the specific job you need. It’s the human hand on the machine, and the part almost no one explains.

Chris von Csefalvay has spent his career building production ML systems in industry, from clinical language to legal text. In The Craft of Post-Training, he shows you the decisions behind every technique: when to fine-tune and when not to, why a model quietly gets worse, and which method fits the constraint you’re actually under. The math is here, because knowing why a technique works is what lets you debug it when it breaks.

You’ll know how to:

Choose among the main post-training methods, from SFT and RLHF to DPO, KTO, and GRPO, well enough to fix failures instead of guessing
Adapt a model to your domain without catastrophic forgetting—the tendency of a network to abruptly overwrite what it already knew when you train it on something new
Run larger models with the memory you have by using new quantization techniques
Train agentic systems to act reliably under adversarial pressure
Measure what matters in your deployment, beyond standard benchmarks

When you’ve used LLMs long enough, you start to wonder what was done to make them behave. The secret is in the post-training that shaped them. The Craft of Post-Training shows you how that’s done.

Author Bio

Chris von Csefalvay is a Principal at HCLTech’s AI Practice, leading post-training research and clinical intelligence. He has held senior data science leadership roles across major enterprises and designed language models for applications from pharmacovigilance to social dynamics. He holds degrees from Oxford and Cardiff, and is a Fellow of the Royal Society for Public Health and a Senior Member of IEEE.

Table of contents

Acknowledgments
Preface

Part I: The Foundation
Chapter 1: Post-Training Essentials: What It Is and Why It Matters
Chapter 2: Prerequisites for Success: Before You Fine-Tune

Part II: The Tools
Chapter 3: Supervised Fine-Tuning: The Foundation Technique
Chapter 4: Reinforcement Learning: Better Each Time
Chapter 5: Preference Optimization: Modern Alternatives to PPO
Chapter 6: Evaluation Strategies: Measuring Model Quality

Part III: The Craft
Chapter 7: Efficiency Techniques: Quantization and Compression
Chapter 8: Domain Adaptation: Make It Yours
Chapter 9: Agentic Models: Deeds, Not Words
Chapter 10: Reasoning Capabilities: Training for Complex Thought

Part IV: The Frontier
Chapter 11: Synthetic Training: Self-Play and Generated Data
Chapter 12: Multimodal Systems: Post-Training Beyond Text
Chapter 13: Future Directions: What Comes Next
Bibliography

The chapters in red are included in this Early Access PDF.

Extra Stuff

Visit the book's companion repository on GitHub.