Building a Transformer with Rust

Transformers are often perceived as incomprehensible giants. This talk aims to show the opposite: they are not black boxes but elegant mechanisms that can be understood and mastered from first principles. We present Molinete AI, a GPT-2-style model built from scratch in Rust, with no deep learning frameworks: just tensors, math, and full control. Inspired by Feste from Tag1 Consulting (trained on Shakespeare), this project poses a different challenge: training the network on the works of Miguel de Cervantes to generate text in the style of the Spanish Golden Age.

Throughout the session we'll break the model down piece by piece. With the support of an animated presentation built with Manim (over 4,000 lines of code), we'll make visible how information flows inside the network. We'll start with tokenization (BPE) and the basic tensor operations, then dive into the core of the model: embeddings, the causal mask, and Multi-Head Self-Attention. Finally, we'll explore the learning process, watching how gradients flow through the network during training. More than a demo, this talk aims to provide a clear, operational view of Transformers, connecting theory with a real from-scratch implementation.
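To make the causal mask and self-attention mentioned above concrete, here is a minimal sketch in plain Rust using only the standard library. It is not Molinete AI's code: the function names (`softmax`, `causal_attention_weights`), the `Vec<Vec<f32>>` matrix representation, and the toy inputs are assumptions chosen for illustration, and it covers a single attention head without learned projections.

```rust
/// Numerically stable softmax over one row of scores.
fn softmax(row: &[f32]) -> Vec<f32> {
    let max = row.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = row.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Attention weights for a single head with a causal mask.
/// `q` and `k` are seq_len x d matrices (hypothetical layout: one Vec per token).
/// Position i may only attend to positions j <= i.
fn causal_attention_weights(q: &[Vec<f32>], k: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let scale = (q[0].len() as f32).sqrt();
    q.iter()
        .enumerate()
        .map(|(i, qi)| {
            let scores: Vec<f32> = k
                .iter()
                .enumerate()
                .map(|(j, kj)| {
                    if j > i {
                        // Future token: masked out, becomes 0 after softmax.
                        f32::NEG_INFINITY
                    } else {
                        // Scaled dot product between query i and key j.
                        qi.iter().zip(kj).map(|(a, b)| a * b).sum::<f32>() / scale
                    }
                })
                .collect();
            softmax(&scores)
        })
        .collect()
}

fn main() {
    // Toy example: 3 tokens, dimension 2; keys reuse the queries.
    let q: Vec<Vec<f32>> = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 1.0]];
    let k = q.clone();
    for row in causal_attention_weights(&q, &k) {
        println!("{:?}", row);
    }
}
```

Each row of the result sums to 1, and every entry above the diagonal is zero, which is what prevents a token from looking at tokens that come after it. A full model would multiply these weights by a value matrix and repeat the computation across several heads.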
