Talks

Discover the Talks at PyCon Colombia 2026 ✨

Browse every accepted session—titles, tracks, levels, and speakers—before you plan your days in Medellín.

Machine Learning, Data Science

How to Find Pearls on the Bottom of the Sea – Autoencoders as Anomaly Detection Models

Like finding pearls on the ocean floor, detecting rare anomalies in large datasets requires sophisticated techniques. In this workshop, you'll learn the theory and practice of autoencoder architectures, how to train them for anomaly detection, how to set decision boundaries, and how to evaluate their performance. We'll work with real-world datasets and build complete anomaly detection pipelines in Python.
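As a rough illustration of the "decision boundary" step mentioned above (not taken from the workshop materials), one common approach is to threshold the autoencoder's reconstruction error at a high percentile of the errors seen on normal data. The function names, the 95th-percentile choice, and the error values below are all hypothetical, and the autoencoder itself is assumed to exist elsewhere.

```python
from statistics import quantiles

def fit_threshold(normal_errors, percentile=95):
    """Return the error value below which `percentile`% of normal samples fall."""
    # quantiles(..., n=100) returns the 99 percentile cut points.
    cut_points = quantiles(normal_errors, n=100, method="inclusive")
    return cut_points[percentile - 1]

def flag_anomalies(errors, threshold):
    """Mark a sample as anomalous when its reconstruction error exceeds the threshold."""
    return [error > threshold for error in errors]

# Hypothetical reconstruction errors measured on known-normal validation data.
normal_errors = [0.01 * i for i in range(1, 101)]
threshold = fit_threshold(normal_errors)

# Hypothetical errors for three new samples; only the last one is far out.
flags = flag_anomalies([0.2, 0.5, 3.0], threshold)
print(threshold, flags)
```

A stricter percentile lowers the false-alarm rate at the cost of missing milder anomalies, which is why the threshold is usually tuned against labeled examples when they are available.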

Artificial Intelligence, Machine Learning, Data Science

The GenAI Revolution Reaches RecSys

When we talk about the generative AI revolution, the conversation usually stays close to chatbots, image generation, and code assistants. But the same architectures that powered that wave (transformers, autoregressive modeling, scaling laws) are quietly reshaping fields most people don't associate with GenAI at all. Recommender systems are one of the most interesting examples. Meta, Netflix, Google, Spotify and others are replacing decades-old recsys pipelines with transformer-based foundation models, and the results are hard to ignore. This talk is a practical tour of that shift from a Python engineer's seat.

Core Python, Scientific Computing

Leverage your Python skill using the Python interpreter

In this talk, I'll challenge the audience's mindset about Python. Python is not an interpreter, and in fact, there are multiple Python interpreters—each with its own architecture and purpose. I'll walk through Python's core internals and show how programming languages interact beneath the surface. We'll explore how to write better Python by understanding the garbage collector, what you can build using the AST, how to read and leverage the disassembler, and the practical implications of Python's transition from its old LL(1) parser to the current PEG parser. We'll also dive into lesser-known features of Python interpreters, what a PEP really is and how it shapes the language, and conclude with a deep look at Python without the GIL—what changes, what breaks, and how the core team removed it. Throughout the talk, I'll share personal stories, including battles caused by identical ASTs and the moment I believed I had discovered a way to speed up the Python interpreter itself.
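To give a flavor of the AST and disassembler topics, here is a small sketch using only the standard library `ast` and `dis` modules. It is an illustration under our own assumptions, not the speaker's material: the source strings and the `add` function are made up for the example.

```python
import ast
import dis

# Two source strings that differ only in redundant parentheses and spacing.
src_a = "total = (price + tax)"
src_b = "total=price+tax"

# ast.dump leaves out line and column attributes by default,
# so both strings produce the same dump.
same_tree = ast.dump(ast.parse(src_a)) == ast.dump(ast.parse(src_b))
print(same_tree)

def add(a, b):
    return a + b

# dis.get_instructions yields one Instruction per bytecode operation.
# The exact opcode names vary between CPython versions.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
```

The first check shows why two differently written programs can be indistinguishable at the AST level: parentheses and whitespace are discarded during parsing.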

Artificial Intelligence, Machine Learning

Opening the Black Box: Mechanistic Interpretability of LLMs

As agents are deployed in high-stakes contexts (finance, manufacturing, healthcare), understanding how they make decisions—and not just what they decide—becomes fundamental to safety and trust. For example, when an agent receives the instruction "Search for our company's third-quarter results" and chooses to search internal documents instead of the public web, what internal process drives that choice? Answer engineering, behavioral testing, and chain-of-thought analysis describe correlations or narratives; none reveals the actual mechanism. Understanding how an agent reaches a conclusion is a critical component of developing AI responsibly, especially regarding reliability and transparency in AI systems. Model interpretability is one way developers can build trust and consistency in their systems and support the safe deployment of AI agents.

Artificial Intelligence, DevOps

From Vibe Coding to Spec-Driven Development with AWOS in Claude Code

Vibe coding works great until it doesn't. When AI agents start ignoring your architecture, making wrong assumptions about your stack, and producing code that compiles but misses the point, the problem isn't the model. It's the instructions. This talk introduces AWOS (Agentic Workflow Operating System), an open-source framework built by Provectus for Claude Code that brings Spec-Driven Development to AI-assisted coding. AWOS structures the development process into 8 phases, each with its own specialized agent and audience. What you'll see: a live demo building a conference talk management app. What you'll take home: a tool you can install with npx @provectusinc/awos and start using immediately.

Core Python, Web

Python in the Browser: Powered by WebAssembly

What if the browser could run Python as a first-class language? In this talk, I'll show how PyScript makes it possible to execute real Python directly in the browser, powered by WebAssembly. Through a series of live examples, you'll see Python manipulating the DOM, calling browser APIs, and building interactive experiences, all without a traditional JavaScript codebase. I'll also show a couple of examples of combining JavaScript and Python in PyScript to build even more capable tools. Along the way, I'll explain what WebAssembly is, why it exists, and how it enables languages like Python to run safely and efficiently on the web platform. Finally, I'll discuss when tools like PyScript make sense and compare it with similar tools. Whether you're a Python developer curious about the frontend, an engineer interested in WebAssembly, or simply someone who enjoys seeing the boundaries of Python pushed, this talk will change how you think about what can run in a browser.

Artificial Intelligence, Machine Learning, DevOps

From Expert Judgment to Autonomous Optimization: Encoding Human Expertise into LLM Judges with DSPy

A single misread clause in a reinsurance contract can mean millions in liability. Our LLM pipeline could extract and summarize these documents, but how do you know the output is actually correct? String matching fails ("USD 5,000,000" vs "$5M" scores zero), human review at scale is unaffordable, and a single LLM-as-judge prompt gives inconsistent, uncalibrated scores. The real bottleneck was never generation; it was evaluation. This talk shows how we solved it in two steps, both built entirely in Python. First, we encoded expert evaluation at scale using DSPy to distill judgments from five domain experts into a panel of calibrated LLM judges, each targeting a single quality dimension, weighted to reflect what experts actually care about. Then we closed the loop using DSPy's MIPROv2 and GEPA optimizers, wiring the judge panel as a fitness function and letting the system rewrite prompts autonomously, with regression guards and CI gates so humans review only the final score delta. The stack is Python-native: DSPy, MLflow, LiteLLM, Pydantic. You will leave with a concrete recipe for encoding expert knowledge into automated LLM evaluation and self-improving optimization, applicable to any domain where "correct" is nuanced.
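The sketch below illustrates two ideas from the abstract in plain Python: why exact string matching gives no credit to equivalent amounts, and how per-dimension judge scores might be combined with weights. It does not use DSPy and is not the speakers' implementation. The dimension names, scores, and weights are hypothetical.

```python
# Exact string matching scores zero even though both strings name the same amount.
exact_match = float("USD 5,000,000" == "$5M")

# Hypothetical scores in [0, 1], one per single-dimension judge.
judge_scores = {"amounts": 1.0, "parties": 0.5, "dates": 1.0}

# Hypothetical weights standing in for what the experts care about most.
weights = {"amounts": 0.5, "parties": 0.25, "dates": 0.25}

def panel_score(scores, weights):
    """Weighted average of per-dimension judge scores."""
    total_weight = sum(weights.values())
    weighted_sum = sum(scores[name] * weight for name, weight in weights.items())
    return weighted_sum / total_weight

score = panel_score(judge_scores, weights)
print(exact_match, score)  # 0.0 0.875
```

A single number like this is the kind of value that could serve as a fitness function for an optimizer, which is the role the abstract gives the judge panel.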

Artificial Intelligence, Data Science

How We Stopped Answering Data Questions and Built the Stack That Answers Them

If you've worked at a growing startup, you probably know the feeling: multiple teams pulling different numbers for the same metric, ops constantly asking engineering for basic answers, and metrics that are a real pain to create or organize. Every new question feels like starting from scratch. This talk is the story of how a small team fixed that. First, by building a proper dbt architecture from scratch with Sources, Staging, Intermediate, and Marts so that things like bookings, revenue, and providers were defined in one place and everyone was looking at the same number. Once the data was reliable, we connected an LLM so non-technical teammates could ask questions in plain English and get real answers directly from Snowflake. No SQL, no ticket, no waiting on engineering. You'll walk away with a clear mental model for building a dbt layer people actually trust, a practical architecture for connecting an LLM to your warehouse, and the one thing that made it all click: your dbt docs are your LLM prompt.

Artificial Intelligence

Multi-Agent Teams in AI-Assisted Development: A Glimpse Into the Future of Programming

Get a glimpse into the future of programming, where teams of AI agents collaborate with human developers. In this workshop, you'll explore cutting-edge patterns for multi-agent collaboration in AI-assisted development: code generation agents, review agents, testing agents, and orchestration strategies. We'll build a mini multi-agent development team using Python and the Claude SDK, and discuss where this technology is heading and how developers can prepare.

Artificial Intelligence, Web

Build an OpenClaw-style Coding Assistant on WhatsApp with Claude Agent SDK

Build a fully functional AI coding assistant that lives in WhatsApp, inspired by OpenClaw, using Claude's Agent SDK and Python. In this hands-on workshop, you'll learn to integrate the Claude Agent SDK with the WhatsApp Business API, design conversational flows for code assistance, handle multi-turn conversations with memory, and deploy your assistant to the cloud. Walk away with a working AI coding companion accessible from any device.

Artificial Intelligence, Data Science

PyBlend: Towards an AI Food Scientist for Nutritional Product Design

Discover how Python and AI are transforming nutritional product design. In this workshop, you'll be introduced to PyBlend, a framework that models the complex optimization problem of designing nutritional formulations. We'll explore how machine learning algorithms can navigate vast ingredient spaces, balance nutritional constraints, and generate novel product formulations. Attendees will gain hands-on experience with AI-driven product design and learn how Python makes interdisciplinary AI applications possible.

Artificial Intelligence, Machine Learning, Data Science, Core Python

Your AI Eval Is Lying To You

When you set temperature=0 and run your AI eval, you expect the same input to give the same output. It doesn't. Recent measurements on Qwen3-235B at temperature=0 produced 80 unique completions on a single prompt. So when your eval reports "92% pass rate," what does that actually mean? This talk is about the gap between how the AI eval ecosystem talks about scores and what those scores can actually support. We walk through five specific tools that fix the gap: pass@k versus pass^k, Wilson confidence intervals, Bayesian pass@k with Beta-Binomial conjugacy, sequential drift detection with EWMA, CUSUM, and OLS, and false discovery rate control via the Benjamini-Hochberg procedure. Each method gets a short demo in pure Python with no framework dependency. The audience leaves with reference implementations they can paste into an existing pytest setup tonight.
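For readers unfamiliar with the first two tools, here is a minimal standard-library sketch. It is our own illustration, not the speakers' reference implementation. It assumes the usual combinatorial estimators for pass@k (at least one of k samples passes) and pass^k (all k samples pass), plus the standard Wilson score interval. The counts of 9 passes in 10 runs are hypothetical.

```python
from math import comb, sqrt

def pass_at_k(n, c, k):
    """Estimate P(at least one of k samples passes) from c passes in n runs."""
    return 1.0 - comb(n - c, k) / comb(n, k)

def pass_hat_k(n, c, k):
    """Estimate P(all k samples pass) from c passes in n runs."""
    return comb(c, k) / comb(n, k)

def wilson_interval(c, n, z=1.96):
    """Wilson score interval for a pass rate of c out of n (z=1.96 is about 95%)."""
    p = c / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width

# Hypothetical eval: 9 passes out of 10 runs of the same prompt.
at_5 = pass_at_k(10, 9, 5)    # 1.0: any 5 draws must include a pass
hat_5 = pass_hat_k(10, 9, 5)  # 0.5: half of all 5-run subsets contain the failure
low, high = wilson_interval(9, 10)
print(at_5, hat_5, round(low, 3), round(high, 3))
```

The same 90% raw pass rate reads as "always solvable" under pass@5 and as a coin flip under pass^5, and the Wilson interval of roughly 0.60 to 0.98 shows how little ten runs pin down.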

Core Python, Web

Stop Mocking, Start Containerizing

Tired of maintaining brittle mock objects that don't reflect production behavior? In this workshop, you'll learn how to replace mocks with real containerized services using Testcontainers for Python. Bring your laptop and a running Docker engine—we're going to get our hands dirty!
