Discover the Talks at PyCon Colombia 2026 ✨
Browse every accepted session—titles, tracks, levels, and speakers—before you plan your days in Medellín.
Future-proof Engineers with AI-DLC
AI is transforming not just what engineers build, but how they learn and grow. In this workshop, you'll discover AI-DLC (AI-Driven Learning Curriculum), a framework for creating personalized, adaptive learning paths for software engineers using AI tools. We'll explore how to design learning curricula that incorporate AI assistance, build skills that complement rather than compete with AI, and create development plans that keep engineers relevant and valuable for years to come.
Carlos Alberto Riveros Varela
Software Engineer @ EPAM Systems
Jesús Alfredo Reyes Vargas
Software Engineer @ EPAM Systems
STUART: An Autonomous Hacker Agent Built in Python
What if you give a Python agent an IP address and ask it to find the server's vulnerabilities on its own? That's exactly what I did. In this talk I present STUART, an autonomous pentesting agent I built with AG2 (AutoGen) and GPT-4. The agent can analyze target systems without human intervention, following the first stages of the Cyber Kill Chain: reconnaissance and vulnerability identification. The architecture is 100% Python: an AssistantAgent backed by GPT-4 that reasons and plans, and a UserProxyAgent with a Code Executor that interacts directly with the target system. All orchestrated by AG2, the open-source framework for building multi-agent systems. The talk includes a live demo where STUART will analyze a vulnerable system deployed in Docker. You'll see step by step how the agent scans ports, identifies services, detects vulnerabilities, and reports findings—all autonomously, deciding for itself what to do at each step. You'll take away practical knowledge on how to build agents that act in the real world with AG2, and a concrete perspective on what offensive AI can do today. If a Python agent can find your vulnerabilities, how should defense teams prepare? All demonstrations are performed in controlled, ethical environments.
Vulnerable AI Systems: Real Data, Responsible Design
29% of attacks bypass the security filters of the most widely used LLMs in production. It's not a bug. It's the nature of the system. LLMs are stochastic processes trained on human language—the most flexible, ambiguous, and manipulable medium that exists. This talk presents the results of llm-break-bench: 3,360 adversarial tests on GPT-4o, Claude, Gemini, Grok, and DeepSeek using MLCommons AI Safety v0.5 and OWASP LLM Top 10 as standards. The smartest model in the benchmark is 5 times more vulnerable than the cheapest. The data connects to real use cases where LLMs are in production: RAGs, chatbots, agents, code assistants. The closing is actionable: 5 design pillars for AI systems that don't depend on the model for their own security, with real code from NVIDIA NeMo Guardrails and Meta LlamaFirewall.
High-Performance Video Ingestion with Async Python
Video is one of the most demanding data types to process. In this workshop, you'll learn how to build high-performance video ingestion pipelines using Python's async capabilities. We'll cover asyncio fundamentals for I/O-bound video processing, concurrent frame extraction and processing, async queue patterns for data pipelines, performance profiling and optimization, and real-world deployment considerations. Build a production-grade async video ingestion system from scratch.
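As a rough illustration of the async queue pattern this abstract mentions, here is a minimal producer/consumer sketch using only the standard library. The frame source and the per-frame work are placeholders (integers standing in for decoded frames), and the function names are hypothetical rather than taken from the workshop material.

```python
import asyncio


async def read_frames(queue: asyncio.Queue, n_frames: int) -> None:
    """Producer: stands in for a decoder that yields frames from a stream."""
    for i in range(n_frames):
        await asyncio.sleep(0)  # placeholder for awaiting network or disk I/O
        await queue.put(i)      # blocks when the queue is full (backpressure)


async def process_frames(queue: asyncio.Queue, results: list) -> None:
    """Consumer: pulls frames forever; the caller cancels it when done."""
    while True:
        frame = await queue.get()
        try:
            results.append(frame * 2)  # placeholder for real per-frame work
        finally:
            queue.task_done()


async def ingest(n_frames: int = 10, n_workers: int = 3, max_queue: int = 4) -> list:
    queue: asyncio.Queue = asyncio.Queue(maxsize=max_queue)
    results: list = []
    workers = [
        asyncio.create_task(process_frames(queue, results))
        for _ in range(n_workers)
    ]
    await read_frames(queue, n_frames)
    await queue.join()  # wait until every queued frame has been processed
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return sorted(results)


if __name__ == "__main__":
    print(asyncio.run(ingest()))
```

The bounded queue is the design choice worth noticing: it keeps a fast reader from buffering an unbounded number of frames in memory while the workers catch up.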
Not Every Nail Needs an AI Hammer: Architectures That Think Before They Generate
We live in an era where everything "needs generative AI"... or so we're told. In this talk I'll cut through the hype to talk about what really matters: designing clean, intentional, and sustainable architectures. We'll explore how to combine the best of the traditional world with emerging tools without falling into over-engineering. Because sometimes a well-placed regex beats a multi-million-parameter LLM. If you're tired of seeing Ferraris parked at the supermarket, this talk is for you.
The GenAI Revolution Reaches RecSys
When we talk about the generative AI revolution, the conversation usually stays close to chatbots, image generation, and code assistants. But the same architectures that powered that wave (transformers, autoregressive modeling, scaling laws) are quietly reshaping fields most people don't associate with GenAI at all. Recommender systems are one of the most interesting examples. Meta, Netflix, Google, Spotify and others are replacing decades-old recsys pipelines with transformer-based foundation models, and the results are hard to ignore. This talk is a practical tour of that shift from a Python engineer's seat.
Hacking AI Agents with Python
Artificial intelligence is evolving from static models to autonomous systems capable of reasoning, making decisions, and executing actions through tools and APIs. These systems, known as AI agents, are primarily built in Python. But with this evolution comes a new attack surface. In this talk we'll explore how AI agents can be exploited from an offensive perspective, using Python to demonstrate real attacks such as: prompt injection in agent pipelines, information exfiltration through RAG, decision manipulation through adversarial inputs, and abuse of connected tools and APIs. From these scenarios, we'll show how to design security testing (pentesting) specific to AI systems, including black-box, gray-box, and white-box approaches. The talk won't focus only on attacks but also on how to mitigate them, presenting a practical roadmap to evaluate and strengthen AI systems in production. This session is aimed at Python developers, data scientists, and engineers building or integrating AI systems who want to understand how to secure what they're creating.
Executable Skills: How to Teach an Agent How Your Company Works
How do you make an AI agent that truly understands how your company works? In this workshop, you'll learn to design and implement executable skills—reusable, structured pieces of organizational knowledge that agents can invoke. We'll cover skill architecture, knowledge representation in Python, integrating skills with popular agent frameworks, and testing skill reliability. By the end, you'll have a blueprint for building a company brain that your agents can tap into.
Structured Learning: AI-Powered Platform That Transforms Academic Papers into Interactive Learning Experiences
Structured Learning is a platform that turns a research paper into a complete learning module: chapter-by-chapter explanations, incremental executable code, RAG chat, FSRS spaced-repetition flashcards, equation derivations, and a knowledge graph in Neo4j. This talk covers three things: the product itself; the engineering of an agentic workflow pipeline that takes a GitHub issue to a merged PR, using isolated worktrees, auto-patching after a failed review, and GitHub as the agents' API; and how the system runs on AWS, with LocalStack for dev-prod parity. Agents don't replace engineers. They replace the glue between engineers and the boring 80% of the SDLC, and that's where compound returns live.
Building Your First AI Tool Server: Creating a Pokédex with FastMCP and Python
Build your first AI tool server from scratch using FastMCP and Python, with the Pokédex as your guide! In this hands-on workshop, you'll learn the Model Context Protocol (MCP), set up a FastMCP server, implement custom tools that AI agents can call, and connect everything into a working Pokédex AI assistant. No prior MCP experience needed—just Python knowledge and a love for Pokémon.
The Fellowship of Agentic Evaluations: How to Evaluate an Agent?
How do you know if your AI agent is actually doing the right thing? In this workshop, we'll explore practical evaluation frameworks for agentic systems. Forming a fellowship of evaluation techniques—from simple unit tests to complex behavioral evaluations—we'll apply them to real agent scenarios. You'll learn to define evaluation criteria, implement automated test suites, measure agent performance quantitatively, and track improvement over time.
From S3 to AI Agent: Your First Queryable Lakehouse
AI agents are only as good as the data they can query. Most agents built today connect to outdated CSVs, unstructured databases, or nothing at all. What if your agent could query a real lakehouse—with versioning, schema evolution, and time travel—using natural language? In this workshop we build exactly that from scratch using only open-source tools that run on your laptop. Starting from a local Docker Compose stack, we stand up a functional lakehouse with MinIO as S3-compatible storage, Apache Iceberg as the table format, Project Nessie as a Git-like versioned catalog, and Trino as the SQL query engine. On top of that, we build a Python MCP server that exposes Iceberg tables as tools for an AI agent, and connect Claude so it can query the lakehouse in natural language.
Vision-Language-Action Models: From Chatbots to Interaction with the Physical World
LLM-powered chatbots marked a before and after in artificial intelligence, enabling systems capable of understanding and generating natural language with great fluency. More recently, multimodal models expanded these capabilities by incorporating images, audio, and video, bringing AI closer to a more complete understanding of its environment. In this talk we'll explore Vision-Language-Action Models (VLA), architectures that combine computer vision, natural language, and decision-making to let intelligent agents interpret their environment and execute actions in the physical world. We'll also see how the Python ecosystem has become a fundamental piece for developing these solutions through modern tools like PyTorch, Hugging Face, robotic simulators, and open source frameworks currently used in robotics and multimodal artificial intelligence.
Provenance by Default: AI Media Pipelines in Python
A model can now generate a video that looks indistinguishable from one your camera recorded. The same is true for an image, a voice, or a song. As Python developers, we are building those pipelines, and we are also the ones who will be asked, very soon, to prove what came out of them. This talk is about building generative media pipelines in Python in a way that answers that question by default. We'll walk through Genblaze, an open-source SDK (github.com/backblaze-labs/genblaze, MIT licensed) that I work on at Backblaze, and use it as a vehicle to talk about the design problems any team faces when wiring AI generation into a real product. We will cover, with live code: the Pipeline pattern with a fluent Pipeline → Step → Run → Manifest API built on Pydantic v2; one API across eleven providers; provenance that survives the file with SHA-256-verified manifests embedded into PNG, JPEG, MP4, MP3, and WAV; privacy and policy controls; storage and replay; and agent loops with lineage. By the end, attendees will have a clear reference for how to architect generative-AI features in Python so that "What did this system actually produce, and can I prove it?" is a one-line answer instead of a ticket.
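To make the idea of a SHA-256-verified manifest concrete, here is a minimal standard-library sketch. It is not the Genblaze API: the function names and manifest fields are hypothetical, and the manifest is kept alongside the media bytes rather than embedded into the file as the abstract describes.

```python
import hashlib
import json


def build_manifest(media: bytes, provider: str, prompt: str) -> dict:
    """Record what was generated, by which provider, from which prompt."""
    return {
        "sha256": hashlib.sha256(media).hexdigest(),
        "provider": provider,  # hypothetical field
        "prompt": prompt,      # hypothetical field
    }


def verify(media: bytes, manifest: dict) -> bool:
    """True only if the bytes still hash to the value stored in the manifest."""
    return hashlib.sha256(media).hexdigest() == manifest["sha256"]


if __name__ == "__main__":
    output = b"\x89PNG...fake image bytes"
    manifest = build_manifest(output, provider="example-provider", prompt="a red bicycle")
    print(json.dumps(manifest, indent=2, sort_keys=True))
    print(verify(output, manifest))            # unchanged bytes
    print(verify(output + b"edit", manifest))  # tampered bytes
```

A hash alone only shows that the bytes match the manifest. Proving who wrote the manifest would additionally require a signature, which this sketch leaves out.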
hls4ml: From Python Models to Hardware Acceleration
Bridge the gap between Python machine learning and hardware implementation using hls4ml. In this workshop, you'll learn how to take ML models trained in Python (TensorFlow, PyTorch, scikit-learn) and deploy them to FPGAs using the hls4ml library. We'll cover model quantization, hardware-aware training, the HLS synthesis workflow, performance profiling, and practical considerations for deploying ML at the edge. No prior FPGA experience required.
Jerónimo López Gómez
Researcher @ Universidad de Antioquia
Natalia Echeverri Durán
Researcher @ Universidad de Antioquia
Cost Optimization Strategies for GenAI with Python and AWS
Can you scale generative AI without a project's success compromising the organization's financial stability? This session will address how to transform the deployment of large language models (LLMs) through architecture design oriented toward operational efficiency. Instead of accepting high token consumption as an inevitable cost, we'll explore a sustainable cost model that lets you build intelligent, scalable applications without sacrificing profitability. Through a technical path centered on Python and AWS services, we'll analyze key strategies such as model arbitrage, where application logic dynamically decides which intelligence engine to use based on task complexity. We'll dive into how smart use of low-impact vector databases and semantic caching reuses prior knowledge, achieving significant infrastructure savings. Attendees will discover how implementing async flows and batch processing optimizes available resources. This talk is a practical guide for architects and developers looking to lead the transition from costly prototypes to production systems that are technically and economically viable.
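The "model arbitrage" idea in this abstract can be sketched as a small routing function. Everything here is an assumption made for illustration: the tier names, the prices, and the keyword heuristic are hypothetical, and a real system would likely use a stronger complexity signal. Semantic caching is not covered by this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    usd_per_1k_tokens: float  # hypothetical price, for illustration only


SMALL = ModelTier("small-model", 0.0005)
LARGE = ModelTier("large-model", 0.01)

# Hypothetical signal: words that suggest the task needs multi-step reasoning.
REASONING_HINTS = ("why", "explain", "compare", "analyze", "step by step")


def choose_model(prompt: str, long_prompt_words: int = 200) -> ModelTier:
    """Route to the cheap tier unless the prompt looks complex or very long."""
    text = prompt.lower()
    needs_reasoning = any(hint in text for hint in REASONING_HINTS)
    is_long = len(text.split()) > long_prompt_words
    return LARGE if (needs_reasoning or is_long) else SMALL


if __name__ == "__main__":
    for prompt in ("Translate 'hola' to English", "Explain why this query is slow"):
        print(prompt, "->", choose_model(prompt).name)
```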
Python in the Browser: Powered by WebAssembly
What if the browser could run Python as a first-class language? In this talk, I'll show how PyScript makes it possible to execute real Python directly in the browser, powered by WebAssembly. Through a series of live examples, you'll see Python manipulating the DOM, calling browser APIs, and building interactive experiences, all without a traditional JavaScript codebase. I'll also show a couple of examples of how you can combine JavaScript and Python in PyScript to build even more capable tools, and explain what WebAssembly is, why it exists, and how it enables languages like Python to run safely and efficiently on the web platform. Finally, I'll discuss when tools like PyScript make sense and how PyScript compares with similar tools. Whether you're a Python developer curious about the frontend, an engineer interested in WebAssembly, or simply someone who enjoys seeing the boundaries of Python pushed, this talk will change how you think about what can run in a browser.
From Expert Judgment to Autonomous Optimization: Encoding Human Expertise into LLM Judges with DSPy
A single misread clause in a reinsurance contract can mean millions in liability. Our LLM pipeline could extract and summarize these documents, but how do you know the output is actually correct? String matching fails ("USD 5,000,000" vs "$5M" scores zero), human review at scale is unaffordable, and a single LLM-as-judge prompt gives inconsistent, uncalibrated scores. The real bottleneck was never generation; it was evaluation. This talk shows how we solved it in two steps, both built entirely in Python. First, we encoded expert evaluation at scale using DSPy to distill judgments from five domain experts into a panel of calibrated LLM judges, each targeting a single quality dimension, weighted to reflect what experts actually care about. Then we closed the loop using DSPy's MIPROv2 and GEPA optimizers, wiring the judge panel as a fitness function and letting the system rewrite prompts autonomously, with regression guards and CI gates so humans review only the final score delta. The stack is Python-native: DSPy, MLflow, LiteLLM, Pydantic. You will leave with a concrete recipe for encoding expert knowledge into automated LLM evaluation and self-improving optimization, applicable to any domain where "correct" is nuanced.
Mateo Rios Querubin
Senior ML Engineer @ Provectus / Universidad EAFIT
Sebastián Gómez Ahumada
Middle ML Engineer @ Provectus
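The string-matching failure described in the DSPy talk above ("USD 5,000,000" vs "$5M") is easy to reproduce. The sketch below contrasts an exact-match score with a simple numeric normalization. It is only an illustration of the problem, not the speakers' judge panel: the helper names are hypothetical, the currency is ignored, and spelled-out forms such as "5 million" are not handled.

```python
import re
from typing import Optional

_SUFFIX = {"k": 1_000, "m": 1_000_000, "b": 1_000_000_000}


def exact_match(expected: str, actual: str) -> float:
    """Naive metric: 1.0 only when the two strings are identical."""
    return float(expected.strip() == actual.strip())


def parse_amount(text: str) -> Optional[float]:
    """Extract the first number, honoring thousands separators and k/m/b suffixes."""
    match = re.search(r"(\d[\d,]*(?:\.\d+)?)\s*([kmb])?\b", text.lower())
    if match is None:
        return None
    value = float(match.group(1).replace(",", ""))
    return value * _SUFFIX.get(match.group(2), 1)


def amounts_agree(expected: str, actual: str) -> bool:
    """True when both strings contain a number and the numbers are equal."""
    a, b = parse_amount(expected), parse_amount(actual)
    return a is not None and a == b


if __name__ == "__main__":
    print(exact_match("USD 5,000,000", "$5M"))
    print(amounts_agree("USD 5,000,000", "$5M"))
```

Normalization like this covers one narrow class of mismatch, which is part of why the abstract argues for calibrated judges rather than hand-written rules.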
Your LLM Is Bleeding Money and Python Can Stop It
Every token your LLM processes costs money, and without proper observability, costs can spiral out of control. In this workshop, you'll learn how to instrument your Python LLM applications to track token usage, latency, and cost per request. We'll build a complete observability stack using open-source tools, set up alerts for cost anomalies, and implement strategies to cut your LLM bill without sacrificing quality.
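As a minimal sketch of per-request instrumentation, the snippet below wraps a call, measures latency, and computes cost from token counts. The prices are hypothetical, `fake_llm` is a stand-in for a real client, and the assumption that the client returns token counts alongside the text is made for illustration only.

```python
import time
from dataclasses import dataclass, field

# Hypothetical USD prices per 1,000 tokens; real prices vary by provider and model.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}


@dataclass
class UsageLog:
    records: list = field(default_factory=list)

    def record(self, input_tokens: int, output_tokens: int, latency_s: float) -> float:
        cost = (
            input_tokens * PRICE_PER_1K["input"]
            + output_tokens * PRICE_PER_1K["output"]
        ) / 1000
        self.records.append(
            {
                "input_tokens": input_tokens,
                "output_tokens": output_tokens,
                "latency_s": latency_s,
                "cost_usd": cost,
            }
        )
        return cost

    def total_cost(self) -> float:
        return sum(r["cost_usd"] for r in self.records)


def tracked_call(log: UsageLog, llm_call, prompt: str) -> str:
    """Run llm_call(prompt) and log its latency, token usage, and cost."""
    start = time.perf_counter()
    text, input_tokens, output_tokens = llm_call(prompt)
    log.record(input_tokens, output_tokens, time.perf_counter() - start)
    return text


def fake_llm(prompt: str):
    """Stand-in client: returns (text, input_tokens, output_tokens)."""
    return "ok", len(prompt.split()), 1


if __name__ == "__main__":
    log = UsageLog()
    tracked_call(log, fake_llm, "summarize this contract clause")
    print(log.records[-1], log.total_cost())
```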
From Notebook to Production: End-to-End MLOps on Databricks
Move beyond Jupyter notebooks and deploy machine learning models to production using MLOps best practices on Databricks. In this intermediate workshop, you'll learn to structure ML projects for production, implement CI/CD pipelines for models, manage experiments with MLflow, deploy models as REST APIs, and monitor them in production. We'll walk through a complete end-to-end example from data preparation to automated retraining.
Patterns, Protocols and Tactics for Multi-Agent Systems
Master the essential patterns, protocols, and tactics for building robust multi-agent systems in Python. In this workshop, you'll learn proven architectural patterns for multi-agent collaboration, communication protocols between agents, error handling and recovery strategies, and practical implementation tactics. Drawing from real-world experience, we'll build multiple agent architectures and analyze their trade-offs—giving you a reusable toolkit for designing multi-agent systems.
Python and Machine Learning for Sustainable Thermochemical Optimization
Chemical engineering still relies heavily on costly, slow experimental trials to evaluate operating conditions in thermochemical processes. This talk proposes a practical approach based on Python and machine learning to accelerate that process: building predictive models from physicochemical data that estimate key outcomes without testing every scenario in the lab. We'll show a complete workflow oriented toward real applications, from data to decisions, with the goal of reducing analysis time, lowering experimental costs, and supporting process optimization that takes environmental impact into account.
NLP in Practice: From Corpus Linguistics to RAG with Python
Bridge the gap between traditional corpus linguistics and modern Retrieval-Augmented Generation (RAG) systems. In this workshop, researchers and developers will learn how classical NLP techniques—corpus analysis, tokenization, and annotation—can inform and improve RAG implementations. We'll use Python to build a pipeline that takes a text corpus from raw collection through linguistic analysis to a queryable RAG system, demonstrating how academic NLP foundations enhance practical AI applications.
Biviana Marcela Suárez Sierra
Researcher @ Universidad EAFIT
Dora Cecilia Alzate Gallo
Researcher @ Universidad EAFIT
Build an OpenClaw-style Coding Assistant on WhatsApp with Claude Agent SDK
Build a fully functional AI coding assistant that lives in WhatsApp, inspired by OpenClaw, using Claude's Agent SDK and Python. In this hands-on workshop, you'll learn to integrate the Claude Agent SDK with the WhatsApp Business API, design conversational flows for code assistance, handle multi-turn conversations with memory, and deploy your assistant to the cloud. Walk away with a working AI coding companion accessible from any device.
From Typosquatting to Infrastructure Poisoning
In 2026, Python supply chain security has moved beyond misspelled package names to become an infrastructure battlefield. This talk analyzes the technical transition from simple typosquatting attacks to sophisticated poisoning of CI/CD tools and runtime environments. We'll explore recent real cases such as the TeamPCP campaign and the Aqua Security Trivy compromise, analyzing persistence techniques through .pth files that enable malicious execution without an explicit import. Finally, we'll present the roadmap for modern defense: from Sigstore and PEP 740 to compliance with the Cyber Resilience Act (CRA).
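The .pth technique works because Python's `site` module executes any line in a `.pth` file that starts with `import` followed by a space or tab, every time the interpreter starts. A small defensive sketch that lists such lines for review is shown below. The function names are hypothetical, and a hit is not proof of compromise, since some legitimate packages also ship executable .pth lines.

```python
import sysconfig
from pathlib import Path


def executable_lines(pth_text: str) -> list:
    """Return (line_number, line) pairs that the site module would execute."""
    return [
        (number, line)
        for number, line in enumerate(pth_text.splitlines(), start=1)
        if line.startswith(("import ", "import\t"))
    ]


def scan_directory(directory: str) -> list:
    """Collect executable lines from every .pth file directly in a directory."""
    findings = []
    for pth in sorted(Path(directory).glob("*.pth")):
        text = pth.read_text(encoding="utf-8", errors="replace")
        for number, line in executable_lines(text):
            findings.append((pth.name, number, line))
    return findings


if __name__ == "__main__":
    # "purelib" is the site-packages directory of the running interpreter.
    for name, number, line in scan_directory(sysconfig.get_paths()["purelib"]):
        print(f"{name}:{number}: {line[:100]}")
```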
Clean Code in the Era of LLMs: Do Good Practices Still Matter?
Research from METR, CodeRabbit, and GitClear is converging on an uncomfortable truth: code duplication has quadrupled, copy-pasted code now exceeds moved code, bugs have risen 70%, and security issues have nearly tripled. AI didn't break our codebases. It amplified what was already broken. So what do we actually do about it? This talk makes the case that clean code, SOLID, DDD, TDD, and design patterns matter more than ever when LLMs write half the code. Your codebase is now a prompt: clean code leads to better AI suggestions, which make it easier to stay clean. We'll walk through which practices now matter more, which ones have quietly turned against you, and how to collaborate with an LLM without becoming a rubber stamp for its output. You'll leave with a concrete framework, Adversarial Collaboration: generate, critique, refactor, verify. Not vibe coding. Real engineering, just faster.
PyBlend: Towards an AI Food Scientist for Nutritional Product Design
Discover how Python and AI are transforming nutritional product design. In this workshop, you'll be introduced to PyBlend, a framework that models the complex optimization problem of designing nutritional formulations. We'll explore how machine learning algorithms can navigate vast ingredient spaces, balance nutritional constraints, and generate novel product formulations. Attendees will gain hands-on experience with AI-driven product design and learn how Python makes interdisciplinary AI applications possible.
Understanding Cognitive Complexity in Python
Modern Python makes it incredibly easy to write code quickly, but much harder to keep it understandable as projects grow. This talk explores cognitive complexity: a metric focused not on what code does, but on how difficult it is for humans to read, reason about, and maintain. Through real Python examples, we will analyze how nested conditionals, branching logic, async flows, exceptions, and growing business rules silently increase the mental load required to work with a codebase. We will also discuss why traditional metrics such as cyclomatic complexity often fail to reflect actual readability, and how cognitive complexity provides a more human-centered perspective on maintainability. The talk includes practical refactoring techniques, common anti-patterns found in production Python projects, and lessons learned while building complexipy, an open source cognitive complexity analyzer for Python written in Rust, designed to provide fast local feedback and CI integration.
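To illustrate the kind of refactoring this abstract refers to, here are two functions with identical behavior. Cognitive complexity metrics generally penalize each extra level of nesting, so the nested version would be expected to score higher than the guard-clause version. The example and its names are hypothetical, and no specific scores are claimed here.

```python
def discount_nested(user, cart_total):
    """Deeply nested version: the reader must track three open conditions."""
    if user is not None:
        if user.get("active"):
            if cart_total >= 100:
                return 0.10
            else:
                return 0.05
        else:
            return 0.0
    else:
        return 0.0


def discount_flat(user, cart_total):
    """Guard-clause version: each condition is resolved before the next begins."""
    if user is None or not user.get("active"):
        return 0.0
    if cart_total >= 100:
        return 0.10
    return 0.05


if __name__ == "__main__":
    shopper = {"active": True}
    print(discount_nested(shopper, 120), discount_flat(shopper, 120))
```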
Your AI Eval Is Lying To You
When you set temperature=0 and run your AI eval, you expect the same input to give the same output. It doesn't. Recent measurements on Qwen3-235B at temperature=0 produced 80 unique completions on a single prompt. So when your eval reports "92% pass rate," what does that actually mean? This talk is about the gap between how the AI eval ecosystem talks about scores and what those scores can actually support. We walk through five specific tools that fix the gap: pass@k versus pass^k, Wilson confidence intervals, Bayesian pass@k with Beta-Binomial conjugacy, sequential drift detection with EWMA, CUSUM, and OLS, and multiple-comparison error control via Benjamini-Hochberg procedures. Each method gets a short demo in pure Python with no framework dependency. The audience leaves with reference implementations they can paste into an existing pytest setup tonight.
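Two of the tools named in this abstract fit in a few lines of standard-library Python. The sketch below uses the common combinatorial estimators for pass@k (at least one of k samples passes) and pass^k (all k samples pass), plus the Wilson score interval. It is an independent illustration under those standard definitions, not the speaker's reference implementation.

```python
from math import comb, sqrt


def pass_at_k(n: int, c: int, k: int) -> float:
    """Chance that at least one of k samples passes, given c passes in n runs."""
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_hat_k(n: int, c: int, k: int) -> float:
    """Chance that all k samples pass (pass^k), given c passes in n runs."""
    return comb(c, k) / comb(n, k)


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a pass rate; z=1.96 gives about 95% coverage."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Same raw data, very different stories: 5 passes out of 10 runs, k=2.
    print(pass_at_k(10, 5, 2), pass_hat_k(10, 5, 2))
    # A "92% pass rate" measured on 100 cases spans roughly 0.85 to 0.96.
    print(wilson_interval(92, 100))
```

With 5 passes in 10 runs and k=2, pass@k is about 0.78 while pass^k is about 0.22, which shows why a reported score is hard to interpret without knowing which of the two was used.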
Beyond Vibe Coding: Spec Driven Development with Code Graphs
Go beyond vibe coding and learn how to use specifications and code graphs to guide AI-assisted development. In this workshop, you'll discover how structured specs and dependency graphs give AI coding tools the context they need to produce coherent, maintainable code. We'll work with real Python projects to define specs, generate code graphs, and wire them into your AI-assisted workflow—resulting in code that actually makes sense architecturally.
Esneider Bravo Benítez
Software Engineer @ Lendingfront
Jonathan Vallejo Muñoz
Software Engineer @ Lendingfront
Now or Never! Token Diet with TOON to Save Money and Help AI Understand More
Tokens cost money, and every unnecessary token you send to an LLM is money wasted. In this workshop, you'll learn how to put your AI applications on a token diet using TOON, a Python tool for creating compact, semantically rich data representations. We'll cover TOON's architecture, how to serialize complex data structures efficiently, measure token reduction, and integrate TOON into existing AI pipelines—without losing the information your models need.