Discover the Talks at PyCon Colombia 2026 ✨
Browse every accepted session—titles, tracks, levels, and speakers—before you plan your days in Medellín.
Machine Learning Applied to Genetic Sequences
DNA contains massive amounts of biological information, but how can artificial intelligence help us understand it? In this talk, we will explore how Python and Machine Learning can be used to analyze genetic sequences in a practical and beginner-friendly way. Using public biological datasets, we will demonstrate how DNA sequences can be transformed into data suitable for machine learning models, covering concepts such as feature extraction, sequence representation, and basic classification techniques. We will also review popular Python tools used in bioinformatics, including Biopython, pandas, and scikit-learn, while discussing real-world challenges when working with biological data, such as high dimensionality, noise, and interpretability limitations. By the end of the talk, attendees will have a clear understanding of how to start building genetic analysis projects using accessible tools from the Python ecosystem, even without prior bioinformatics experience.
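As a rough illustration of the sequence representation step mentioned above, the sketch below turns a DNA string into a fixed-length vector of k-mer counts, one common way to feed sequences to a classifier. It is not taken from the talk; the function name `kmer_counts` and the sample sequence are assumptions made for this example, and only the standard library is used.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence: str, k: int = 2) -> list[int]:
    """Count overlapping k-mers and return them in a fixed A/C/G/T order."""
    sequence = sequence.upper()
    # Slide a window of width k across the sequence.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # A fixed vocabulary keeps every feature vector the same length.
    vocabulary = ["".join(letters) for letters in product("ACGT", repeat=k)]
    return [counts[kmer] for kmer in vocabulary]


# Hypothetical sample sequence, for illustration only.
features = kmer_counts("ATGCGATA", k=2)
print(len(features), features)
```

A vector like this can be stacked into a pandas DataFrame or passed to a scikit-learn estimator; larger values of k capture more context at the cost of higher dimensionality.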
STUART: An Autonomous Hacker Agent Built in Python
What if you give a Python agent an IP address and ask it to find the server's vulnerabilities on its own? That's exactly what I did. In this talk I present STUART, an autonomous pentesting agent I built with AG2 (AutoGen) and GPT-4. The agent can analyze target systems without human intervention, following the first stages of the Cyber Kill Chain: reconnaissance and vulnerability identification. The architecture is 100% Python: an AssistantAgent backed by GPT-4 that reasons and plans, and a UserProxyAgent with a Code Executor that interacts directly with the target system. All orchestrated by AG2, the open-source framework for building multi-agent systems. The talk includes a live demo where STUART will analyze a vulnerable system deployed in Docker. You'll see step by step how the agent scans ports, identifies services, detects vulnerabilities, and reports findings—all autonomously, deciding for itself what to do at each step. You'll take away practical knowledge on how to build agents that act in the real world with AG2, and a concrete perspective on what offensive AI can do today. If a Python agent can find your vulnerabilities, how should defense teams prepare? All demonstrations are performed in controlled, ethical environments.
Vulnerable AI Systems: Real Data, Responsible Design
29% of attacks bypass the security filters of the most widely used LLMs in production. It's not a bug. It's the nature of the system. LLMs are stochastic processes trained on human language—the most flexible, ambiguous, and manipulable medium that exists. This talk presents the results of llm-break-bench: 3,360 adversarial tests on GPT-4o, Claude, Gemini, Grok, and DeepSeek using MLCommons AI Safety v0.5 and OWASP LLM Top 10 as standards. The smartest model in the benchmark is 5 times more vulnerable than the cheapest. The data connects to real use cases where LLMs are in production: RAGs, chatbots, agents, code assistants. The closing is actionable: 5 design pillars for AI systems that don't depend on the model for their own security, with real code from NVIDIA NeMo Guardrails and Meta LlamaFirewall.
From Voice to Action: Building an AI Assistant with Python and Google Workspace
Jumping between Gmail, Calendar, Drive, and Jira tabs for repetitive tasks is exhausting. That's why we built Attento, an end-to-end voice assistant that turns natural language into real actions across Google Workspace. In this talk we walk through how it is built: architecture with FastAPI, OAuth 2.0 authentication with PKCE, function calling with Gemini, streaming with NDJSON, best practices with uv and Pydantic Settings, and the path from demo to production with Postgres and automated morning briefings.
Juan Manuel Marín Bedoya
Senior Data Engineer @ Huge
Juliana Suárez Ávila
Data Scientist @ Cuesta Partners
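To make the PKCE part of the Attento abstract concrete, here is a minimal sketch of how a client might derive the `code_challenge` from a `code_verifier` with the S256 method described in RFC 7636. The function names are assumptions for this example and do not come from the talk or from any Google client library.

```python
import base64
import hashlib
import secrets


def challenge_for(verifier: str) -> str:
    """Derive the S256 code_challenge for a given code_verifier."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # Base64url encoding with the trailing "=" padding removed.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return a fresh (code_verifier, code_challenge) pair."""
    # 32 random bytes encode to a 43-character URL-safe verifier.
    verifier = secrets.token_urlsafe(32)
    return verifier, challenge_for(verifier)


verifier, challenge = make_pkce_pair()
print(len(verifier), challenge)
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is of little use without the verifier.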
Not Every Nail Needs an AI Hammer: Architectures That Think Before They Generate
We live in an era where everything "needs generative AI"... or so we're told. In this talk I'll cut through the hype to talk about what really matters: designing clean, intentional, and sustainable architectures. We'll explore how to combine the best of the traditional world with emerging tools without falling into over-engineering. Because sometimes a well-placed regex beats a multi-million-parameter LLM. If you're tired of seeing Ferraris parked at the supermarket, this talk is for you.
The GenAI Revolution Reaches RecSys
When we talk about the generative AI revolution, the conversation usually stays close to chatbots, image generation, and code assistants. But the same architectures that powered that wave (transformers, autoregressive modeling, scaling laws) are quietly reshaping fields most people don't associate with GenAI at all. Recommender systems are one of the most interesting examples. Meta, Netflix, Google, Spotify and others are replacing decades-old recsys pipelines with transformer-based foundation models, and the results are hard to ignore. This talk is a practical tour of that shift from a Python engineer's seat.
Hacking AI Agents with Python
Artificial intelligence is evolving from static models to autonomous systems capable of reasoning, making decisions, and executing actions through tools and APIs. These systems, known as AI agents, are primarily built in Python. But with this evolution comes a new attack surface. In this talk we'll explore how AI agents can be exploited from an offensive perspective, using Python to demonstrate real attacks such as: prompt injection in agent pipelines, information exfiltration through RAG, decision manipulation through adversarial inputs, and abuse of connected tools and APIs. From these scenarios, we'll show how to design security testing (pentesting) specific to AI systems, including black-box, gray-box, and white-box approaches. The talk won't focus only on attacks but also on how to mitigate them, presenting a practical roadmap to evaluate and strengthen AI systems in production. This session is aimed at Python developers, data scientists, and engineers building or integrating AI systems who want to understand how to secure what they're creating.
Employability in the Age of AI
Artificial intelligence is changing the job market faster than ever. Many developers wonder: will AI replace me or empower me? In this talk I'll share my real experience going from being a developer in Latin America to working for companies in the United States—facing interviews, optimizing my professional profile, and adapting to an environment where AI is already part of daily life. We'll explore how AI doesn't replace the developer but redefines the value we bring: from writing code to solving real problems, communicating ideas, and building complete solutions. The talk will cover the future of programming, how to shift your mindset toward AI, which skills really matter today, how to stand out in international hiring processes, the role of AI tools in your professional growth, and common mistakes that hold back your employability.
Structured Learning: AI-Powered Platform That Transforms Academic Papers into Interactive Learning Experiences
Structured Learning is a platform that turns a research paper into a complete learning module—chapter-by-chapter explanations, incremental executable code, RAG chat, FSRS spaced-repetition flashcards, equation derivations, and a knowledge graph in Neo4j. This talk covers the product, the engineering of an agentic workflow pipeline that takes a GitHub issue to a merged PR with isolated worktrees, auto-patching after failed review, and GitHub as the agents' API, and how it runs on AWS with LocalStack for dev-prod parity. Agents don't replace engineers—they replace the glue between engineers and the boring 80% of the SDLC—and that's where compound returns live.
Leverage your Python skill using the Python interpreter
In this talk, I'll challenge the audience's mindset about Python. Python is not an interpreter, and in fact, there are multiple Python interpreters—each with its own architecture and purpose. I'll walk through Python's core internals and show how programming languages interact beneath the surface. We'll explore how to write better Python by understanding the garbage collector, what you can build using the AST, how to read and leverage the disassembler, and the practical implications of Python's transition from its old LL(1) parser to the current PEG parser. We'll also dive into lesser-known features of Python interpreters, what a PEP really is and how it shapes the language, and conclude with a deep look at Python without the GIL—what changes, what breaks, and how the core team removed it. Throughout the talk, I'll share personal stories, including battles caused by identical ASTs and the moment I believed I had discovered a way to speed up the Python interpreter itself.
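The abstract mentions identical ASTs and the disassembler. As a small sketch of what that can look like (not the speaker's own example), the standard `ast` and `dis` modules show that redundant parentheses leave no trace in the syntax tree, while a change in grouping does. The helper `same_ast` and the function `add_one` are names invented for this illustration.

```python
import ast
import dis


def same_ast(source_a: str, source_b: str) -> bool:
    """Compare two snippets by their syntax trees, ignoring positions."""
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))


# Redundant parentheses are not represented in the tree.
redundant_parens = same_ast("total = (1 + 2)", "total = 1 + 2")
# Parentheses that change grouping produce a different tree.
different_grouping = same_ast("total = (1 + 2) * 3", "total = 1 + 2 * 3")


def add_one(value):
    return value + 1


# Opcode names vary between CPython versions, so we only list them.
opnames = [instruction.opname for instruction in dis.get_instructions(add_one)]
print(redundant_parens, different_grouping, opnames)
```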
Feeding the Invisible: Food Security in Intermediate Cities with Python
In many countries, food insecurity is not only a social problem but also a data problem. In Colombia, key monitoring systems have lost continuity, leaving critical information gaps for public decision-making. This talk presents the development of a Python prototype to build a monitoring and prediction system for food insecurity risk in intermediate cities, using only open data. From a reproducible pipeline, multiple data science components are integrated: ingestion and processing of food price data (SIPSA), time series models for price forecasting (including classical approaches and machine learning like XGBoost), household segmentation through clustering from socioeconomic surveys, construction of a composite index relating income, prices, and vulnerability, and development of a decision support system (DSS) prototype. Attendees will take away a replicable approach for building complex indicators, strategies for working with imperfect open data, ideas for integrating models, socioeconomic data, and visualization in a single system, and a real example of applying Python in public policy and territorial development.
Opening the Black Box: Mechanistic Interpretability of LLMs
As agents are deployed in high-stakes contexts (finance, manufacturing, healthcare), understanding how they make decisions—and not just what they decide—becomes fundamental to safety and trust. For example, when an agent receives the instruction "Search for our company's third-quarter results" and chooses to search internal documents instead of the public web, what internal process drives that choice? Answer engineering, behavioral testing, and chain-of-thought analysis describe correlations or narratives; none reveals the actual mechanism. Understanding how an agent reaches a conclusion is a critical component of developing AI responsibly, especially regarding reliability and transparency in AI systems. Model interpretability is one way developers can build trust and consistency in their systems and support the safe deployment of AI agents.
From Vibe Coding to Spec-Driven Development with AWOS in Claude Code
Vibe coding works great until it doesn't. When AI agents start ignoring your architecture, making wrong assumptions about your stack, and producing code that compiles but misses the point, the problem isn't the model. It's the instructions. This talk introduces AWOS (Agentic Workflow Operating System), an open-source framework built by Provectus for Claude Code that brings Spec-Driven Development to AI-assisted coding. AWOS structures the development process into 8 phases, each with its own specialized agent and audience. What you'll see: a live demo building a conference talk management app. What you'll take home: a tool you can install with npx @provectusinc/awos and start using immediately.
Vision-Language-Action Models: From Chatbots to Interaction with the Physical World
LLM-powered chatbots marked a before and after in artificial intelligence, enabling systems capable of understanding and generating natural language with great fluency. More recently, multimodal models expanded these capabilities by incorporating images, audio, and video, bringing AI closer to a more complete understanding of its environment. In this talk we'll explore Vision-Language-Action Models (VLA), architectures that combine computer vision, natural language, and decision-making to let intelligent agents interpret their environment and execute actions in the physical world. We'll also see how the Python ecosystem has become a fundamental piece for developing these solutions through modern tools like PyTorch, Hugging Face, robotic simulators, and open source frameworks currently used in robotics and multimodal artificial intelligence.
Provenance by Default: AI Media Pipelines in Python
A model can now generate a video that looks indistinguishable from one your camera recorded. The same is true for an image, a voice, or a song. As Python developers, we are building those pipelines, and we are also the ones who will be asked, very soon, to prove what came out of them. This talk is about building generative media pipelines in Python in a way that answers that question by default. We'll walk through Genblaze, an open-source SDK (github.com/backblaze-labs/genblaze, MIT licensed) that I work on at Backblaze, and use it as a vehicle to talk about the design problems any team faces when wiring AI generation into a real product. We will cover, with live code: the Pipeline pattern with a fluent Pipeline → Step → Run → Manifest API built on Pydantic v2; one API across eleven providers; provenance that survives the file with SHA-256-verified manifests embedded into PNG, JPEG, MP4, MP3, and WAV; privacy and policy controls; storage and replay; and agent loops with lineage. By the end, attendees will have a clear reference for how to architect generative-AI features in Python so that "What did this system actually produce, and can I prove it?" is a one-line answer instead of a ticket.
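The idea of a SHA-256-verified manifest can be sketched with the standard library alone. The code below is a generic illustration and is not the Genblaze API: `build_manifest`, `matches_manifest`, the field names, and the sample bytes are all assumptions, and embedding the manifest into the media file itself is out of scope here.

```python
import hashlib
import json


def build_manifest(media: bytes, provider: str, prompt: str) -> str:
    """Record how an asset was produced, plus a hash of its exact bytes."""
    record = {
        "provider": provider,
        "prompt": prompt,
        "sha256": hashlib.sha256(media).hexdigest(),
    }
    return json.dumps(record, sort_keys=True)


def matches_manifest(media: bytes, manifest_json: str) -> bool:
    """Check that the bytes on hand are the ones the manifest describes."""
    record = json.loads(manifest_json)
    return hashlib.sha256(media).hexdigest() == record["sha256"]


# Hypothetical asset and metadata, for illustration only.
original = b"\x89PNG fake image bytes"
manifest = build_manifest(original, provider="example-provider", prompt="a red bicycle")
print(matches_manifest(original, manifest))
print(matches_manifest(original + b"tampered", manifest))
```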
Building a Transformer with Rust
Transformers are often perceived as incomprehensible giants. This talk aims to prove the opposite: they are not black boxes but elegant mechanisms that can be understood and mastered from their fundamentals. We present Molinete AI, a GPT-2-style model built strictly from scratch in Rust. No deep learning frameworks—just tensors, math, and full control. Inspired by Feste from Tag1 Consulting (trained on Shakespeare), this project poses a different challenge: training the network on Miguel de Cervantes's work to generate text in the style of the Golden Age. Throughout the session we'll break the model down piece by piece. With the support of a Manim animated presentation (over 4,000 lines of code), we'll make visible how information flows inside the network. We'll start from tokenization (BPE) and building basic operations, then dive into the core of the model: embeddings, causal mask, and Multi-Head Self-Attention. Finally, we'll explore the learning process, watching how gradients flow through the network during training. More than a demo, this talk aims to provide a clear, operational view of Transformers, connecting theory with a real from-scratch implementation.
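The talk's implementation is in Rust; the sketch below uses Python only to illustrate the causal mask in self-attention with plain lists and no framework. It shows a single head without learned projections, and the function name `causal_attention` and the toy inputs are assumptions for this example.

```python
import math


def causal_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask."""
    dimension = len(queries[0])
    outputs = []
    for i, query in enumerate(queries):
        # Causal mask: token i only scores positions 0..i, never the future.
        scores = [
            sum(a * b for a, b in zip(query, keys[j])) / math.sqrt(dimension)
            for j in range(i + 1)
        ]
        # Numerically stable softmax over the visible positions.
        peak = max(scores)
        exps = [math.exp(score - peak) for score in scores]
        total = sum(exps)
        weights = [value / total for value in exps]
        # Weighted sum of the visible value vectors.
        outputs.append([
            sum(weight * values[j][column] for j, weight in enumerate(weights))
            for column in range(len(values[0]))
        ])
    return outputs


# Toy embeddings for three tokens, used as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = causal_attention(tokens, tokens, tokens)
print(attended)
```

Because the first token can see only itself, its output equals its own value vector, which is a quick sanity check for any from-scratch implementation.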
NLP Without Labels: How to Cluster N Legal Processes of the Colombian State and Turn Chaos into a Production Classifier
What do you do when you have 600,000 legal complaints, zero labeled data, and a government entity waiting for results? This talk walks through the full process of building an unsupervised NLP classification system for the Procuraduría General de la Nación. Starting from raw administrative text—noisy, full of abbreviations and institutional jargon—I'll show how TF-IDF, truncated SVD, and KMeans combined to organize more than half a million records into 64 semantically coherent groups, without a single manual label. But clustering is only the starting point. I'll cover how clusters were validated, how a Logistic Regression classifier was trained on them to make the system deployable, and how the final pipeline was packaged in a .pkl that non-technical colleagues use in production today. Along the way we'll face real problems: elbow curves that don't behave, 1:20 size imbalances between clusters, and the tension between mathematical elegance and institutional usability. Because in the public sector, a model nobody uses isn't a model—it's a PDF gathering dust.
Cost Optimization Strategies for GenAI with Python and AWS
Is it possible to scale Generative AI without project success compromising the organization's financial stability? This session will address how to transform the deployment of large language models (LLMs) through architecture design oriented toward operational efficiency. Instead of accepting high token consumption as an inevitable cost, we'll explore a sustainable cost model that lets you build intelligent, scalable applications without sacrificing profitability. Through a technical path centered on Python and AWS services, we'll analyze key strategies such as model arbitrage, where application logic dynamically decides which intelligence engine to use based on task complexity. We'll dive into how the smart use of low-impact vector databases and semantic caching reuses prior knowledge, achieving significant infrastructure savings. Attendees will discover how implementing async flows and batch processing optimizes available resources. This talk is a practical guide for architects and developers looking to lead the transition from costly prototypes to production systems that are technically and economically viable.
Python in the Browser: Powered by WebAssembly
What if the browser could run Python as a first-class language? In this talk, I'll show how PyScript makes it possible to execute real Python directly in the browser, powered by WebAssembly. Through a series of exciting, live examples, you'll see Python manipulating the DOM, calling browser APIs, and building interactive experiences, all without a traditional JavaScript codebase. I'll also show a couple of examples of how you can combine JavaScript and Python in PyScript to build even more exciting tools. Then I'll explain what WebAssembly is, why it exists, and how it enables languages like Python to run safely and efficiently on the web platform. Finally, I'll discuss when tools like PyScript make sense and compare it with similar tools. Whether you're a Python developer curious about the frontend, an engineer interested in WebAssembly, or simply someone who enjoys seeing the boundaries of Python pushed, this talk will change how you think about what can run in a browser.
From Expert Judgment to Autonomous Optimization: Encoding Human Expertise into LLM Judges with DSPy
A single misread clause in a reinsurance contract can mean millions in liability. Our LLM pipeline could extract and summarize these documents, but how do you know the output is actually correct? String matching fails ("USD 5,000,000" vs "$5M" scores zero), human review at scale is unaffordable, and a single LLM-as-judge prompt gives inconsistent, uncalibrated scores. The real bottleneck was never generation; it was evaluation. This talk shows how we solved it in two steps, both built entirely in Python. First, we encoded expert evaluation at scale using DSPy to distill judgments from five domain experts into a panel of calibrated LLM judges, each targeting a single quality dimension, weighted to reflect what experts actually care about. Then we closed the loop using DSPy's MIPROv2 and GEPA optimizers, wiring the judge panel as a fitness function and letting the system rewrite prompts autonomously, with regression guards and CI gates so humans review only the final score delta. The stack is Python-native: DSPy, MLflow, LiteLLM, Pydantic. You will leave with a concrete recipe for encoding expert knowledge into automated LLM evaluation and self-improving optimization, applicable to any domain where "correct" is nuanced.
Mateo Rios Querubin
Senior ML Engineer @ Provectus / Universidad EAFIT
Sebastián Gómez Ahumada
Middle ML Engineer @ Provectus
How We Stopped Answering Data Questions and Built the Stack That Answers Them
If you've worked at a growing startup, you probably know the feeling: multiple teams pulling different numbers for the same metric, ops constantly asking engineering for basic answers, and creating or organizing metrics being a real pain. Every new question feels like starting from scratch. This talk is the story of how a small team fixed that. First, by building a proper dbt architecture from scratch with Sources, Staging, Intermediate, and Marts so that things like bookings, revenue, and providers were defined in one place and everyone was looking at the same number. Once the data was reliable, we connected an LLM so non-technical teammates could ask questions in plain English and get real answers directly from Snowflake. No SQL, no ticket, no waiting on engineering. You'll walk away with a clear mental model for building a dbt layer people actually trust, a practical architecture for connecting an LLM to your warehouse, and the one thing that made it all click: your dbt docs are your LLM prompt.
Elevate your code quality in Python with modern, ultra-fast tooling
AI coding assistants have changed how we build software. We can now generate features, refactors, and entire services in minutes — but speed without strong engineering practices quickly becomes technical debt. In this talk, I'll show how modern Python teams can build fast and reliable development workflows using tools like Astral's Ruff, Ty, and uv. We'll explore how traditional slow and noisy quality pipelines are being replaced by a new generation of tooling that provides near-instant feedback while improving code quality and developer experience. Topics include why AI-generated code makes automated quality gates more important than ever, using Ruff for formatting and linting, using Ty for modern static typing, structuring formatter → linter → type-checker workflows, pre-commit hooks and CI pipelines developers actually enjoy using, and reducing friction between local development and CI/CD.
Real-Time Voice Systems: Design and Architecture in 5 Levels
Voice systems have advanced rapidly in recent years, but most implementations still stop at demos: simple combinations of Speech-to-Text, language models, and Text-to-Speech that work in controlled environments but fail when facing real-world conditions. This talk proposes a different approach: understanding voice systems as an architecture that evolves through maturity levels, from basic prototypes to real-time production-ready systems. Through a 5-level framework, we'll walk the full path of a Conversational AI system: from integrating basic components, through orchestration challenges (streaming, latency, turn-taking), to less obvious but critical problems like audio quality, robustness, and user experience, reaching real-time architectures with technologies like LiveKit, and finally exploring where the future is headed with end-to-end systems and multimodal agents. The talk is based on real experience building voice systems in production and focuses on engineering decisions more than specific tools. Attendees will leave with a clear understanding of how to design modern voice systems with Python, what problems to anticipate, and how to structure their own architectures to build world-class conversational experiences.
Python and Machine Learning for Sustainable Thermochemical Optimization
Chemical engineering still relies heavily on costly, slow experimental trials to evaluate operating conditions in thermochemical processes. This talk proposes a practical approach based on Python and machine learning to accelerate that process: building predictive models from physicochemical data that estimate key outcomes without testing every scenario in the lab. A complete flow oriented toward real applications will be shown, from data to decisions, with the goal of reducing analysis time, lowering experimental costs, and supporting process optimization with environmental impact.
From Typosquatting to Infrastructure Poisoning
In 2026, Python supply chain security has moved beyond misspelled package names to become an infrastructure battlefield. This talk analyzes the technical transition from simple Typosquatting attacks to sophisticated poisoning of CI/CD tools and runtime environments. We'll explore recent real cases such as the TeamPCP campaign and the Aqua Security Trivy compromise, analyzing persistence techniques through .pth files that enable malicious execution without an explicit import. Finally, we'll present the roadmap for modern defense: from Sigstore and PEP 740 to compliance with the Cyber Resilience Act (CRA).
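To see why `.pth` files matter for persistence: at startup, Python's `site` module executes any line in a `.pth` file that begins with `import` followed by a space or tab, and treats other non-comment lines as paths. The defensive sketch below flags such lines in the text of a `.pth` file; the function name and the sample content are assumptions for this example, and a real audit would need to go further.

```python
def executable_pth_lines(pth_text: str) -> list[str]:
    """Return the lines of a .pth file that the site module would execute."""
    flagged = []
    for line in pth_text.splitlines():
        # Lines starting with "import " or "import<TAB>" are executed;
        # comments, blank lines, and plain paths are not.
        if line.startswith(("import ", "import\t")):
            flagged.append(line)
    return flagged


# Hypothetical file content, for illustration only.
sample_pth = "\n".join([
    "# a comment",
    "/opt/extra/site-packages",
    "import os; os.environ.setdefault('DEMO_FLAG', '1')",
])

suspicious = executable_pth_lines(sample_pth)
print(suspicious)
```

Running a check like this over every `.pth` file in a site-packages directory is one cheap way to review what executes before any application code is imported.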
Clean Code in the Era of LLMs: Do Good Practices Still Matter?
Research from METR, CodeRabbit, and GitClear is converging on an uncomfortable truth: code duplication has quadrupled, copy-pasted code now exceeds moved code, bugs have risen 70%, and security issues have nearly tripled. AI didn't break our codebases. It amplified what was already broken. So what do we actually do about it? This talk makes the case that clean code, SOLID, DDD, TDD, and design patterns matter more than ever when LLMs write half the code. Your codebase is now a prompt: clean code leads to better AI suggestions, which make it easier to stay clean. We'll walk through which practices now matter more, which ones have quietly turned against you, and how to collaborate with an LLM without becoming a rubber stamp for its output. You'll leave with a concrete framework, Adversarial Collaboration: generate, critique, refactor, verify. Not vibe coding. Real engineering, just faster.
Understanding Cognitive Complexity in Python
Modern Python makes it incredibly easy to write code quickly, but much harder to keep it understandable as projects grow. This talk explores cognitive complexity: a metric focused not on what code does, but on how difficult it is for humans to read, reason about, and maintain. Through real Python examples, we will analyze how nested conditionals, branching logic, async flows, exceptions, and growing business rules silently increase the mental load required to work with a codebase. We will also discuss why traditional metrics such as cyclomatic complexity often fail to reflect actual readability, and how cognitive complexity provides a more human-centered perspective on maintainability. The talk includes practical refactoring techniques, common anti-patterns found in production Python projects, and lessons learned while building complexipy, an open source cognitive complexity analyzer for Python written in Rust, designed to provide fast local feedback and CI integration.
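As an illustration of the kind of refactoring the abstract refers to, the sketch below shows the same business rule written twice: once with nested conditionals and once with guard clauses. Cognitive complexity penalizes each extra level of nesting, so the flat version is the easier one to read even though both behave identically. The functions and the shipping rule are invented for this example and are not taken from the talk or from complexipy.

```python
def shipping_cost_nested(order):
    """Deeply nested version: every branch adds to the reader's mental stack."""
    if order is not None:
        if order["items"]:
            if order["country"] == "CO":
                if order["total"] >= 200_000:
                    return 0
                else:
                    return 12_000
            else:
                return 45_000
        else:
            raise ValueError("empty order")
    else:
        raise ValueError("missing order")


def shipping_cost_flat(order):
    """Guard clauses handle the edge cases first and keep nesting at one level."""
    if order is None:
        raise ValueError("missing order")
    if not order["items"]:
        raise ValueError("empty order")
    if order["country"] != "CO":
        return 45_000
    return 0 if order["total"] >= 200_000 else 12_000


# Hypothetical orders, for illustration only.
local_large = {"items": ["book"], "country": "CO", "total": 250_000}
local_small = {"items": ["book"], "country": "CO", "total": 50_000}
abroad = {"items": ["book"], "country": "MX", "total": 50_000}
print(shipping_cost_flat(local_large), shipping_cost_flat(local_small), shipping_cost_flat(abroad))
```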
Your AI Eval Is Lying To You
When you set temperature=0 and run your AI eval, you expect the same input to give the same output. It doesn't. Recent measurements on Qwen3-235B at temperature=0 produced 80 unique completions on a single prompt. So when your eval reports "92% pass rate," what does that actually mean? This talk is about the gap between how the AI eval ecosystem talks about scores and what those scores can actually support. We walk through five specific tools that fix the gap: Pass@k versus pass^k, Wilson confidence intervals, Bayesian pass@k with Beta-Binomial conjugacy, sequential drift detection with EWMA, CUSUM, and OLS, and multiple-comparison error control via Benjamini-Hochberg procedures. Each method gets a short demo in pure Python with no framework dependency. The audience leaves with reference implementations they can paste into an existing pytest setup tonight.
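Two of the tools named above fit in a few lines of standard-library Python. The sketch below is not the speaker's reference implementation: it uses the usual combinatorial estimators for pass@k (at least one of k samples passes) and pass^k (all k samples pass), plus the Wilson score interval for a pass rate. The function names and the sample numbers are assumptions for this example.

```python
import math


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate P(at least one of k samples passes) from n samples with c passes."""
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def pass_hat_k(n: int, c: int, k: int) -> float:
    """Estimate P(all k samples pass) from n samples with c passes."""
    return math.comb(c, k) / math.comb(n, k)


def wilson_interval(passes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial pass rate (z=1.96 is about 95%)."""
    rate = passes / n
    denominator = 1 + z * z / n
    center = (rate + z * z / (2 * n)) / denominator
    half_width = z * math.sqrt(rate * (1 - rate) / n + z * z / (4 * n * n)) / denominator
    return center - half_width, center + half_width


# With 5 passes out of 10 samples, the two metrics diverge quickly as k grows.
print(pass_at_k(10, 5, 2), pass_hat_k(10, 5, 2))
# A "92% pass rate" on 100 cases still leaves a wide plausible range.
low, high = wilson_interval(92, 100)
print(round(low, 3), round(high, 3))
```

The gap between the two k-sample metrics is the difference between "the model can solve this" and "the model solves this every time", which is why reporting only one of them can mislead.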
Lessons Learned Reporting Vulnerabilities in the Python Ecosystem
You've surely received that notification telling you to update a dependency due to a security flaw. But have you wondered what happens from the moment someone discovers that vulnerability until the patch reaches your project? In this talk I'll share my experience reporting vulnerabilities in the Python ecosystem. We'll go behind the scenes: from the technical finding and reporting process to collaboration with maintainers and patch publication. We'll address not only the technical aspects but also the human factor, both of which are crucial for effective vulnerability resolution. We'll also look at the challenges maintainers and the community face, especially in this new era of open source software security where artificial intelligence plays an increasingly relevant role.
Camila Plejia, Virtual Assistant Applied to People with Tetraplegia
Camila Plejia is a virtual assistant that combines several artificial intelligence tools and technologies (computer vision, OCR, NLP, RPA, speech-to-text, and text-to-speech) to help people with tetraplegia with daily tasks such as reading the news, checking the weather, checking, reading, and writing email, reading and sending WhatsApp messages, and searching for and watching a specific video on YouTube, among others. It gives a person with tetraplegia a window of communication with the outside world, considering that they spend much of their time isolated between four walls and depend on a third party's assistance to perform activities.