From S3 to AI Agent: Your First Queryable Lakehouse
AI agents are only as good as the data they can query. Many agents built today connect to outdated CSVs, unstructured databases, or nothing at all. What if your agent could query a real lakehouse, with versioning, schema evolution, and time travel, using natural language? In this workshop we build exactly that from scratch, using only open-source tools that run on your laptop.
Starting from a local Docker Compose stack, we stand up a working lakehouse with MinIO as S3-compatible storage, Apache Iceberg as the table format, Project Nessie as a Git-like versioned catalog, and Trino as the SQL query engine. On top of that, we build a Python MCP server that exposes the Iceberg tables as tools for an AI agent, and we connect Claude so it can query the lakehouse in natural language.
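To make the last step concrete, here is a minimal sketch of what such an MCP server could look like. It is not the workshop's actual code. It assumes the `mcp` (MCP Python SDK) and `trino` client packages are installed, and the host, port, user, catalog name `iceberg`, and schema name `demo` are hypothetical placeholders for whatever the Docker Compose stack exposes. The read-only guard is deliberately naive.

```python
import re

# Naive allow-list (an assumption of this sketch): accept only statements
# that start with a read-only keyword and contain a single statement.
_READ_ONLY = re.compile(r"^\s*(select|show|describe|with)\b", re.IGNORECASE)


def is_read_only(sql: str) -> bool:
    """Return True if the SQL looks like a single read-only statement."""
    body = sql.strip().rstrip(";")
    return ";" not in body and bool(_READ_ONLY.match(body))


def rows_to_markdown(columns, rows) -> str:
    """Render a result set as a small Markdown table the agent can read."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = ["| " + " | ".join(str(v) for v in row) + " |" for row in rows]
    return "\n".join([header, divider, *body])


def run_query(sql: str, max_rows: int = 50) -> str:
    """Run SQL on Trino and return at most max_rows rows as Markdown."""
    from trino.dbapi import connect  # third-party: pip install trino

    # All connection values below are hypothetical placeholders.
    conn = connect(host="localhost", port=8080, user="agent",
                   catalog="iceberg", schema="demo")
    cur = conn.cursor()
    cur.execute(sql)
    rows = cur.fetchmany(max_rows)
    columns = [col[0] for col in cur.description]
    return rows_to_markdown(columns, rows)


def build_server():
    """Create an MCP server that exposes one query tool to the agent."""
    from mcp.server.fastmcp import FastMCP  # third-party: pip install mcp

    server = FastMCP("lakehouse")

    @server.tool()
    def query_lakehouse(sql: str) -> str:
        """Run a read-only SQL query against the Iceberg tables via Trino."""
        if not is_read_only(sql):
            return "Rejected: only SELECT, SHOW, DESCRIBE and WITH are allowed."
        return run_query(sql)

    return server

# To serve over stdio (the transport Claude Desktop launches locally):
#   build_server().run()
```

Keeping the guard and the formatting as plain functions means they can be tested without Trino or an MCP client running. In a real deployment a dedicated read-only Trino user would be a safer boundary than a regex.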
Workshop requirements
Minimum hardware: laptop with at least 8 GB RAM (16 GB recommended) and 10 GB free disk space.
Required software (install before the event):
• Docker Desktop: https://www.docker.com/products/docker-desktop
• Python 3.11+: https://www.python.org/downloads
• Git: https://git-scm.com
• Claude Desktop (free): https://claude.ai/download
• VS Code (recommended): https://code.visualstudio.com
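If you want to verify part of this list before the event, a small pre-flight script along these lines may help. It is only a sketch: it checks the Python version, whether `docker` and `git` are on your PATH, and free disk space. It does not check RAM, Claude Desktop, or VS Code, and the function names are made up for this example.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)
MIN_FREE_GB = 10
REQUIRED_TOOLS = ("docker", "git")


def find_problems(python_version, tools_on_path, free_gb):
    """Compare the given facts with the workshop requirements."""
    problems = []
    major, minor = python_version[0], python_version[1]
    if (major, minor) < MIN_PYTHON:
        problems.append(f"Python {major}.{minor} found, 3.11+ required")
    for tool in REQUIRED_TOOLS:
        if tool not in tools_on_path:
            problems.append(f"{tool} not found on PATH")
    if free_gb < MIN_FREE_GB:
        problems.append(
            f"only {free_gb:.1f} GB free disk space, 10 GB required"
        )
    return problems


def check_this_machine(path="."):
    """Collect the facts from the current machine and check them."""
    tools = {tool for tool in REQUIRED_TOOLS if shutil.which(tool)}
    free_gb = shutil.disk_usage(path).free / 1e9
    return find_problems(sys.version_info, tools, free_gb)
```

Calling `print(check_this_machine() or "Looks ready")` prints either the list of problems or a short confirmation.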


