Bauplan Launches with $7.5M Seed Round to Simplify Data Infrastructure with Python-First Serverless Platform

Bauplan, a Python-first serverless data platform, today announced its public launch alongside $7.5 million in seed funding led by Innovation Endeavors. The round also includes participation from prominent data infrastructure operators, among them Wes McKinney, Aditya Agarwal, and Chris Ré, with Ihab Ilyas joining as an advisor.
Bauplan aims to simplify data infrastructure by allowing developers to write scalable data pipelines and applications in pure Python, without managing Spark, Kubernetes, or complex orchestration tools.
The platform introduces a serverless runtime that operates directly over object storage, supporting modern features such as git-like operations (branch, commit, merge) and zero-copy data versioning.
Rethinking Data Infrastructure for AI-Era Developers
Traditional big data platforms often require specialized skills in SQL, Spark, or cluster orchestration. Bauplan addresses this limitation by making scalable data processing accessible to application developers and software engineers familiar with CI/CD and Python—but not necessarily trained in big data.
Ciro Greco, CEO and Co-founder of Bauplan, said, “We’re building the missing layer between software engineering and data infrastructure — where deploying AI and data apps is as easy as writing Python.”
The founders, Ciro Greco, Jacopo Tagliabue, and Mattia Pavoni, previously built Tooso, which was acquired by Coveo. They bring decades of combined experience in machine learning and infrastructure, along with open-source contributions totaling over 50M downloads and 10K GitHub stars.
Core Features and Use Cases
Bauplan is designed for data-intensive domains like machine learning, media analytics, finance, and health tech, and is already in use by customers such as MFE-MediaForEurope, one of Europe’s largest broadcasters.
The platform provides serverless Python functions that operate over object storage, with native support for Apache Iceberg tables.
It enables CI/CD-style workflows for managing the data lifecycle, along with zero-copy branching and merging of datasets. Integrated data versioning supports collaboration and traceability.
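To make the idea of zero-copy branching concrete, the following is a minimal in-memory sketch of how a catalog might treat a branch as a pointer to an immutable snapshot of file references, so that creating a branch copies no data. This is not Bauplan's API or implementation: the `Snapshot` and `Catalog` classes, the method names, and the `s3://` paths are all hypothetical, and the merge shown is a simple fast-forward rather than whatever strategy the platform actually uses.

```python
from typing import Dict, Optional, Tuple


class Snapshot:
    """Immutable view: table name -> tuple of data file paths (hypothetical)."""

    def __init__(self, tables: Dict[str, Tuple[str, ...]],
                 parent: Optional["Snapshot"] = None) -> None:
        self.tables = dict(tables)
        self.parent = parent


class Catalog:
    """Toy catalog in which every branch is a name pointing at a Snapshot."""

    def __init__(self) -> None:
        self.branches: Dict[str, Snapshot] = {"main": Snapshot({})}

    def create_branch(self, name: str, from_branch: str = "main") -> None:
        # Zero-copy: the new branch references the same snapshot object.
        self.branches[name] = self.branches[from_branch]

    def commit(self, branch: str, table: str, files: Tuple[str, ...]) -> None:
        # A commit creates a new snapshot; only file references are stored.
        head = self.branches[branch]
        self.branches[branch] = Snapshot({**head.tables, table: tuple(files)},
                                         parent=head)

    def merge(self, source: str, into: str = "main") -> None:
        # Fast-forward only: the target head must be an ancestor of the source.
        target = self.branches[into]
        node: Optional[Snapshot] = self.branches[source]
        while node is not None and node is not target:
            node = node.parent
        if node is None:
            raise ValueError("target has diverged; not a fast-forward merge")
        self.branches[into] = self.branches[source]


catalog = Catalog()
catalog.commit("main", "orders", ("s3://bucket/orders/part-0.parquet",))

# Branching copies a pointer, not the data files.
catalog.create_branch("dev")
same_snapshot_after_branch = catalog.branches["dev"] is catalog.branches["main"]

# Work on the branch leaves main untouched until the merge.
catalog.commit("dev", "orders", ("s3://bucket/orders/part-0.parquet",
                                 "s3://bucket/orders/part-1.parquet"))
main_files_before_merge = catalog.branches["main"].tables["orders"]

catalog.merge("dev", into="main")
main_files_after_merge = catalog.branches["main"].tables["orders"]
```

In a CI/CD-style workflow, a validation step would typically run against the `dev` branch before `merge` is called, so that only checked data reaches `main`.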