Databricks has announced the general availability of Lakebase, a serverless database built on PostgreSQL that scales compute and storage independently. The service integrates with the Databricks platform and combines transactional and analytical capabilities in a hybrid architecture.
Streamlining Real-Time Applications
Lakebase aims to simplify the development of real-time applications and AI workloads by consolidating database management, analytics, and governance on a single platform. Key features include instant data branching, point-in-time recovery, and unified access controls, which are intended to speed up development, improve reliability, and keep operational and analytical data synchronized.
Databricks argues that traditional operational databases are ill-equipped to handle the demands of modern AI-driven applications. The company describes Lakebase as a new operational database architecture that runs lightweight, ephemeral compute on top of durable data lake storage. The Databricks team summarizes the problem with conventional databases:
“Because every query competes for the same fixed CPU and memory resources, a single query can affect all live operations. These constraints slow teams down and make it risky to work against live data. As applications become more automated and systems act on data in real time, this kind of shared, fragile infrastructure becomes an even bigger liability. To remove this architectural bottleneck, we created the lakebase category, a new architecture for operational databases that separates compute from storage.”
Lakebase is a managed PostgreSQL database service that is tightly integrated with the Databricks Data Intelligence Platform. The integration provides automatic scaling, branching, and compatibility with other Databricks services. Known primarily for its data analytics and AI capabilities centered around Apache Spark, Databricks has now expanded its Lakehouse solution with this offering. Matei Zaharia, CTO and co-founder of Databricks, wrote on LinkedIn:
“We believe this is going to make it radically simpler and more reliable to work with operational databases. You can instantly branch your database, take snapshots, roll back to a point in time, or create another copy for offline analysis, whether it’s humans doing the operations or agents. All while keeping the standard Postgres interface and extensions.”
The managed Lakebase service supports up to 8 TB per instance and is built on PostgreSQL 17, with pgvector included for AI-driven search. Use cases highlighted in the announcement include real-time feature serving for machine learning, persistent memory for AI agents, and embedded analytics.
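To make the pgvector mention more concrete, here is a minimal sketch of the kind of nearest-neighbor ranking that pgvector's cosine-distance operator (`<=>`) performs in SQL. The table name `docs`, the column names, and the sample vectors are hypothetical and do not come from the announcement. The Python functions only mirror the ranking locally and do not connect to Lakebase.

```python
import math

# Hypothetical query against a Postgres table with a pgvector column.
# `docs`, `id`, and `embedding` are illustrative names, not from the announcement.
EXAMPLE_SQL = "SELECT id FROM docs ORDER BY embedding <=> %s::vector LIMIT %s"


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity), the metric behind pgvector's <=> operator."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(rows, query, limit):
    """Return the ids of the `limit` rows closest to `query`, like ORDER BY ... LIMIT."""
    ranked = sorted(rows, key=lambda row: cosine_distance(row[1], query))
    return [row_id for row_id, _ in ranked[:limit]]


# Toy in-memory "table" of (id, embedding) pairs standing in for database rows.
rows = [
    ("a", [1.0, 0.0]),
    ("b", [0.0, 1.0]),
    ("c", [0.9, 0.1]),
]
print(nearest(rows, [1.0, 0.0], 2))
```

In a real deployment the ranking would run inside Postgres, typically backed by a vector index, rather than in application code.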
Lakebase has been under development since June 2025. It builds on technology from Neon, a PostgreSQL company that Databricks acquired, and was further extended through the acquisition of Mooncake, which improved PostgreSQL integration with lakehouse data. Lakebase is currently available in two versions: Autoscaling and Provisioned. Autoscaling is the newer version and the one where new features are being developed, while the Provisioned version continues to receive updates to existing capabilities.
Jeremy Daly, co-founder of Ampt and an AWS Serverless Hero, commented on the significance of Lakebase in his newsletter:
“Databricks is turning some heads with its new Lakebase serverless database. Separating storage and compute isn’t anything new, but using a Postgres interface to write directly to lakehouse storage in formats that Spark, Databricks SQL, and other analytics engines can immediately query without ETL is huge.”
For the Autoscaling version, billing is usage-based and calculated in Databricks Units (DBUs) according to the number of Capacity Unit hours the workload consumes. Customers can define a minimum and maximum autoscaling range and set a “scale to zero” timeout. Storage is billed separately.
Lakebase is now generally available for production use on AWS. Azure is currently in public preview, with full support expected soon, and Google Cloud is anticipated to follow later this year. SOC 2 and HIPAA certifications are projected for early 2026. High availability features, such as readable secondaries, are currently available only in the Provisioned version.
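The billing model described above can be sketched as simple arithmetic. The example below is illustrative only: the function names, the DBU-per-Capacity-Unit-hour rate, the price per DBU, and the exact scale-to-zero behavior are all assumptions, not published Databricks figures or APIs.

```python
def effective_capacity(demand_cu, min_cu, max_cu, idle_seconds, scale_to_zero_timeout):
    """Capacity Units in use: zero after the idle timeout, else demand clamped to the range.

    Assumption: compute suspends entirely once the workload has been idle
    for at least `scale_to_zero_timeout` seconds.
    """
    if idle_seconds >= scale_to_zero_timeout:
        return 0
    return max(min_cu, min(demand_cu, max_cu))


def estimate_compute_cost(cu_hours, dbu_per_cu_hour, price_per_dbu):
    """Convert Capacity Unit hours to DBUs, then to a cost. Storage is not included."""
    dbus = cu_hours * dbu_per_cu_hour
    return dbus * price_per_dbu


# Hypothetical hourly samples of (demand in CUs, seconds idle so far).
samples = [(10, 0), (2, 0), (0, 600), (0, 4200)]

# Each sample covers one hour, so the sum of capacities is the CU-hour total.
cu_hours = sum(
    effective_capacity(demand, min_cu=1, max_cu=4,
                       idle_seconds=idle, scale_to_zero_timeout=300)
    for demand, idle in samples
)

# Placeholder rates chosen only to keep the arithmetic easy to follow.
cost = estimate_compute_cost(cu_hours, dbu_per_cu_hour=1.0, price_per_dbu=0.5)
print(cu_hours, cost)
```

The sketch covers compute only, since the announcement states that storage is billed separately.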