🚨SAP to Buy Dremio🚨|Databricks and BigQuery unify data catalogs|Vector DBs Decline|BigQuery boosts AI

Deep Dive: Real World Agentic Databases


Databricks and BigQuery unify data catalogs collaboratively 🌐

Vector databases integrate into broader AI platforms now ☁️

Google Cloud adds AI, automating BigQuery queries ⚙️

Also, check out the weekly Deep Dive: Real World Agentic Databases

The best prompt engineers aren't typing. They're talking.

Power users figured this out early: speaking a prompt gives you 10x more context in half the time. You include the edge cases, the examples, the tone you want — because talking is fast enough that you don't skip them.

Wispr Flow captures everything you say and turns it into clean, structured text for any AI tool. Speak messy. Get polished input. Paste into ChatGPT, Claude, Cursor, or wherever you work.

89% of messages sent with zero edits. 4x faster than typing. Works system-wide on Mac, Windows, and iPhone.

CLOUD DATABASES

TL;DR: SAP intends to acquire Dremio, bringing Dremio’s Iceberg-native, agentic lakehouse capabilities into SAP Business Data Cloud to unify SAP application data with broader enterprise data for governed analytics and AI agents.

  • SAP plans to acquire Dremio, pending regulatory approval, to strengthen its open data platform strategy and accelerate its agentic AI vision.

  • Dremio’s platform brings lakehouse, query federation, Apache Iceberg, Apache Arrow, and Apache Polaris capabilities that help enterprises analyze data across lakes, warehouses, databases, and SaaS systems.

  • The combination is intended to let SAP customers combine SAP application data with non-SAP enterprise data in a single governed platform for analytics, AI agents, and business workflows.

  • Dremio says SAP remains committed to open-source technologies including Apache Iceberg, Apache Polaris, and Apache Arrow, positioning the acquisition around interoperability and open data architectures.

Why this matters: SAP’s planned acquisition of Dremio signals a major push to make the lakehouse central to enterprise AI and analytics. By combining SAP’s application data, business context, and AI agent strategy with Dremio’s Iceberg-native lakehouse and query federation capabilities, SAP could give customers a more unified way to govern, analyze, and activate data across complex enterprise environments. This also reinforces the growing importance of open table formats, metadata catalogs, and agent-ready data platforms in the next phase of cloud data architecture.

DATA GOVERNANCE

TL;DR: Databricks and Google BigQuery integrated their data catalogs, enabling unified metadata, consistent security, and seamless cross-cloud data sharing to enhance collaboration and support hybrid multi-cloud strategies.

  • Databricks and Google BigQuery have partnered to integrate their data catalog services for unified data management.

  • The integration enables synchronization of metadata and consistent security policies across Databricks and BigQuery platforms.

  • Users can discover and share datasets seamlessly across cloud environments, boosting collaboration and data accessibility.

  • This collaboration supports hybrid multi-cloud strategies, enhancing data governance and accelerating data-driven decisions.

Why this matters: The Databricks-BigQuery catalog integration streamlines data governance and discovery across clouds, breaking down silos that hinder enterprise collaboration. This advancement empowers hybrid cloud strategies, enabling faster, more secure, and unified access to data—critical for informed, agile decision-making in increasingly complex IT environments.

VECTOR DATABASES

TL;DR: Organizations and tech giants are moving away from standalone vector databases, integrating vector search into broader platforms for better scalability, simplicity, and AI service efficiency, signaling commoditization and infrastructure evolution.

  • Many organizations are moving away from standalone vector databases due to scalability and complexity challenges.

  • Tech giants embed vector search into broader database and machine learning platforms for streamlined AI infrastructure.

  • Hybrid architectures favor integrating vector search with traditional data stores, reducing operational overhead.

  • The shift signals vector search becoming a commoditized feature of broader data platforms, driving innovation in combined data management solutions.

Why this matters: The decline of standalone vector databases reflects AI infrastructure maturation, pushing the industry toward integrated, hybrid data architectures. This reduces complexity and boosts scalability, fostering innovation and cost-efficiency, ultimately reshaping how enterprises manage vector data and accelerating practical AI application development.

GOOGLE CLOUD PLATFORM

TL;DR: Google Cloud enhanced BigQuery with generative AI, autonomous agents, and new connectors, enabling natural language queries, automated data workflows, and improved interoperability for smarter, faster enterprise decision-making.

  • Google Cloud integrates generative AI into BigQuery, enabling natural language queries and reducing SQL reliance.

  • Enhanced support for autonomous AI agents facilitates dynamic querying and automated data-driven decision-making.

  • New connectors and APIs improve interoperability between BigQuery, AI models, and external applications.

  • These updates promote automated, intelligent data operations, boosting efficiency and competitive advantage in enterprises.

Why this matters: By embedding generative AI and autonomous agents into BigQuery, Google Cloud revolutionizes data analysis, simplifying access and accelerating insights. This fosters smarter, faster decision-making, reduces reliance on specialized skills, and enhances enterprise agility, marking a major leap toward fully automated, intelligent data ecosystems in business.


DEEP DIVE

Real World Agentic Databases

I watched a video this week.

What I saw in that video absolutely terrified me, annoyed me, and at the same time convinced me that things like this were certain to come to pass.

Just imagine talking to a prompt and spinning up 10 databases in one uttered sentence.

The resulting 10 databases weren't created for some OLTP use case, but for an agentic AI use case.

And then, on a whim, destroying 10 databases by speaking to your computer.

I was a de facto DBA for maybe the first 15 years of my career.

I take pride in never having nuked a database in my whole career, and I took great care to always have backups available, from physical tapes stored off site to cloud-based Azure restoration mechanisms.

So consider that now, with what I call agentic databases, the notion of the micromanaging DBA has changed dramatically.

So with that ramble and preamble, welcome to Ghost (Towards Data Science beat me to the punch by a day).

In a nutshell, Ghost is positioned as “a database for AI agents”: a cloud Postgres service where developers or coding agents can quickly create, fork, query, tune, and delete databases through a CLI or MCP server.

It fits the current moment because developers increasingly use tools like Codex, Claude Code, Cursor, Windsurf, and VS Code agents to build applications, run experiments, and prototype systems. Ghost gives those agents a database environment they can manipulate directly.

What Ghost actually does

Ghost provides unlimited Postgres databases and forks, with the ability to create and discard them freely. Its official quick start is built around commands such as ghost login, ghost mcp install, ghost create, and ghost list; the docs also list commands for connecting, forking, pausing, resuming, deleting, viewing logs, and managing API keys.
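For a feel of that lifecycle, here is a minimal first session sketched purely from the quick-start commands named above; arguments and output are omitted because I haven't verified them:

    # Authenticate, then wire Ghost into a coding agent via MCP
    ghost login
    ghost mcp install

    # Spin up a disposable Postgres database and list what exists
    ghost create
    ghost list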

The big design idea is that a database should be as easy to spin up as a code branch or a temporary sandbox.

Instead of treating every database as long-lived infrastructure that requires provisioning, dashboards, credentials, and manual lifecycle management, Ghost treats databases as programmable, disposable environments for development, experimentation, testing, and agent-driven workflows. Disposable is the part that gets me.

Why it matters

Traditional managed databases were designed for human administrators and production systems, while AI agents need something different: fast, isolated, scriptable database environments that can be created and destroyed repeatedly.

Ghost’s MCP support is central here because it lets coding agents interact with database operations directly rather than forcing the human to manually create databases, copy connection strings, run migrations, or clean up test environments.

For example, an AI coding agent could create a database, generate a schema, seed test data, run queries, test an index, fork the database, compare performance, and discard the losing version.

That is a very different workflow from manually provisioning a Postgres instance in RDS, Azure Database for PostgreSQL, Supabase, or Neon for every experiment.
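Sketched as a shell session, that agent experiment loop might look something like this. Treat it as illustrative only: the fork and delete subcommand spellings are my assumptions, since only the quick-start commands are confirmed above.

    # Hypothetical index experiment, driven by a coding agent
    ghost create    # fresh database; the agent loads schema and seed data
    ghost fork      # branch a copy to try the index change on (assumed name)
    # ...the agent runs the same workload on both copies and compares timings...
    ghost delete    # discard whichever version lost (assumed name)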

Best way to interpret it

Ghost is not a replacement for enterprise production databases yet. It is better understood as an agent-native Postgres workbench: useful for prototyping, temporary environments, app scaffolding, schema experimentation, test data generation, migration testing, and AI-assisted development.

The daily.dev write-up explicitly frames it as “ideal for prototyping, testing, and agent-driven experimentation rather than production workloads.”

This is exciting, and a little disconcerting at the same time. At this point, we just have to understand why this new paradigm has evolved.

Gladstone Benjamin

🚀 Work With Cloud Database Insider

Looking to reach enterprise data engineers and architects?

Limited sponsorship slots available each month.