Databricks buys Mooncake Labs 🤝 | BlackRock $4B deal for data centers 💼 | Cybercriminals exploit MS SQL Servers 🔓 | Microsoft AI Tour Toronto Review
Databricks is still shopping for companies

What’s in today’s newsletter:
Databricks buys Mooncake Labs for Lakehouse boost 🤝
BlackRock eyes $4B deal for data center expansion 💼
Cybercriminals exploit MS SQL Servers for ransomware attacks 🔓
Batch vs. streaming: balancing real-time and batch processing ⚖️
Data Lakehouse Market Hits $112.6B by 2035 📊
Also, check out the weekly Deep Dive - Microsoft AI Tour Toronto Review, and Everything Else in Cloud Databases.
BTW, congratulations to the prize winners from the subscriber survey (Sean, Gray, Mark, Dominique, Doug, Ellie, and Matt). Thank you for filling it out; your prizes have been sent.
Join 400,000+ executives and professionals who trust The AI Report for daily, practical AI updates.
Built for business—not engineers—this newsletter delivers expert prompts, real-world use cases, and decision-ready insights.
No hype. No jargon. Just results.
DATABRICKS

TL;DR: Databricks acquired Mooncake Labs to enhance its unified lakehouse platform, improving data management, performance, and integration, accelerating enterprise adoption, and challenging traditional data warehousing solutions.
Databricks acquired Mooncake Labs to enhance its unified lakehouse data platform for better data management.
Mooncake Labs' cloud-native expertise will improve performance, governance, and integration in Databricks’ platform.
The acquisition aims to accelerate enterprise adoption of lakehouse architecture by simplifying data integration challenges.
Databricks strengthens its position in next-gen data infrastructure, challenging traditional data warehousing solutions.
Why this matters: Databricks’ acquisition of Mooncake Labs accelerates the shift toward unified lakehouse architectures, offering enterprises a more scalable, integrated, and cost-effective alternative to traditional data warehouses. This development advances data-driven decision-making and could reshape future cloud data management strategies industry-wide.
INFRASTRUCTURE

TL;DR: BlackRock is negotiating a $4 billion acquisition of Aligned Data Centers, aiming to expand its data infrastructure portfolio amid rising cloud demand and increasing investor interest in data centers.
BlackRock is in talks to acquire Aligned Data Centers for approximately $4 billion, expanding its data infrastructure presence.
Aligned Data Centers serves major cloud providers and enterprises with hyperscale facilities supporting digital transformation.
Rising demand for cloud computing has fueled investor interest in data center real estate, making Aligned a prime acquisition target.
The deal would boost BlackRock’s scale in digital infrastructure, intensifying competition and potential consolidation in the data center sector.
Why this matters: BlackRock's potential $4 billion acquisition of Aligned Data Centers highlights the strategic pivot of financial giants toward digital infrastructure. This move underscores the surging value of data center assets amid cloud demand growth, signaling intensified competition and possible consolidation shaping the future of technology real estate investments.
SQL SERVER

TL;DR: Cybercriminals increasingly hijack vulnerable Microsoft SQL Servers through brute-force attacks and exploits to deploy ransomware; organizations should enforce patching, strong authentication, and network segmentation to prevent costly breaches.
Cybercriminals exploit Microsoft SQL Server vulnerabilities to hijack servers and deploy ransomware attacks.
Attackers use automated scripts, brute-force credential attacks, and exploit weak security controls.
Compromised servers enable lateral movement within networks to increase damage and ransom demands.
Organizations must implement regular patching, strong authentication, and network segmentation to prevent breaches.
Why this matters: The exploitation of MS SQL Server vulnerabilities for ransomware reveals a critical cybersecurity risk to business data and operations. Without proactive security practices like patching and strong authentication, organizations face increased financial loss, operational disruption, and escalating threats as attackers evolve their methods.
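Part of that advice is straightforward to audit. Below is a minimal sketch in Python, assuming pyodbc and an ODBC driver are installed, that flags SQL logins exempt from password policy enforcement using the standard sys.sql_logins catalog view; the server address and credentials are placeholders, not a real configuration.
```python
# Minimal audit sketch: flag SQL logins that bypass the OS password
# policy -- a common weak spot exploited by brute-force attacks.
# The connection string below is a placeholder for your own server.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=your-server.example.com;DATABASE=master;"
    "UID=auditor;PWD=change-me;Encrypt=yes"
)
cursor = conn.cursor()

# sys.sql_logins is a standard catalog view; is_policy_checked = 0
# means the login is exempt from password complexity enforcement.
cursor.execute(
    "SELECT name, is_disabled FROM sys.sql_logins "
    "WHERE is_policy_checked = 0"
)
for name, is_disabled in cursor.fetchall():
    status = "disabled" if is_disabled else "ENABLED"
    print(f"Login without password policy: {name} ({status})")
```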
DATA ENGINEERING
TL;DR: The article compares batch processing and streaming for big data, highlighting batch's simplicity for non-urgent analysis, streaming's real-time advantages, and hybrid approaches balancing both for optimal data handling.
Batch processing collects and analyzes large data sets at intervals, ideal when latency is not critical.
Streaming processes data in real-time, supporting applications like fraud detection and live monitoring.
Streaming systems are complex, requiring robust data consistency management amidst continuous data flow.
Many modern architectures combine batch and streaming to balance historical analysis with real-time insights.
Why this matters: Understanding when to use batch versus streaming data processing enables organizations to optimize analytics effectively. Streaming delivers critical real-time insights for urgent decisions, while batch suits extensive historical analysis. Hybrid approaches maximize strengths of both, enhancing data-driven responsiveness and operational efficiency in complex environments.
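To make the trade-off concrete, here is a toy Python sketch that runs the same event feed through both models; the event generator, window size, and alert threshold are hypothetical stand-ins for a real scheduler and message broker.
```python
# Toy contrast of batch vs. streaming on the same event source.
# Real systems would use a scheduler (batch) and a broker such
# as Kafka (streaming); both are simulated here.
import time
from collections import deque

def events():
    """Hypothetical event source: (timestamp, amount) pairs."""
    for i in range(10):
        amount = 500 if i == 7 else 100 + i  # one injected spike
        yield (time.time(), amount)

# Batch: collect the whole interval, then analyze once at the end.
batch = list(events())
print("batch total:", sum(amount for _, amount in batch))

# Streaming: react to each event as it arrives, keeping only a
# small sliding window of recent values for real-time checks.
window = deque(maxlen=3)
for ts, amount in events():
    window.append(amount)
    moving_avg = sum(window) / len(window)
    if amount > 1.5 * moving_avg:  # crude fraud-style alert
        print(f"alert: {amount} vs moving avg {moving_avg:.1f}")
```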
DATA LAKEHOUSE ARCHITECTURE
TL;DR: The data lakehouse market is projected to reach $112.6 billion by 2035, driven by cloud adoption and AI, enabling unified, efficient data management and real-time processing for competitive, cost-effective businesses.
The global data lakehouse market is expected to reach USD 112.6 billion by 2035, driven by big data needs.
Data lakehouses combine data lakes and warehouses, enabling efficient storage, processing, and unified data management.
Rising cloud adoption and AI advancements fuel demand for real-time data processing and seamless integration.
Market growth signals a shift to integrated data architectures, cutting IT complexity and enhancing competitive advantage.
Why this matters: The surge in the data lakehouse market underscores a critical shift toward unified data management, enabling businesses to leverage big data and AI more efficiently. This evolution reduces IT complexity and drives innovation, positioning companies for stronger competitiveness in an increasingly data-centric global economy.

EVERYTHING ELSE IN CLOUD DATABASES
Apache Airflow 3 Debuts on Amazon MWAA
NBA Teams Up with AWS for Advanced Data Tech
GraphStorm v0.5 boosts real-time fraud detection!
Aerospike DB 8.0 enhances real-time data speed
Snowflake Debuts Cortex AI for Finance Sector
Top Snowflake Competitors in Data Cloud Market
StreamNative unveils real-time AI lakehouse solution
PostgreSQL 18 brings UUIDv7, AI improvements, and more features
AI Vector Databases to Transform India's Search Landscape

DEEP DIVE
Microsoft AI Tour Toronto Review
This past Wednesday, I attended the 2025 edition of the Microsoft AI Tour Toronto along with a whole bunch of my teammates. Last week it was the Snowflake World Tour; it’s hard to keep track of all these conferences.
It was the usual Microsoft presentations, nothing earth-shattering. What intrigued me in one presentation were two technologies I had never heard of before: DiskANN and Microsoft Fabric SQL. The following is an overview of both:
1. DiskANN: Approximate Nearest Neighbor (ANN) Search
DiskANN is a state-of-the-art technology developed by Microsoft Research for performing highly efficient and accurate Approximate Nearest Neighbor (ANN) search. This technology is at the core of vector search capabilities in various Microsoft products, including Azure Cosmos DB. The earlier research project, Project Akupara, laid some of the foundational work in this area.
In essence, ANN search is a method used to find the most similar items (or "nearest neighbors") to a given query item in a large dataset, without having to compare the query item to every single item in the dataset. This is particularly useful in applications that deal with high-dimensional data, such as:
Semantic search: Finding documents or text passages that are semantically similar to a query.
Recommender systems: Suggesting products or content that are similar to what a user has previously interacted with.
Image and video search: Finding visually similar images or videos.
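For intuition, exact nearest-neighbor search is just a full scan of every vector, and that scan cost is what ANN indexes like DiskANN avoid while returning nearly the same answers. Here is a minimal exact version, assuming NumPy and a toy random dataset:
```python
# Exact nearest-neighbor search by brute force: compare the query
# against every vector. ANN indexes like DiskANN return (almost)
# the same answer while visiting only a tiny fraction of the data.
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.normal(size=(100_000, 128))  # toy embedding dataset
query = rng.normal(size=128)

# Cosine similarity against all rows -- O(n * d), the cost that
# becomes impractical at billions of vectors and motivates ANN.
norms = np.linalg.norm(vectors, axis=1) * np.linalg.norm(query)
scores = vectors @ query / norms
top5 = np.argsort(scores)[-5:][::-1]
print("nearest neighbors:", top5, "scores:", scores[top5])
```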
Key features of DiskANN include:
High performance and low latency: It is designed to provide fast search results, even over massive datasets with billions of vectors.
Cost-effectiveness: By utilizing SSDs and intelligent in-memory caching of quantized (compressed) vectors, DiskANN significantly reduces the memory requirements and, therefore, the cost of vector search.
Scalability and robustness: It can scale to handle very large datasets and is resilient to data updates, insertions, and deletions without requiring frequent and expensive index rebuilds.
Integration with Azure Cosmos DB: DiskANN is deeply integrated into Azure Cosmos DB, a globally distributed, multi-model database service. This integration allows developers to build powerful AI applications that combine transactional data with vector search capabilities in a single, unified platform.
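As a rough sketch of what that integration looks like from application code: Cosmos DB for NoSQL exposes vector search through a VectorDistance function in its query language, with DiskANN available as a vector index type. The account, container, field names, and embedding below are placeholders, and this assumes the azure-cosmos Python SDK.
```python
# Hedged sketch of a vector query against Azure Cosmos DB for NoSQL,
# where a DiskANN vector index can back VectorDistance() lookups.
# Endpoint, key, names, and the embedding are all placeholders.
from azure.cosmos import CosmosClient

client = CosmosClient("https://your-account.documents.azure.com", "key")
container = client.get_database_client("appdb").get_container_client("docs")

query_embedding = [0.12, -0.08, 0.33]  # would come from an embedding model

results = container.query_items(
    query=(
        "SELECT TOP 5 c.id, "
        "VectorDistance(c.embedding, @q) AS score "
        "FROM c ORDER BY VectorDistance(c.embedding, @q)"
    ),
    parameters=[{"name": "@q", "value": query_embedding}],
    enable_cross_partition_query=True,
)
for item in results:
    print(item["id"], item["score"])
```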
2. Microsoft Fabric SQL
Microsoft Fabric SQL is a fully managed, SaaS (Software-as-a-Service) operational database that is part of the broader Microsoft Fabric analytics platform. It is built on the same SQL Server engine as Azure SQL Database, providing a familiar T-SQL interface for developers.
Microsoft Fabric is an all-in-one analytics solution that integrates various data and analytics services, including data engineering, data integration, data warehousing, business intelligence, and real-time analytics. The SQL database in Fabric is designed to be a developer-friendly transactional database that seamlessly integrates with all other Fabric workloads.
Key features of Microsoft Fabric SQL include:
Unified Platform: Data in a Fabric SQL database is automatically replicated to OneLake, Fabric's unified data lake. This makes the data readily accessible to other Fabric services like Spark for data engineering and Power BI for business intelligence, without the need for complex data movement or ETL processes.
Simplified Management: As a SaaS offering, it automates many of the administrative tasks such as scaling, backups, and patching, allowing developers to focus on building applications.
Developer-Friendly: It supports the standard T-SQL language and provides a web-based SQL editor. It also offers modern data access methods like a built-in GraphQL API.
Cross-database querying: OneLake lets you join data from your SQL database with other Fabric sources, such as data warehouses and lakehouses, in a single query.
AI-Powered: Being part of the Fabric ecosystem, it benefits from built-in AI capabilities, including Copilot, which can assist with tasks like writing SQL queries.
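To illustrate the developer experience, here is a hedged sketch of connecting to a Fabric SQL database’s T-SQL endpoint with pyodbc and joining an operational table against a lakehouse table surfaced through OneLake; the endpoint, database, and table names are all hypothetical.
```python
# Hypothetical cross-source join: an operational table in the Fabric
# SQL database against a lakehouse table exposed via OneLake.
# Endpoint, database, and table names are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=your-workspace.datawarehouse.fabric.microsoft.com;"
    "DATABASE=SalesDb;Authentication=ActiveDirectoryInteractive"
)

sql = """
SELECT o.order_id, o.amount, p.category
FROM dbo.orders AS o
JOIN SalesLakehouse.dbo.products AS p   -- table surfaced via OneLake
  ON o.product_id = p.product_id
WHERE o.amount > 1000;
"""
for row in conn.cursor().execute(sql):
    print(row.order_id, row.amount, row.category)
```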
Just sharing some new stuff I found. I was not expecting much database news at an AI conference, but these were new to me nonetheless.
Gladstone Benjamin