- Cloud Database Insider
AWS vs Azure vs GCP for Data & AI | 8 Databases Tested | MySQL stagnates
Deep Dive: A Look Into Microsoft Fabric

What's in today's newsletter:
AWS vs Azure vs GCP for Data & AI Workloads
8 Databases Tested: Surprising Results Revealed
MySQL stagnates; community pushes for innovative forks
Also, check out the weekly Deep Dive - Microsoft Fabric
Become An AI Expert In Just 5 Minutes
If you're a decision maker at your company, you need to be on the bleeding edge of, well, everything. But before you go signing up for seminars, conferences, lunch 'n learns, and all that jazz, just know there's a far better (and simpler) way: subscribing to The Deep View.
This daily newsletter condenses everything you need to know about the latest and greatest AI developments into a 5-minute read. Squeeze it into your morning coffee break and before you know it, you'll be an expert too.
Subscribe right here. It's totally free, wildly informative, and trusted by 600,000+ readers at Google, Meta, Microsoft, and beyond.
CLOUD DATABASES

TL;DR: AWS, Azure, and GCP each excel at AI workloads in different ways (AWS in scalability, Azure in enterprise integration, and GCP in advanced research), making cloud choice essential for aligning with specific AI project goals.
AWS excels in scalability and offers a broad range of AI services including SageMaker for model building and training.
Azure provides strong enterprise support with seamless integration of Azure Machine Learning and Microsoft software tools.
GCP is recognized for advanced AI research, TensorFlow integration, and innovative tools like Vertex AI simplifying ML workflows.
Choosing the right cloud platform depends on organizational needs for scalability, integration, cost, and AI development goals.
Why this matters: Selecting the ideal cloud provider (AWS, Azure, or GCP) directly influences AI project success by balancing scalability, integration, cost, and innovation. Matching platform strengths to business goals ensures optimized performance and efficiency, crucial as AI drives competitive advantage and operational transformation across industries.

TL;DR: Testing eight databases with the same query revealed relational DBs excel in joins and transactions, NoSQL offers scalability and flexibility, while Redis is fastest but best as a cache, emphasizing trade-offs.
The author tested eight databases with the same query to compare performance, usability, and data suitability.
Relational databases like MySQL and PostgreSQL excelled at complex joins and transactional stability.
NoSQL databases like MongoDB and Cassandra offered schema flexibility and scalability but struggled with complex joins.
Redis delivered ultra-fast access but functions better as a cache rather than a primary data store.
Why this matters: This experiment reveals the critical trade-offs between database types, emphasizing that selection depends on specific application needs like query complexity, scaling, and consistency. It encourages developers to perform real-world testing to avoid mismatches that could undermine performance and reliability in production.
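The article's methodology (run the same query against many engines and compare) is easy to reproduce at small scale. Below is a minimal sketch of that benchmarking approach; it uses Python's built-in sqlite3 as a stand-in engine, and the schema, row counts, and query are invented for illustration. For MySQL, PostgreSQL, MongoDB, and the rest, you would swap in each engine's own client library and query dialect.

```python
import sqlite3
import time

def time_query(conn, sql, runs=100):
    """Run the same query repeatedly and return the best wall-clock time."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return best

# In-memory SQLite stands in for a real engine here; for each of the eight
# databases you would open a connection with that engine's client library.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1000)])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(i, i % 1000, float(i)) for i in range(5000)])

# The "same query" across engines: a join typical of relational workloads,
# which is exactly where NoSQL stores tend to struggle.
elapsed = time_query(
    conn,
    "SELECT u.name, SUM(o.total) FROM users u "
    "JOIN orders o ON o.user_id = u.id GROUP BY u.id",
)
print(f"best of 100 runs: {elapsed:.6f}s")
```

Taking the best of many runs, rather than a single measurement, reduces noise from caching and scheduling, which matters when the differences between engines are small.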
RELATIONAL DATABASE

TL;DR: MySQL's slowed development and weak community engagement have sparked concerns, prompting calls for forks and risking loss of dominance to more innovative, agile competitors like MariaDB and PostgreSQL.
MySQL's development has slowed, causing concern among its open-source community over limited features and innovation.
Community members criticize slower release cycles and poor responsiveness compared to competitors like MariaDB and PostgreSQL.
Some advocates are pushing for forks and new projects that emphasize openness and reinvigorate community collaboration.
Stagnation threatens MySQL's dominance and relevance, risking migration to more agile, community-driven database solutions.
Why this matters: MySQL's slowed innovation and poor community engagement threaten its leadership amid competitors advancing faster. This risks migrating users to more dynamic, community-driven databases, potentially fracturing its ecosystem and impacting future cloud-native developments reliant on adaptable, continuously evolving database technologies.

EVERYTHING ELSE IN CLOUD DATABASES
Top SQL Beautifiers: Clean Code Online Fast!
Datadog 2026: AI Agents Boost Platform Growth
Apache Polaris joins top Apache projects!
Microsoft Fabric boosts Emabler's data insights fast
ScyllaDB boosts Fanatics' performance and cuts costs
ClickHouse Boosts PostgreSQL with PG_ClickHouse Extension
BigQuery adds cross-region global queries
Google Cloud launches managed MCP servers for databases
Google Spanner adds Cassandra Query Language API support
Boost Neo4j with Gemini CLI Extension
Super Bowl LX powered by smart data tech
Yieldmo slashes database costs, cuts cloud reliance
Kong Launches Context Mesh for API Discovery
Memgraph's Atomic GraphRAG boosts multi-source data
QuestDB boosts HDFC Bank's real-time risk analytics
Zilliz Cloud Expands AI Reach Across Europe

DEEP DIVE
A detailed look into Microsoft Fabric. Part One: an introduction to Microsoft Fabric
Over the last couple of days, I took some time off from the day job for a little retreat of sorts, to reflect on the state of the newsletter. What became apparent to me is that the coverage has skewed over time toward the twin behemoths of Snowflake and Databricks.
Truth be told, these two companies are genuinely innovative, and it is honestly challenging to keep track of their feature sets. Plus, my contacts at both companies have always been helpful to me and my team (and their events are pretty good too).
But there is a third contender on the rise to challenge Snowflake and Databricks, which of course is Microsoft Fabric.
What is Microsoft Fabric, you may ask? Is it just a rebranding of services and features that already existed, or is it a reworking of those systems for better integration?

Let's take a high-level look into Microsoft Fabric.
What is Microsoft Fabric?
At its core, Microsoft Fabric is an all-in-one, Software-as-a-Service (SaaS) analytics platform. Historically, organizations had to manually stitch together various Platform-as-a-Service (PaaS) tools for data ingestion, engineering, warehousing, and business intelligence. This created a costly and fragile "integration tax." Fabric eliminates that burden by collapsing all of these distinct data-lifecycle stages into one cohesive environment.
The Foundation: OneLake and the âOne Copyâ of Data
The biggest fundamental shift in Fabric is its storage layer, OneLake, positioned as the "OneDrive for data." Fabric operates on a strict single-copy-of-data philosophy. Instead of moving or duplicating data to fit different tools, all tabular data is stored natively in open formats (Delta Parquet). This means a data engineer can transform data using Apache Spark, a SQL analyst can query it using T-SQL, and a business user can build Power BI dashboards, all simultaneously, on the exact same underlying dataset.
How Fabric Challenges Snowflake and Databricks
Databricks pioneered the open lakehouse architecture and excels at complex machine learning, AI, and heavy data-engineering workloads. Snowflake is the cloud-native data-warehouse champion, renowned for its elastic separation of compute and storage, multi-cloud flexibility, and massive SQL concurrency.
Microsoft Fabric challenges both by offering a highly integrated "walled garden" approach, ideal for organizations already deeply invested in the Microsoft ecosystem (Azure, Microsoft 365, Power BI). Rather than requiring teams to assemble modular, best-of-breed tools, Fabric delivers a single unified interface where data integration, reporting, and AI converge naturally.
Core Technologies of Microsoft Fabric
Unified Storage Foundation
• OneLake: Built on Azure Data Lake Storage (ADLS) Gen2, this is the central, logical data lake for the entire organization.
• Delta Lake & Parquet: Fabric standardizes on open data formats. All tabular data is natively stored in Delta Parquet, so any compute engine (Spark, SQL, or BI) can read the same single copy of data without format conversions or duplication.
• Shortcuts & Mirroring: Shortcuts act as zero-copy virtual pointers, letting Fabric query data in external stores (AWS S3, Google Cloud, or Snowflake) as if it were local to OneLake. Mirroring provides continuous, near-real-time replication from operational databases (Azure SQL, Cosmos DB, PostgreSQL) directly into OneLake in an analytics-ready Parquet format.
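To make the zero-copy pointer idea concrete, here is a purely conceptual sketch; the class, method names, and paths below are invented for illustration and are not Fabric's actual API. A shortcut is essentially a mapping from a OneLake path to an external location, resolved at read time rather than by copying data.

```python
# Conceptual sketch only: Fabric resolves shortcuts inside OneLake itself;
# this toy resolver just illustrates the idea of a virtual, zero-copy pointer.
class ShortcutResolver:
    def __init__(self):
        self._shortcuts = {}  # OneLake path prefix -> external location

    def create_shortcut(self, onelake_path, external_uri):
        """Register a virtual pointer; no data is moved or duplicated."""
        self._shortcuts[onelake_path] = external_uri

    def resolve(self, path):
        """At query time, a shortcut path resolves to the external store."""
        for prefix, target in self._shortcuts.items():
            if path.startswith(prefix):
                return path.replace(prefix, target, 1)
        return path  # ordinary OneLake data: already local

resolver = ShortcutResolver()
resolver.create_shortcut("/lakehouse/Tables/clickstream",
                         "s3://external-bucket/clickstream")
print(resolver.resolve("/lakehouse/Tables/clickstream/2024/part-0.parquet"))
```

The point of the sketch is that only the pointer lives in OneLake; the bytes stay in the external store until a query actually reads them.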
Data Integration & Orchestration
• Data Factory: Handles data movement and transformation. It combines enterprise-grade Data Pipelines (for orchestration and moving petabytes of data) with Dataflows Gen2 (a low-code, visual interface powered by the Power Query engine for data preparation).
Compute Engines (The Synapse Legacy)
• Synapse Data Engineering: A high-performance Apache Spark environment for massive data transformations and building lakehouses. It supports Python (PySpark), Scala, R, and Spark SQL.
• Synapse Data Warehouse: A fully serverless, distributed T-SQL query engine that decouples compute from storage and runs relational SQL queries directly against the open Delta files in OneLake; no proprietary database loading is required.
• Synapse Data Science: An environment dedicated to machine learning. It integrates natively with Azure Machine Learning and uses MLflow for experiment tracking and model registry, enabling data scientists to train and deploy models directly on OneLake data.
• Synapse Real-Time Intelligence: Built on the Kusto Query Language (KQL) engine, this handles high-velocity, high-volume streaming data (IoT telemetry, application logs) via Eventstreams and Real-Time Hubs with sub-second latency.
• Fabric SQL Database: A newly integrated operational database built on the SQL Server engine, adding native transactional (OLTP) capabilities inside the Fabric ecosystem.
Consumption & Action
• Power BI & Direct Lake Mode: Power BI serves as the semantic and visualization layer. Its standout Fabric feature, Direct Lake, lets the Power BI VertiPaq engine read Delta Parquet files straight from OneLake into memory, delivering import-level speed without the latency or cost of actually moving data.
• Data Activator: A no-code event-detection engine that continuously monitors data streams or Power BI reports and automatically triggers actions (Teams messages, emails, Power Automate workflows) when specific thresholds or patterns are met.
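Data Activator itself is no-code, but its core behavior (watch a stream, fire an action when a condition holds) can be sketched in a few lines. The sensor readings, threshold, and alert handler below are invented for illustration only:

```python
def watch_stream(values, threshold, on_breach):
    """Fire on_breach once each time readings cross above the threshold."""
    above = False
    for v in values:
        if v > threshold and not above:
            on_breach(v)  # in Fabric this would be a Teams message, email, etc.
        above = v > threshold

alerts = []
# Hypothetical sensor readings; 75 is an invented alerting threshold.
watch_stream([70, 72, 78, 74, 80, 81, 73], threshold=75, on_breach=alerts.append)
print(alerts)  # one alert per breach episode, not per reading
```

Note the edge-triggered design: the handler fires when the value first crosses the threshold, not on every reading above it, which is the usual way to avoid alert storms.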
Universal Connective Tissue
• OneSecurity & Microsoft Purview: Provide unified, enterprise-grade governance. Role-based access control (RBAC), row-level security, and sensitivity labels are defined once and enforced everywhere: across Spark, SQL, and Power BI.
• Copilot (Generative AI): Powered by Azure OpenAI and embedded in every workload, Copilot translates natural language into SQL, generates PySpark code in notebooks, builds data pipelines, and creates DAX measures for Power BI reports.
Strategic Positioning: Challenging the Titans
Microsoft Fabric introduces a highly integrated "walled garden" approach, specifically targeting organizations already invested in Azure and Microsoft 365.
Competitor | Core Strength | Fabric's Challenge
Databricks | Optimized for complex ML, AI, and heavy engineering via an open lakehouse. | Fabric offers a unified interface where engineering and reporting converge naturally.
Snowflake | Famous for elastic SQL concurrency and multi-cloud flexibility. | Fabric provides a "single-pane-of-glass" experience that removes the need to assemble modular tools.
Finally, Fabric changes the economic model. Instead of paying for individual services or compute clusters, organizations purchase a shared pool of Capacity Units that power everything, from data pipelines to Power BI reporting. This unified capacity model dramatically reduces the operational overhead of managing isolated data stacks.
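As a back-of-the-envelope illustration of the shared-pool idea, every workload draws from one pool of Capacity Units rather than from separately billed services. The SKU size and per-workload consumption rates below are invented for illustration, not Microsoft's actual pricing:

```python
# Invented numbers for illustration; real Fabric SKUs and workload
# consumption rates differ and should be taken from Microsoft's docs.
capacity_units = 64  # one shared pool, e.g. a hypothetical 64-CU capacity

workload_draw = {    # concurrent CU draw per workload (hypothetical)
    "spark_etl": 24,
    "warehouse_sql": 16,
    "power_bi": 12,
    "realtime_kql": 8,
}

total_draw = sum(workload_draw.values())
print(f"pool: {capacity_units} CU, drawing: {total_draw} CU, "
      f"headroom: {capacity_units - total_draw} CU")
```

The operational consequence is that capacity planning becomes one decision (pool size) instead of a sizing exercise per service, which is the overhead reduction the paragraph above describes.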
Over the next couple of newsletters I will dig a bit deeper into Microsoft Fabric; I need to learn the technology for my own benefit, and I will share what I find with you.
Gladstone Benjamin

