AWS Dominance Risks Global Cloud Disruptions☁️|MongoDB's surge🚀|MySQL 8 Support Ends⚠️|New Data Formats On the Horizon
Get familiar with these new data formats (Lance, Nimble, and Vortex)

What’s in today’s newsletter:
AWS Dominance Risks Global Cloud Disruptions☁️
MongoDB's surge driven by new strategic plans🚀
MySQL 8 Support Ends, Upgrade Now⚠️
Solidatus nets $5M to boost AI data lineage tech🔄
BigQuery syncs data from S3 with custom options🛠️
Also, check out the weekly Deep Dive: New Data Formats On the Horizon (Lance, Nimble, and Vortex). Read my blog post here.
AWS

TL;DR: Heavy reliance on AWS creates systemic risks, causing widespread disruptions during outages; businesses should adopt multi-cloud strategies, and policymakers must address cloud market consolidation to ensure infrastructure resilience.
AWS outages cause widespread disruptions due to many companies’ heavy dependence on its cloud infrastructure.
The concentrated reliance on AWS creates systemic vulnerabilities and cascading failures during service interruptions.
Businesses are urged to adopt multi-cloud or hybrid strategies to reduce risks from single-provider outages.
Cloud market consolidation raises concerns about digital infrastructure resilience and requires regulatory and industry attention.
Why this matters: Heavy dependence on AWS exposes businesses worldwide to cascading failures during outages, risking significant operational disruption. Diversifying cloud strategies is crucial for resilience, while the industry's consolidation demands policy and competition measures to protect critical digital infrastructure from systemic vulnerabilities.
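To make the multi-cloud idea concrete, here is a minimal sketch of provider failover. It assumes each provider is wrapped in a plain Python callable; the provider names and the `fetch_with_failover` helper are hypothetical and do not correspond to any AWS or other vendor API.

```python
from typing import Callable, List, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when every configured provider fails."""


def fetch_with_failover(
    providers: Sequence[Tuple[str, Callable[[], bytes]]]
) -> Tuple[str, bytes]:
    """Try each provider in order and return the first successful result."""
    errors: List[str] = []
    for name, fetch in providers:
        try:
            return name, fetch()
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for two independent cloud providers.
def primary_down() -> bytes:
    raise ConnectionError("simulated regional outage")


def secondary_ok() -> bytes:
    return b"order-history"


used, payload = fetch_with_failover(
    [("primary-cloud", primary_down), ("secondary-cloud", secondary_ok)]
)
print(used, payload)
```

The ordering of the list is the whole policy here: the secondary is only touched when the primary raises. A production setup would also need health checks, timeouts, and data replication between providers, none of which this sketch covers.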
NOSQL

TL;DR: MongoDB’s stock surge stems from strategic partnerships and product innovations in real-time processing and multi-cloud deployment, boosting investor confidence and strengthening its competitive position in the cloud database market.
MongoDB’s stock surged due to strategic partnerships enhancing its cloud database market presence.
Collaborations integrate MongoDB’s scalable solutions with major cloud platforms, improving enterprise value.
Product innovations include real-time data processing and multi-cloud deployment to meet evolving customer needs.
Investor optimism reflects confidence in MongoDB’s growth and competitive positioning in the cloud-native sector.
Why this matters: MongoDB’s surge signals strong investor belief in its strategic moves to dominate cloud databases. Its partnerships and innovations address key market demands, enhancing enterprise appeal and competitive edge. Sustained growth here could reshape cloud data management, benefiting shareholders and driving broader industry advancements.
RELATIONAL DATABASE

TL;DR: MySQL 8 support ends in 2027, prompting users to migrate or upgrade to ensure security, compatibility, and operational stability. Oracle advises proactive testing and planning to avoid risks and extra costs.
MySQL 8 support and security updates will end in 2027, affecting developers and enterprises globally.
Oracle urges users to plan migrations, test new versions, and check application compatibility proactively.
Businesses face security risks and lack of troubleshooting if they continue using unsupported MySQL 8 software.
Transitioning to newer MySQL versions or alternative platforms is essential to avoid operational and maintenance issues.
Why this matters: The end of MySQL 8 support in 2027 demands urgent action from organizations to mitigate security risks and operational disruptions. Proactive migration and compatibility testing are essential to ensure continued reliability, avoid costly downtime, and stay competitive in evolving database technology landscapes.
DATA LINEAGE
TL;DR: Solidatus raised $5 million to enhance its AI-powered data lineage platform, automating data flow mapping to improve governance, compliance, and analytics, expanding development and market reach.
Solidatus raised $5 million to advance its AI-powered data lineage platform for better data governance.
The platform automates data flow mapping, reducing manual errors and improving real-time data analysis.
Funding will accelerate development, expand market reach, and support innovations in AI-driven data tracking.
AI-enhanced data lineage aids compliance with privacy regulations and optimizes organizational analytics initiatives.
Why this matters: Solidatus’s $5M funding accelerates AI-driven data lineage, critical for accurate, compliant data governance amid growing regulatory pressures. Automating data flow mapping reduces errors and boosts real-time insights, empowering organizations to optimize analytics and AI initiatives while mitigating compliance risks in complex data environments.
GCP

TL;DR: The article walks through configuring and automating data transfers from Amazon S3 to Google BigQuery, enhancing cross-cloud interoperability and enabling businesses to leverage scalable multi-cloud analytics effectively.
The article guides users on transferring data from Amazon S3 to Google BigQuery using Google's built-in transfer feature.
It explains configuring transfer parameters like bucket name, data path, file format, partitions, and authentication methods.
Users learn how to schedule transfers and manage incremental loads or full data refreshes for automation.
This functionality improves cross-cloud interoperability, simplifying data pipelines and enhancing multi-cloud analytics capabilities.
Why this matters: Simplified, automated transfers from Amazon S3 to BigQuery break down cloud silos, enabling organizations to unify analytics across platforms. This boosts data accessibility and scalability, empowering better business insights and strategic decisions while easing multi-cloud data management complexities.
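The difference between an incremental load and a full refresh can be sketched in a few lines. This is only an illustration of the selection logic, not the BigQuery transfer feature itself; `SourceFile`, `select_files`, and the object keys are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class SourceFile:
    key: str                 # object path inside the bucket
    last_modified: datetime  # timestamp reported by the object store


def select_files(
    files: Iterable[SourceFile],
    last_run: Optional[datetime],
    full_refresh: bool = False,
) -> List[str]:
    """Return the keys to load for this run.

    A full refresh (or a first run with no previous timestamp) takes
    every file. An incremental run takes only files modified after
    the previous run.
    """
    if full_refresh or last_run is None:
        return sorted(f.key for f in files)
    return sorted(f.key for f in files if f.last_modified > last_run)


files = [
    SourceFile("sales/2025-11-01.parquet", datetime(2025, 11, 1, tzinfo=timezone.utc)),
    SourceFile("sales/2025-11-02.parquet", datetime(2025, 11, 2, tzinfo=timezone.utc)),
    SourceFile("sales/2025-11-03.parquet", datetime(2025, 11, 3, tzinfo=timezone.utc)),
]
last_run = datetime(2025, 11, 2, tzinfo=timezone.utc)

incremental = select_files(files, last_run)
full = select_files(files, last_run, full_refresh=True)
print(incremental)
print(full)
```

With a previous run on November 2, the incremental call picks up only the November 3 file, while the full refresh returns all three.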

EVERYTHING ELSE IN CLOUD DATABASES
RavenDB Debuts AI Agent Creator for Easy Integration
Cloudera Leads 2025 Data Management Report
Snowflake unveils new AI-powered data tools
Graph RAG vs SQL RAG: Key AI Retrieval Methods Compared
PostgreSQL Meets Lakehouse: Unified Data Power
CockroachDB maps geography to boost database power
Celonis, Databricks partner to boost AI adoption
EnterpriseDB Debuts AI Data Horizons Podcast
Elastic unveils DiskBBQ vector storage in Elasticsearch 9.2
Denodo Tops Dresner Data Architecture Report Again

DEEP DIVE
New Data Formats On the Horizon (Lance, Nimble, and Vortex)
I like Dremio. Their database engine product is VERY cool and VERY fast. I know a few of the folks that work there, and attended an event they hosted when they came to town a while ago. If you are in the tri-state area, they are having their Subsurface event this week in NYC on November 13.
What I like about Dremio, on top of their lightning-fast database engine, is the amount of good content they produce and the fact that they contribute to and support the Apache Iceberg project.
Some of the biggest companies on the planet contribute alongside Dremio, such as Cloudera, IOMETE, Oracle, Snowflake, Starburst, Tabular, AWS, Google Cloud, and even Databricks, who have their own competing format, Delta Lake. There was, and still is, the notion of an Open Table Format War.
I received an email from Dremio earlier this week about Lance, Nimble, and Vortex. I honestly had not heard of these formats, but I am always open to learning about new tech (I created a summary research report here). The email included an article about these newer data formats.
Here is my synopsis of the article and the new data formats:
Traditional file formats like Parquet, which were designed for large-scale batch analytics, are being challenged by the demands of modern AI and machine learning workloads. These new workloads require low-latency random access, efficient handling of high-dimensional vector data, and optimization for modern hardware like NVMe drives and GPUs.
This shift has led to the emergence of new, specialized file formats:
Lance: A format built specifically for AI and vector workloads. It excels at high-performance random access and natively supports vector indexing, making it ideal for retrieval-augmented generation (RAG) and vector search.
Nimble: A format that prioritizes fast decoding speed for ML training datasets, especially those with tens of thousands of columns. By using simpler encodings, it can deliver 2-3x faster decoding speeds than Parquet, reducing I/O bottlenecks during model training.
Vortex: An experimental format aimed at real-time analytics on rapidly changing data. It focuses on improving update mechanics and making streaming data queryable almost immediately upon arrival.
The rise of these new formats creates a challenge for table formats like Apache Iceberg, which must manage the data at a higher level. To address this, a new File Format API proposal for Iceberg is under review.
This proposed API aims to create a unified, pluggable interface for file formats. Instead of hardcoding support for each format, the new API would allow Iceberg to easily integrate with new and existing formats (like Lance or Vortex) through a common contract.
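To picture what a "common contract" for file formats means, here is a minimal sketch of a pluggable format registry. It is not the proposed Iceberg API: `FileFormat`, `FormatRegistry`, and both example formats are hypothetical, and JSON-based stand-ins are used in place of real Lance or Vortex readers.

```python
import json
from abc import ABC, abstractmethod
from typing import Dict, List


class FileFormat(ABC):
    """The common contract every pluggable format must satisfy."""

    name: str

    @abstractmethod
    def write(self, rows: List[dict]) -> bytes: ...

    @abstractmethod
    def read(self, data: bytes) -> List[dict]: ...


class JsonLinesFormat(FileFormat):
    name = "jsonl"

    def write(self, rows: List[dict]) -> bytes:
        return "\n".join(json.dumps(r, sort_keys=True) for r in rows).encode("utf-8")

    def read(self, data: bytes) -> List[dict]:
        return [json.loads(line) for line in data.decode("utf-8").splitlines() if line]


class JsonArrayFormat(FileFormat):
    name = "json-array"

    def write(self, rows: List[dict]) -> bytes:
        return json.dumps(rows, sort_keys=True).encode("utf-8")

    def read(self, data: bytes) -> List[dict]:
        return json.loads(data.decode("utf-8"))


class FormatRegistry:
    """The table layer looks formats up by name instead of hardcoding them."""

    def __init__(self) -> None:
        self._formats: Dict[str, FileFormat] = {}

    def register(self, fmt: FileFormat) -> None:
        self._formats[fmt.name] = fmt

    def get(self, name: str) -> FileFormat:
        if name not in self._formats:
            raise KeyError(f"no file format registered under {name!r}")
        return self._formats[name]


registry = FormatRegistry()
registry.register(JsonLinesFormat())
registry.register(JsonArrayFormat())

rows = [{"id": 1, "city": "NYC"}, {"id": 2, "city": "Toronto"}]
for fmt_name in ("jsonl", "json-array"):
    fmt = registry.get(fmt_name)
    assert fmt.read(fmt.write(rows)) == rows  # round trip through the contract
```

The calling code never mentions a concrete format class, so adding a new format means registering one more implementation rather than changing the table layer.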
That’s it. Remember to check out the Dremio blog to find out A LOT more about their database engine and their writings about data formats.
Gladstone Benjamin

