6/10
Industry
5 May 2026, 06:02 UTC
SAP announces acquisition of data lakehouse vendor Dremio to bolster enterprise data capabilities.
SAP's acquisition of Dremio signals a major push to bring federated querying, which reads data where it already lives instead of moving it, directly into the SAP ecosystem. For data engineers, this means SAP Datasphere will likely get native Apache Iceberg support and a much stronger distributed SQL engine, reducing ETL overhead for hybrid architectures. The deal would also consolidate the semantic layer, making it easier to expose SAP ERP data to external AI and analytics tools.
What happened
SAP has announced its intent to acquire Dremio, a company known for its "agentic" data lakehouse platform, for an undisclosed amount. This move aims to fold Dremio's distributed SQL query engine and data federation capabilities directly into SAP's enterprise software portfolio, most notably SAP Datasphere.
Technical details
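To make the federation idea concrete, here is a toy sketch in Python. It is an analogy only: it uses SQLite's `ATTACH DATABASE` to stand in for two separate stores and does not use any Dremio or SAP API. The table names (`orders`, `web_events`), the `lake` alias, and the sample rows are all hypothetical. What it shows is the basic shape of a federated query: one SQL statement joins data held in two independent databases, with no copy step in between.

```python
import sqlite3


def build_sources() -> sqlite3.Connection:
    """Create two independent in-memory stores reachable from one connection.

    'main' plays the role of an ERP-side store and 'lake' an external data lake.
    All names and rows are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    # ATTACH adds a second, separate database under the schema name 'lake'.
    conn.execute("ATTACH DATABASE ':memory:' AS lake")
    conn.execute(
        "CREATE TABLE main.orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.execute("CREATE TABLE lake.web_events (customer TEXT, page_views INTEGER)")
    conn.executemany(
        "INSERT INTO main.orders VALUES (?, ?, ?)",
        [(1, "acme", 120.0), (2, "acme", 80.0), (3, "globex", 50.0)],
    )
    conn.executemany(
        "INSERT INTO lake.web_events VALUES (?, ?)",
        [("acme", 40), ("globex", 15)],
    )
    conn.commit()
    return conn


def federated_revenue_and_views(conn: sqlite3.Connection) -> list:
    """Join across both stores in a single statement, without copying rows first."""
    sql = """
        SELECT o.customer, SUM(o.amount) AS revenue, e.page_views
        FROM main.orders AS o
        JOIN lake.web_events AS e ON e.customer = o.customer
        GROUP BY o.customer, e.page_views
        ORDER BY o.customer
    """
    return conn.execute(sql).fetchall()


conn = build_sources()
rows = federated_revenue_and_views(conn)
for customer, revenue, page_views in rows:
    print(customer, revenue, page_views)
```

A production federation engine additionally has to plan the query across networked sources, push filters down to each one, and reconcile differing formats, none of which this sketch attempts.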
Dremio's architecture is built heavily on Apache Arrow (for in-memory columnar processing) and Apache Iceberg (an open table format). It allows organizations to query data where it resides, whether in AWS S3, Azure Data Lake, or on-premises environments, without requiring heavy ETL pipelines to move the data into a centralized warehouse. By acquiring Dremio, SAP is securing a semantic layer and a high-performance query engine that can federate queries across diverse data stores. The recent "agentic" branding points to Dremio's push into AI, allowing autonomous agents to query and retrieve context from the lakehouse via text-to-SQL and semantic search integrations.
Why it matters
For data engineers and architects working within or alongside the SAP ecosystem, this is a significant architectural shift. SAP has historically been a walled garden, though SAP Datasphere began opening the doors. Integrating Dremio suggests SAP is committing to the open data lakehouse paradigm. Engineers can expect reduced reliance on complex data replication between SAP ERP systems and external data lakes. If Apache Iceberg is integrated natively into SAP's stack, it should standardize how enterprise data is stored and accessed, simplifying the semantic layer and making SAP data more readily available to GenAI applications and external compute engines.
What to watch next
Keep an eye on the product roadmap for SAP Datasphere to see how quickly Dremio's query engine is integrated. Also, watch for how SAP handles Dremio's existing non-SAP customer base and open-source contributions, particularly around Apache Arrow. Finally, observe if this triggers a response from competitors like Snowflake or Databricks regarding tighter integrations with legacy ERP systems.
Sources
sap
dremio
data-lakehouse
apache-iceberg
data-engineering