
Enterprises have shifted from data warehouses => data lakes + data warehouses => data lakehouses over the last decade or so.

A #lakehouse is a modern way for an enterprise to manage its data.

Data lakehouses store data in an open lakehouse storage system that can be queried by any open-source or proprietary engine. They are performant enough for reporting and BI, but also flexible enough for machine learning and advanced analytics.

First-generation #datawarehouses were good for BI/reporting, but were expensive and difficult to scale because they coupled compute and storage. They also weren't well suited to ML or AI analyses.

A #datalake is low-cost, highly scalable, and easily adaptable for ML workflows, but doesn't offer performance that's sufficient for reporting/BI. This forced many companies to run both data lakes and data warehouses in a two-tiered architecture that was costly and tedious to maintain.

A data lakehouse is the architecture of the future because it allows organizations to keep their data in one place. Query times are fast enough for reporting/BI, and the underlying lakehouse storage formats are flexible enough for ML/AI.
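
To make the "one copy of the data, many consumers" idea concrete, here is a toy sketch, not a real lakehouse. It assumes a plain CSV file as a stand-in for an open storage format, and the table, column names, and functions (`write_table`, `revenue_by_region`, `feature_rows`) are all hypothetical. The point is only that a reporting-style query and an ML-style feature extraction read the same stored file, with no second copy in a separate warehouse.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path

# Hypothetical sample data; the column names are illustrative assumptions.
ROWS = [
    {"region": "east", "units": "3", "price": "10.0"},
    {"region": "west", "units": "5", "price": "8.0"},
    {"region": "east", "units": "2", "price": "12.0"},
]


def write_table(path):
    """Store the data once, in an open format that any tool can read."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["region", "units", "price"])
        writer.writeheader()
        writer.writerows(ROWS)


def revenue_by_region(path):
    """Reporting/BI-style consumer: aggregate revenue per region."""
    totals = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["region"]] += int(row["units"]) * float(row["price"])
    return dict(totals)


def feature_rows(path):
    """ML-style consumer: turn the same rows into numeric feature vectors."""
    with open(path, newline="") as f:
        return [
            [float(row["units"]), float(row["price"])]
            for row in csv.DictReader(f)
        ]


with tempfile.TemporaryDirectory() as tmp:
    table = Path(tmp) / "sales.csv"
    write_table(table)                  # single copy of the data
    report = revenue_by_region(table)   # BI reads it
    features = feature_rows(table)      # ML reads the same file

print(report)
print(features)
```

In a two-tiered setup, the reporting query would typically run against a warehouse copy while the feature extraction ran against the lake, and keeping those two copies in sync is the maintenance cost described above.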
