A data warehouse is a data management system that aggregates large volumes of data from multiple sources into a single repository of highly structured, unified historical data. The centralized data in a warehouse is ready to support business intelligence (BI), data analysis, artificial intelligence (AI), and machine learning, informing decision making and improving organizational performance.
Historically, data warehouses were hosted on-premises, and because data was stored in relational databases, it had to be transformed before loading using the classic Extract, Transform, and Load (ETL) process. But as you'd expect, data warehousing systems continue to evolve along with the surrounding data integration ecosystem.
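To make the ETL pattern concrete, here is a minimal sketch in Python. The source records, schema, and field names are hypothetical, and an in-memory SQLite database stands in for the relational warehouse; the point is only that each record is reshaped to the warehouse schema *before* it is loaded.

```python
import sqlite3

# Hypothetical raw order events extracted from a source system ("E").
raw_events = [
    {"order_id": "1001", "amount": "19.99", "ts": "2024-01-05T10:00:00Z"},
    {"order_id": "1002", "amount": "5.00",  "ts": "2024-01-05T11:30:00Z"},
]

def transform(event):
    # Transform ("T") happens before loading: cast types and
    # derive a date column to match the warehouse schema.
    return (int(event["order_id"]), float(event["amount"]), event["ts"][:10])

# Load ("L") into the relational warehouse (SQLite as a stand-in).
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE orders (order_id INTEGER, amount REAL, order_date TEXT)"
)
warehouse.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [transform(e) for e in raw_events],
)

total = warehouse.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(total)
```

Because the transformation runs outside the warehouse, the database only ever sees clean, typed rows — the defining trait of classic ETL.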
With the rise of modern cloud architectures, larger datasets, and the need to support real-time analytics and machine learning projects, warehouses are now typically hosted in the cloud, and pipelines are shifting from ETL to Extract, Load, and Transform (ELT), streaming, and API-based ingestion. Modern data warehouse automation also allows you to create data models, add new sources, and provision new data marts without writing any SQL code, generating, modeling, and creating the SQL for each of the three zones commonly used to structure a warehouse.
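The ELT shift described above can be sketched the same way. In this hedged example (again with hypothetical data and SQLite standing in for a cloud warehouse), the raw records are loaded as-is into a staging table first, and the transformation is then expressed as SQL that runs *inside* the warehouse.

```python
import sqlite3

wh = sqlite3.connect(":memory:")  # stand-in for a cloud data warehouse

# Extract + Load: land the raw, untyped records in a staging zone as-is.
wh.execute("CREATE TABLE raw_orders (order_id TEXT, amount TEXT, ts TEXT)")
wh.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [
        ("1001", "19.99", "2024-01-05T10:00:00Z"),
        ("1002", "5.00",  "2024-01-05T11:30:00Z"),
    ],
)

# Transform: the "T" now runs inside the warehouse, as SQL over staged data.
wh.execute("""
    CREATE TABLE orders AS
    SELECT CAST(order_id AS INTEGER) AS order_id,
           CAST(amount   AS REAL)    AS amount,
           substr(ts, 1, 10)         AS order_date
    FROM raw_orders
""")

total = wh.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(total)
```

Keeping the raw staged data means transformations can be re-run or revised later without re-extracting from the source, which is one reason ELT pairs naturally with cheap, scalable cloud storage.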