Wednesday, March 5, 2014

Data Warehouse Definition - Overview

Hello. In this post I want to add my series about data warehouse. I think this post should be the first post because I will share about data warehouse in common and overview. In this post I also combine some definition from famous resources. Please enjoy. :)

Data Warehouse Definition

In computing, a data warehouse (DW, DWH), or an enterprise data warehouse (EDW), is a database used for reporting and data analysis. Integrating data from one or more disparate sources creates a central repository of data, a data warehouse (DW). Data warehouses store current and historical data and are used for creating trending reports for senior management reporting such as annual and quarterly comparisons.

The data stored in the warehouse is uploaded from the operational systems (such as marketing, sales, etc., shown in the figure to the right). The data may pass through an operational data store for additional operations before it is used in the DW for reporting.

Data warehousing incorporates data stores and conceptual, logical, and physical models to support business goals and end-user information needs. A data warehouse (DW) is the foundation for a successful BI program.

Creating a DW requires mapping data between sources and targets, then capturing the details of the transformation in a metadata repository. The data warehouse provides a single, comprehensive source of current and historical information.

A data warehouse is a centralized repository that stores data from multiple information sources and transforms them into a common, multidimensional data model for efficient querying and analysis.

***By Bill Inmon (American Computer Scientist, recognized by many as the father of the data warehouse)
Source : Wikipedia

A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process.

Subject-Oriented : A data warehouse can be used to analyze a particular subject area. For example, "sales" can be a particular subject.

Integrated : A data warehouse integrates data from multiple data sources. For example, source A and source B may have different ways of identifying a product, but in a data warehouse, there will be only a single way of identifying a product.

Time-Variant : Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, 12 months, or even older data from a data warehouse. This contrasts with a transactions system, where often only the most recent data is kept. For example, a transaction system may hold the most recent address of a customer, where a data warehouse can hold all addresses associated with a customer.

Non-Volatile : Once data is in the data warehouse, it will not change. So, historical data in a data warehouse should never be altered. 


***By Ralph Kimball (Author on the subject of data warehousing and business intelligence, also known by his methodology : dimensional modeling or the Kimball methodology
Source : Wikipedia

A data warehouse is a copy of transaction data specifically structured for query and analysis.

This is a functional view of a data warehouse. Kimball did not address how the data warehouse is built like Inmon did; rather he focused on the functionality of a data warehouse.

A database, typically very large, containing the historical data of an enterprise. Used for decision support or business intelligence, it organizes data and allows coordinated updates and loads.

Previous Related Posts

No comments:

Post a Comment

Share Your Inspiration...