Monday, 17 March 2014

Data Warehouse and Data Mart

Difference between Data Warehouse and Data Mart

Key Difference: A data warehouse is a large central repository of historical data assembled from the different departments and units of a company. A data mart can be considered a subset of a data warehouse, or simply a data repository that is focused on a single functional area. The two differ primarily in scope and area of use.


A data warehouse is a collection of data that is kept separate from the operational systems. It supports the company's decision making. The data is assembled from multiple sources in order to provide accurate and timely information, and it is stored from a historical perspective.

The data in the warehouse is extracted from multiple functional units. It is checked, cleaned and finally integrated into the warehouse. Data warehouses are implemented and controlled by a central organizational unit.
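The extract, check, clean and integrate steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not a real warehouse load: the two source systems, their field names and the cleaning rules (drop rows with a missing amount, drop repeated ids) are all assumptions made up for the example.

```python
# Hypothetical rows extracted from two operational systems.
# All field names and values are illustrative assumptions.
sales_rows = [
    {"order_id": 1, "dept": "Sales", "amount": "120.50", "date": "2014-03-01"},
    {"order_id": 2, "dept": "Sales", "amount": "", "date": "2014-03-02"},        # missing amount
    {"order_id": 1, "dept": "Sales", "amount": "120.50", "date": "2014-03-01"},  # duplicate id
]
finance_rows = [
    {"invoice_id": 10, "dept": "Finance", "total": "75.00", "date": "2014-03-01"},
]


def clean(rows, id_field, amount_field):
    """Check and clean one source: skip rows with a missing amount or a
    repeated id, and map the rest onto one common record layout."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        if not row[amount_field]:
            continue  # check: amount must be present
        if row[id_field] in seen_ids:
            continue  # clean: keep only the first row per id
        seen_ids.add(row[id_field])
        cleaned.append({
            "source_id": row[id_field],
            "dept": row["dept"],
            "amount": float(row[amount_field]),
            "date": row["date"],
        })
    return cleaned


def integrate(*cleaned_sources):
    """Integrate: combine the cleaned sources into a single collection."""
    warehouse = []
    for source in cleaned_sources:
        warehouse.extend(source)
    return warehouse


warehouse = integrate(
    clean(sales_rows, "order_id", "amount"),
    clean(finance_rows, "invoice_id", "total"),
)
for record in warehouse:
    print(record)
```

Mapping every source onto the same record layout before combining them is what lets the warehouse answer questions that span departments.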


A data mart is a subset of a data warehouse. It is subject-oriented and is designed to meet the needs of a specific group of users. Data marts can be designed individually for departments such as Sales or Finance.

Data marts are generally controlled by a single department of an organization, and their data is assembled from only a few sources. Thus, a data mart and a data warehouse differ mainly in their scope and data sources. As a rough rule of thumb, data marts are smaller than 100 GB, whereas a data warehouse is typically larger than 100 GB. Because of its narrower scope, a data mart is comparatively easy to design and use. A data warehouse, by contrast, can be complex to design and difficult to use.
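One way to picture a data mart as a subset of a warehouse is a view that exposes only one department's rows. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name warehouse_facts, the view name sales_mart and the sample figures are hypothetical, and a view is only one possible way to build a data mart, not necessarily how a given organization would do it.

```python
import sqlite3

# In-memory database standing in for the warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_facts (dept TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("Sales", 2013, 100.0),
        ("Sales", 2014, 150.0),
        ("Finance", 2014, 80.0),
        ("HR", 2014, 20.0),
    ],
)

# The hypothetical Sales data mart: a single subject area drawn from
# the multi-department warehouse table.
conn.execute(
    "CREATE VIEW sales_mart AS "
    "SELECT year, amount FROM warehouse_facts WHERE dept = 'Sales'"
)

# A departmental report only has to scan the smaller mart.
sales_by_year = conn.execute(
    "SELECT year, SUM(amount) FROM sales_mart GROUP BY year ORDER BY year"
).fetchall()
print(sales_by_year)

# The warehouse itself still covers every department.
all_depts = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT dept FROM warehouse_facts ORDER BY dept"
    )
]
print(all_depts)
```

The Sales users see only their own subject area, while the warehouse table keeps the corporate-wide picture.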


Comparison between Data Warehouse and Data Mart:


Aspect | Data Warehouse | Data Mart
Definition | A central repository of historical data that removes informational processing load from transaction-oriented databases. | A subset of a data warehouse, generally focused on a single functional area.
Focus | Multiple subject areas | Specific subject area
Control | Central organizational unit | Generally a single department
Scope | Corporate | Line of business
Data sources | Multiple | A few selected
Size | 100 GB to TB+ | < 100 GB
Designing | Comparatively difficult | Easy
Implementation time | Months to years | Months
Decision | Strategic | Tactical

Advantages of a data warehouse:
  • It is accessible across the enterprise
  • Contains historical and current data
  • Can be considered a "single version" of the truth about enterprise activities
  • Removes informational processing load from transaction-oriented databases

Advantages of a data mart:
  • Incremental development
  • Easy understanding of data
  • Simple data design
  • Easy manipulation of data
  • Better reporting performance due to smaller queries
