When it comes to designing a data warehouse for your business, the two most commonly discussed methods are the approaches introduced by Bill Inmon and Ralph Kimball. Debates on which one is better and more effective have been on for years. But a clear cut answer has never been arrived upon, as both philosophies have their own advantages and differentiating factors, and enterprises continue to use either of these.
To begin with, let us have a quick look at both the approaches.
In a nutshell
Bill Inmon’s enterprise data warehouse approach (the top-down design): A normalized data model is designed first. Then the dimensional data marts, which contain data required for specific business processes or specific departments are created from the data warehouse.
Ralph Kimball’s dimensional design approach (the bottom-up design): The data marts facilitating reports and analysis are created first; these are then combined together to create a broad data warehouse.
Inmon’s top-down approach
Inmon defines data warehouse as a centralized repository for the entire enterprise. Data warehouse stores the ‘atomic’ data at the lowest level of detail. Dimensional data marts are created only after the complete data warehouse has been created. Thus, data warehouse is at the center of the Corporate Information Factory (CIF), which provides a logical framework for delivering business intelligence.
Inmon defines the data warehouse in the following terms:
- Subject-oriented: The data in the data warehouse is organized so that all the data elements relating to the same real-world event or object are linked together
- Time-variant: The changes to the data in the database are tracked and recorded so that reports can be produced showing changes over time
- Non-volatile: Data in the data warehouse is never over-written or deleted -- once committed, the data is static, read-only, and retained for future reporting
- Integrated: The database contains data from most or all of an organization's operational applications, and that this data is made consistent
Kimball’s bottom-up approach
Keeping in mind the most important business aspects or departments, data marts are created first. These provide a thin view into the organizational data, and as and when required these can be combined into a larger data warehouse. Kimball defines data warehouse as “A copy of transaction data specifically structured for query and analysis”.
Kimball’s data warehousing architecture is also known as Data Warehouse Bus (BUS). Dimensional modeling focuses on ease of end user accessibility and provides a high level of performance to the data warehouse.
Inmon vs. Kimball: Similar or different?
"You can catch all the minnows in the ocean and stack them together and they still do not make a whale." ~Inmon
“The data warehouse is nothing more than the union of all the data marts" ~Kimball
Pros and cons of both the approaches
How to decide?
As we have already seen, the approach to designing a data warehouse depends on the business objectives of an organization, nature of business, time and cost involved, and the level of dependencies between various functions. Inmon’s approach is suitable for stable businesses which can afford the time taken for design and the cost involved. Also, with every changing business condition, they do not change the design; instead they accommodate these into the existing model. However, if local optimization is good enough and the focus is on quick win, it is advisable to go for Kimball’s approach. Keeping this in mind let Inmon vs. Kimball fight happen over a few sectors / functions.
- Insurance: It is vital to get the overall picture with respect to individual clients, groups, history of claims, mortality rate tendencies, demography, profitability of each plan and agents, etc. All aspects are inter-related and therefore suited for the Inmon’s approach.
- Marketing: This is a specialized division, which does not call for enterprise warehouse. Only data marts are required. Hence, Kimball’s approach is suitable.
- CRM in banks: The focus is on parameters such as products sold, up-sell and cross-sell at a customer-level. It is not necessary to get an overall picture of the business. For example, there is no need to link a customer’s details to the treasury department dealing with forex transactions and regulations. Since the scope is limited, you can go for Kimball’s method. However, if the entire processes and divisions in the bank are to be linked, the obvious choice isInmon’s design vs. Kimball’s.
- Manufacturing: Multiple functions are involved here, irrespective of the budget involved. Thus, where there is a systemic dependency as in this case, an enterprise model is required. Hence Inmon’s method is ideal.
While designing a data warehouse, first you have to look at your business objectives – short term and long term. See where the functional links are and what stands alone. Analyze data sources for quantity and quality. Finally, evaluate your resource level, time frame and wallet. This helps you to arrive at which method to adopt Inmon’s or Kimball’s or a combination of both.
About the author: Sansu George is a business analyst at ABIBA Systems, a specialist telecommunication business intelligence and analytics firm based in Bangalore. Currently she works on solutions pertaining to enterprise performance analysis, customer segmentation, campaign management and churn prediction, specifically for telecom operators.
This was first published in April 2012