The concept and functionality of enterprise data warehouses have evolved over the years, and their usage and features have changed drastically. Data warehouse appliances now have built-in analytical capabilities. A decade ago, an enterprise data warehouse was used only as a centralized repository for information management. At that time, for instance, a bank's client was just an individual party, whereas today the client could be viewed as the entire household, with the household's worth as a whole being considered for analysis. Thus the very way in which data is perceived has changed. Integration of data takes place in new ways, and master data management enables the establishment of a central source of data, with business intelligence systems, analytical tools and dashboards sitting over this enterprise data warehouse platform.
Problem of plenty
With the advent of unstructured data from social media applications, the scope of data to be analyzed has grown. Earlier, enterprise data warehouse management pertained mainly to internal data and some external data gathered through market research. The data stored in the enterprise data warehouse from ERP applications and other systems is mostly structured in nature. Now, unstructured data also has to be captured and analyzed. CRM applications, along with social media, add to the gamut of data, with telephonic and email data entering the picture.
Users have become more demanding. Earlier, the user would generate reports from the data in the warehouse and try to derive strategic insights from them. Now, users expect analytical insights from the reports themselves, and this has added a stress point to data warehouse management. The data now has to be accurate and up to date to meet these needs and inspire trust. There is a balancing act between the speed of recovery and the stability of the data in the report. The enterprise data warehouse has to stand strong on both these counts.
An enterprise data warehouse is not a product but a process that has to be constantly tailored and customized as needed. To ensure data quality, it is not enough to rely on cleansing accumulated data after the fact. Rather, data quality management in the enterprise data warehouse is an ongoing process. While there are products that can work with data quality, these are of little help without a data quality strategy in place. Data quality tools on their own cannot prevent bad data from coming into the system. The reasons for problems could be varied: something could go wrong with the source at the time of capture; an incorrect transformation rule may have been set during the data integration process; or the fault may be in the migration process. It is thus essential to address problems across the entire data warehouse management process.
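One way to picture data quality as an ongoing process rather than an after-the-fact cleanup is to screen records at the point where they enter the warehouse. The Python sketch below is only an illustration under stated assumptions: the field names (`customer_id`, `balance`), the two rules and the `screen` function are all hypothetical, and a real data quality strategy would define its own rules for each stage (capture, transformation, migration).

```python
# Hypothetical rule builders: each returns a function that yields an
# error message for a bad record, or None if the record passes.
def require(field_name):
    def rule(record):
        value = record.get(field_name)
        if value is None or value == "":
            return f"missing {field_name}"
        return None
    return rule


def non_negative(field_name):
    def rule(record):
        value = record.get(field_name)
        if isinstance(value, (int, float)) and value < 0:
            return f"negative {field_name}"
        return None
    return rule


def screen(records, rules):
    """Split incoming records into accepted and quarantined lists.

    Quarantined entries keep the list of failed rules, so the cause
    can be traced back to the source or the transformation step.
    """
    accepted, quarantined = [], []
    for record in records:
        errors = [msg for rule in rules if (msg := rule(record)) is not None]
        if errors:
            quarantined.append((record, errors))
        else:
            accepted.append(record)
    return accepted, quarantined


# Assumed rule set and sample data, for illustration only.
RULES = [require("customer_id"), non_negative("balance")]

incoming = [
    {"customer_id": "C1", "balance": 120.0},
    {"customer_id": "", "balance": 50.0},
    {"customer_id": "C3", "balance": -10.0},
]

accepted, quarantined = screen(incoming, RULES)
for record, errors in quarantined:
    print(record, errors)
```

Keeping the rejected records together with their failure reasons, instead of silently dropping them, is a design choice that makes it easier to tell whether a problem originated at the source, in a transformation rule or during migration.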
Three reasons for change management are:
- Technology revamp
- New business initiatives
- New data types
Consider, for instance, an M&A deal. The change management for the data warehouse in such a scenario is complex, and the revamp cannot be accomplished overnight. The architecture that you define for your enterprise data warehouse has to be evolutionary, with interlinked but loosely coupled components rather than tightly coupled ones.
The warehouse model has to be scalable in terms of hardware as well as platform, in order to incorporate changes, and provide for future requirements such as mobility, social media and other emerging trends. The architecture must be flexible enough to handle data changes as well.
When it comes to changing the data model, it is more feasible to look at the enterprise data warehouse from a holistic perspective. Changing the data model on a whim could mess up everything, creating a “garbage in, garbage out” scenario. One could look at an industry-specific model and modify it to one’s needs. If things go wrong, it is useful to take a step back and assess everything dispassionately, focusing on the objective of the data warehouse management process and establishing the reasons for the change. It could be that the model change was not the correct solution in the first place. A 360-degree audit of the enterprise data warehouse would be useful in such cases.
Impact on users
The enterprise data warehouse can be viewed as three layers: the source layer, the data warehouse (integration) layer and the presentation layer. In terms of impact on users, the changes are invariably seen at the presentation layer. The data in the enterprise data warehouse is held in a format that the reporting systems can understand, so changes to the reporting system will also entail changes to the data in the enterprise data warehouse. The users will have a set of metrics by which they build their data. The underlying metadata layer will not change, but the format and system data therein would change. Users do not generate reports by running complex algorithms themselves but by supplying a few parameters. Thus, the change to the data warehouse will make little difference to them.
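The layering described above can be sketched in a few lines of Python using the standard `sqlite3` module. This is a minimal illustration, not a description of any particular product: the table, view and column names (`dw_sales`, `dw_sales_v2`, `rpt_sales`) and the `sales_report` function are assumptions made up for the example. The point it tries to show is that a user who runs a report by passing a few parameters sees the same result after the warehouse layer is restructured, provided the presentation layer is adjusted to match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Warehouse (integration) layer: hypothetical table and column names.
conn.execute("CREATE TABLE dw_sales (region TEXT, sale_year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO dw_sales VALUES (?, ?, ?)",
    [("North", 2011, 100.0), ("North", 2011, 50.0), ("South", 2011, 70.0)],
)

# Presentation layer: a view exposing the format the reports expect.
conn.execute(
    "CREATE VIEW rpt_sales AS "
    "SELECT region, sale_year AS year, amount FROM dw_sales"
)


def sales_report(connection, region, year):
    """The user's report: a fixed query driven by a few parameters."""
    row = connection.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM rpt_sales "
        "WHERE region = ? AND year = ?",
        (region, year),
    ).fetchone()
    return row[0]


before = sales_report(conn, "North", 2011)

# A warehouse change: columns renamed and amounts stored as integer cents.
conn.execute("DROP VIEW rpt_sales")
conn.execute(
    "CREATE TABLE dw_sales_v2 "
    "(region_name TEXT, fiscal_year INTEGER, amount_cents INTEGER)"
)
conn.execute(
    "INSERT INTO dw_sales_v2 "
    "SELECT region, sale_year, CAST(ROUND(amount * 100) AS INTEGER) FROM dw_sales"
)
conn.execute("DROP TABLE dw_sales")

# Only the presentation layer is redefined; the report code is untouched.
conn.execute(
    "CREATE VIEW rpt_sales AS "
    "SELECT region_name AS region, fiscal_year AS year, "
    "amount_cents / 100.0 AS amount FROM dw_sales_v2"
)

after = sales_report(conn, "North", 2011)
print(before, after)
```

In this sketch the report function never refers to the warehouse tables directly, which is why the restructuring stays invisible to the person running it.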
With data warehouse management and change, guarding against getting ahead of oneself is the biggest challenge. Adopting a technology just because it is new or offers glamorous features is a big mistake. Moving the enterprise data warehouse to the cloud, working with Hadoop, buying data warehouse appliances and other such initiatives need detailed analyses before taking the plunge; they cannot be undertaken merely because that technology is the ‘in-thing’ at the moment.
About the author: Srinivasan Rengarajan is program manager for the data warehousing and business intelligence practice at Collabera Solutions. Prior to this, he worked with iGATE, Teradata and Godrej. He has over fourteen years of experience managing engagements in domains such as financial services, retail, and energy and utilities for acclaimed clientele.
(As told to Sharon D’Souza)
This was first published in December 2011