Over the past year, master data management has become a topic of interest to CIOs seeking to rationalize their enterprise information architectures. What is master data management? And what role can DB2 play in making master data management a success?
Master Data Management
In most organizations, time has inevitably created (and will continue to create) archipelagoes of data: duplicated data with different formats, managers, and owners, including vital data on customers, suppliers, and the organization itself. The lack of coordination and visibility across these islands not only takes the company further from being an on-demand organization, but also seizes up its business processes as surely as lack of lubrication causes a car's gears to grind to a halt. MDM gives everyone in the enterprise a "common view" of key enterprise data such as customer and supplier records, and thus avoids this "seizing up."
More specifically, Infostructure Associates defines MDM as a solution including infrastructure software, business-process support, and applications that supports an organization in identifying, managing, providing a consistent view of, and leveraging key data across disparate environments. Outsourced, globalized and merged enterprises are finding that some efforts such as MDM are vital to initiatives such as information on demand or the real-time enterprise:
- By focusing on key information, MDM ensures that the most important data is available most rapidly.
- MDM creates an enterprise-wide metadata repository that allows rapid querying/reporting of business-critical information, serves as a springboard for new applications such as business compliance, and supports a virtual operational data store.
Moreover, MDM has side effects such as improved data quality (because disparate local views of customer data are checked against a corporate standard) and more rapid communication between central corporate and local business-unit entities (because a change to one data instance such as the release of corporate financials is instantly and semi-automatically propagated to all data instances in all business units).
MDM differs from previous attempts to coordinate all data in the enterprise, such as data warehouses and enterprise resource planning (ERP) software, in three crucial characteristics:
- MDM focuses on all key enterprise data, and only key enterprise data.
- MDM's infrastructure (MDI) takes the attitude of leaving existing applications, data stores, and databases in place, and superimposing an infrastructure that leaves these existing resources the maximum of future flexibility.
- At the same time, MDM requires that master data handling be "real time" (and changes be synchronized with real-time changes in existing data stores) so that decisions can be made just as quickly as if all key data were in one gigantic database.
Implementing MDM: The key role of master data integration
Figure 1 shows MDM's components, and how they cooperate to carry out a transaction.
Note that implementing MDM on top of an existing enterprise architecture is not easy. Often, information about key enterprise data is hard to find or non-existent; even key enterprise data is often of poor quality; and accessing key data requires not only converting widely differing data to a common format (and back) but also meshing with existing applications and other methods of data access.
Source: IBM Corp. and Infostructure Associates LLC, December 2005
One key to overcoming these obstacles is to use Master Data Integration infrastructure software as the core of MDM. Master Data Integration (MDI) is the combination of enterprise information integration (EII), enterprise application integration (EAI), extract-transform-load (ETL), and other middleware to support MDM. MDI is critical to effective MDM:
- It integrates MDM with the rest of an enterprise's architecture.
- Its flexibility allows MDM to support not only existing but also future applications.
- It ensures a comprehensive, updatable view of master data.
- Its data quality mechanisms cleanse and rationalize key data to improve data quality "where it counts," increasing organizational effectiveness.
And beyond all of these useful attributes, MDI serves as the "lubricant" of key enterprise data: automating access and management across data copies, aiding in identifying and incorporating new types of data as they arrive, and embedding data "best practices" in the organization's culture.
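To make the "convert to a common format and cleanse" step concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two local systems (`crm`, `billing`), their field names, the common-format fields, and the normalization rules are hypothetical, and real MDI products apply far richer matching and cleansing logic.

```python
from typing import Callable, Dict

# Hypothetical field mappings: local field name -> common-format field name.
FIELD_MAP: Dict[str, Dict[str, str]] = {
    "crm": {"cust_nm": "name", "ctry": "country"},
    "billing": {"CustomerName": "name", "Country": "country"},
}


def _clean_name(value: str) -> str:
    # Collapse repeated whitespace and apply crude title casing.
    return " ".join(value.split()).title()


def _clean_country(value: str) -> str:
    # Map a few assumed aliases to one code; otherwise just upper-case.
    aliases = {"usa": "US", "u.s.": "US", "united states": "US"}
    key = value.strip().lower()
    return aliases.get(key, value.strip().upper())


# Hypothetical "corporate standard": one normalizer per common-format field.
NORMALIZERS: Dict[str, Callable[[str], str]] = {
    "name": _clean_name,
    "country": _clean_country,
}


def to_common(system: str, local_record: Dict[str, str]) -> Dict[str, str]:
    """Convert one local record to the common format, cleansing each value."""
    mapping = FIELD_MAP[system]
    common: Dict[str, str] = {}
    for local_field, value in local_record.items():
        target = mapping.get(local_field)
        if target is None:
            continue  # not master data; leave it in the local system
        common[target] = NORMALIZERS[target](value)
    return common
```

With these assumed rules, a CRM record such as `{"cust_nm": "  acme   corp ", "ctry": "usa"}` and a billing record such as `{"CustomerName": "ACME CORP", "Country": "United States"}` both reduce to the same common-format record, which is the kind of rationalization the text describes.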
DB2's uses in MDI
DB2 as a component of MDI has two useful roles to play in the creation of an effective enterprise-wide MDM solution:
- Cache database.
- Metadata repository.
Each organization implementing MDM must minimize the performance overhead caused by querying and updating across all instances of a piece of information such as a customer record, instead of having each application access its own version of that information. Each organization will have its own way to minimize performance overhead by choosing a different mix of links to local systems and physical "common-format" data stored in the central repository. Usually, users will store at least some common-format data in the repository, thereby effectively making this repository a "cache" of data.
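The mix of links and physically stored common-format data can be pictured with a short Python sketch. This is an illustration under stated assumptions, not a description of any IBM product: the `MasterDataHub` class, its method names, and the policy of keeping a copy after the first remote read are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# A master record in the (hypothetical) common format.
CommonRecord = Dict[str, str]


@dataclass
class MasterDataHub:
    """Central repository mixing cached copies with links to local systems."""

    # Common-format records physically stored in the repository.
    cache: Dict[str, CommonRecord] = field(default_factory=dict)
    # Links: master ID -> function that fetches and converts the local copy.
    links: Dict[str, Callable[[], CommonRecord]] = field(default_factory=dict)

    def get(self, master_id: str) -> Optional[CommonRecord]:
        # Serve from the cache when a common-format copy is stored centrally.
        if master_id in self.cache:
            return self.cache[master_id]
        # Otherwise follow the link to the local system, if one is registered.
        fetch = self.links.get(master_id)
        if fetch is None:
            return None
        record = fetch()
        self.cache[master_id] = record  # assumed policy: keep a copy for later reads
        return record

    def invalidate(self, master_id: str) -> None:
        # Drop the cached copy, e.g. after the local system reports a change.
        self.cache.pop(master_id, None)
```

How much to cache, and when to invalidate, is exactly the trade-off each organization has to tune: more cached data means faster reads but more synchronization work when local copies change.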
IBM DB2 can be a highly useful cache database, for two reasons. First, it is able to deliver high performance for the relational format typical of common-format master data, while handling the more atypical semi-structured (text) and unstructured (graphics) data that may occur. Second, it is well integrated with strong ETL, EII (Information Integrator), EAI (Ascential), and replication products, allowing high-performance conversion and transmission of data. IBM DB2 also has a reputation for achieving high-quality data, and therefore can help spread data-quality "best practices" throughout the organization.
Both the links to data in the central repository and any information about the various formats of the local data are in effect "metadata" (data about data). The MDM's repository is therefore a logical place to build up a metadata repository that would contain not only links and format information but also information about the relationships between various types of record (e.g., customer and product, or customer and supplier).
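As a rough illustration of what such a repository holds, the Python sketch below models the three kinds of metadata just mentioned: links to local data, format information, and relationships between record types. The class names, fields, and sample relationship labels are hypothetical, chosen only to mirror the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SourceLink:
    system: str                # hypothetical local system name
    location: str              # where the local copy lives (table, file, ...)
    field_map: Dict[str, str]  # local field name -> common-format field name


@dataclass
class MetadataRepository:
    # Record type (e.g. "customer") -> links to every local copy of that type.
    links: Dict[str, List[SourceLink]] = field(default_factory=dict)
    # Relationships between record types, stored as (left, relation, right).
    relationships: List[Tuple[str, str, str]] = field(default_factory=list)

    def register_link(self, record_type: str, link: SourceLink) -> None:
        self.links.setdefault(record_type, []).append(link)

    def relate(self, left: str, relation: str, right: str) -> None:
        self.relationships.append((left, relation, right))

    def related_to(self, record_type: str) -> List[str]:
        """Return the record types related to record_type, in insertion order."""
        found: List[str] = []
        for left, _, right in self.relationships:
            if left == record_type:
                found.append(right)
            elif right == record_type:
                found.append(left)
        return found
```

For example, after `repo.relate("customer", "buys", "product")` and `repo.relate("customer", "may_also_be", "supplier")`, a call to `repo.related_to("customer")` lists both `product` and `supplier`, which is the kind of cross-type question a metadata repository exists to answer.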
Again, DB2 can be a highly effective way to implement a metadata repository. Its scalability, robustness, and long experience with data dictionaries (per-database metadata repositories) make it a logical choice. Also, it is well integrated with Information Integrator, so that it can use Information Integrator's ability to semi-automatically go out and search for master data no matter what the data type, initially populate the metadata repository, and update the repository as new customer record types arrive at local sites.
MDM is not the only way to rationalize and improve enterprise information architectures — but it is the one with the most visibility to CEOs, because every CEO sees the importance of leveraging key customer or supplier data as much as possible. Therefore, MDM can be thought of as a fad with legs: possibly over-hyped and with its implementation difficulties underestimated, but offering benefits that will endure in the long run, and allowing the implementation of controls on data and infrastructure-software features that will yield other benefits (such as better business-process support) well beyond just leveraging master data.
Yet if MDM has legs, it must be built for the long term, and that means strong, well-integrated MDI components, such as IBM DB2, that can incorporate new technologies such as Web services. Thus, IBM DB2 should play a role in users' MDM plans, not only for its usefulness as a cache database and metadata repository, but also because it should be able to maintain and increase its usefulness over the next 5-10 years.
About the author: Wayne Kernochan is president of Infostructure Associates, LLC, a Lexington, Mass.-based analyst firm.