In an enterprise, several systems often manage the same data. The role of Master Data Management (MDM) is to centralize data management so that there is only one master copy of each data item, which is then synchronized to all applications that use it. With this approach, when a customer (for example) is referred to within the enterprise, all systems are referring to the same customer.
There are three main reasons why duplicated and inconsistent data arise:
- The production systems within an enterprise were not designed, when implemented, to be part of a larger set of production systems with which they should cooperate. Therefore, each system manages data on its own.
- The branches or departments of the company exist on their own without close cooperation with other departments. For example, the mortgage department deals with customers and manages the mortgage contracts, while the marketing department plans a promotion on mortgages. If the two departments do not cooperate (share the data), the marketing department may offer a mortgage to a customer who already has one. This wastes money on the promotion and annoys the customer.
- Company acquisitions or mergers are another case in which an enterprise ends up with several parallel systems managing similar and sometimes overlapping data.
To handle the issues mentioned above, the common baseline for Master Data Management solutions comprises the following processes:
- Source identification - the 'system of record' needs to be identified first. If the same record is stored in multiple systems, the system which holds the most relevant copy (most valid, current, or complete) of that record is referred to as the 'system of record'.
- Data collection - the data needs to be collected from various sources, because each source may add new pieces of information while dropping those it is not interested in.
- Transformation - the transformation step takes place both on input, when data are converted into a format for MDM processing, and on output, when the master records are distributed back to the particular systems and applications.
- Data consolidation - the records from various systems which represent the same physical entity are consolidated into one record - a master record. The record is assigned a version number, which makes it possible to check which version of the record is being used in particular systems.
- Data deduplication - often there are separate records in the company's systems, which in fact identify the same customer. For example, the bank may have a record identifying a customer while the bank's insurance subsidiary or department maintains a separate database of customers having a different record for the same customer. It is vital that these two records are deduplicated and maintained as one master record.
- Error detection - based on defined rules and metrics, incomplete records or records containing inconsistent data should be identified and sent to their respective owners before they are published to the other applications. Distributing erroneous data may compromise the credibility of the company's MDM.
- Data correction - related to error detection, this step notifies the owner of the data record that there is a need to review the record manually.
- Data distribution/synchronization - the master records are distributed to the systems in the enterprise. The goal is that all the systems are using the same version of the record as soon as possible after the publication of the new record.
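The consolidation, error detection, and version-checking steps above can be illustrated with a minimal Python sketch. It is not a reference implementation: the use of a normalized e-mail address as the matching key, the system names, the field names, and all function names (`consolidate`, `find_errors`, `stale_systems`) are assumptions made purely for illustration. The priority list stands in for the 'system of record' decision, with the most trusted system listed first.

```python
from dataclasses import dataclass, field


@dataclass
class MasterRecord:
    key: str                  # hypothetical matching key (normalized e-mail)
    values: dict              # consolidated attribute values
    version: int = 1          # version number assigned to the master record
    sources: list = field(default_factory=list)


def normalize_key(record: dict) -> str:
    """Transformation on input: derive a matching key from a source record."""
    return record["email"].strip().lower()


def consolidate(source_records: list, priority: list) -> dict:
    """Merge records that share a key; values from higher-priority systems win."""
    rank = {name: i for i, name in enumerate(priority)}
    masters = {}
    # Process the least trusted systems first so that values from the
    # system of record (rank 0) overwrite them last.
    ordered = sorted(source_records, key=lambda r: rank[r["system"]], reverse=True)
    for rec in ordered:
        key = normalize_key(rec)
        master = masters.setdefault(key, MasterRecord(key=key, values={}))
        for name, value in rec.items():
            # Skip bookkeeping fields and empty values so that a blank in a
            # trusted system does not erase data known elsewhere.
            if name not in ("system", "email") and value not in (None, ""):
                master.values[name] = value
        master.values["email"] = key
        master.sources.append(rec["system"])
    return masters


def find_errors(master: MasterRecord, required: list) -> list:
    """Error detection: list required fields that are missing or empty."""
    return [name for name in required if not master.values.get(name)]


def stale_systems(master_version: int, deployed: dict) -> list:
    """Distribution check: systems still holding an older record version."""
    return [name for name, version in deployed.items() if version < master_version]


records = [
    {"system": "bank", "email": "Jane.Doe@example.com", "name": "Jane Doe", "phone": ""},
    {"system": "insurance", "email": "jane.doe@example.com ", "name": "J. Doe", "phone": "555-0100"},
]
masters = consolidate(records, priority=["bank", "insurance"])
for master in masters.values():
    print(master.key, master.values, find_errors(master, ["name", "phone", "address"]))
print(stale_systems(2, {"crm": 2, "billing": 1}))
```

In this sketch the two source records collapse into a single master record because they share a matching key after normalization. Real deduplication usually needs fuzzier matching than an exact key, and records flagged by `find_errors` would be routed to their owner for manual review rather than published.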
As mentioned above, each data record has to be assigned an owner or steward - a person who understands the data and is responsible for maintaining the record. The steward needs to come from the business side of the company, because only a business person understands the data well enough to make decisions about its consolidation, updates, corrections and validity. The actual processing, on the other hand, can be made available either to the business user via a GUI or to the IT department.
Source: DataIntegration.Info (Javlin - Data Solutions)