Construction principles for the information scientist: (7) Single registration of master data

In this series of blog posts, I reflect on the still-valid information engineering construction principles that guarantee better “information constructs.” At the pace of advancing technology they are sometimes somewhat forgotten, which results in shaky or poorly maintainable and extensible “information constructs.” This time: the need for single registration of master data and the challenges this presents in use. Is copying a problem?


Different types of data

Data comes in different types when viewed from an organization's information management perspective.

Master data is the core data about people, addresses, products, items, and all other objects whose data an organization uses over and over again when recording processes or transactions. Master data rarely changes and forms the permanent data of an object. (If master data does change, for example when someone changes their last name or a vehicle is repainted, it is important to preserve the relationship between the old and new master data, so that it remains clear that it still concerns the same object! But it would go too far to elaborate on this aspect of master data management here.)


Master data can be managed by an organization itself (internally) or obtained from another organization or external source. In the latter case we refer to it as reference data (also called referential data): master data that is managed and maintained outside one's own organization. A list of country codes is an example. But the Basisregistratie Personen (BRP, the Dutch basic registration of persons) also contains, seen from a customer's perspective, the reference data of a person for that customer. And the BRP itself uses the BAG (the basic registration of addresses and buildings) to link addresses to a person as external master data. This is also where problems arise when a person's residential address is considered master data. Strictly speaking it is not, because it can change. The record of a person's residential address is transaction data. Thus, basic registrations also contain transaction data![1]

All recorded transactions, such as orders, deliveries, invoices, or life events, contain transaction data. This data is a snapshot of an event or legal fact and describes that event as a fixed record of reality. It combines master data or referential data with data describing the transaction itself, for example that someone got married, moved away, or obtained a degree.

Data that describes the data set itself is called metadata. It is an important part of the “administration” of data.


Finally, recorded transaction data can be analyzed to derive new data: analytical data (trends, correlations, etc.). For example, the fact that a customer has placed ten orders this year or that sales have increased by 10% in a period.
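As a small illustration of how analytical data follows from transaction data, here is a minimal Python sketch. The order records, customer identifiers, amounts, and function names are invented for the example; they do not come from any real registration.

```python
from collections import Counter

# Hypothetical transaction data: (customer_id, year, amount).
orders = [
    ("c-1", 2023, 100.0),
    ("c-1", 2024, 60.0),
    ("c-1", 2024, 30.0),
    ("c-2", 2024, 20.0),
]


def orders_per_customer(records, year):
    """Count how many orders each customer placed in the given year."""
    return Counter(customer for customer, y, _ in records if y == year)


def sales_total(records, year):
    """Sum the order amounts for the given year."""
    return sum(amount for _, y, amount in records if y == year)


def growth_percent(records, previous_year, current_year):
    """Percentage change in sales between two years."""
    previous = sales_total(records, previous_year)
    current = sales_total(records, current_year)
    return (current - previous) * 100 / previous


print(orders_per_customer(orders, 2024))
print(growth_percent(orders, 2023, 2024))
```

The counts and the growth figure are not stored anywhere in the transactions themselves; they only exist as a result of the analysis.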

Data as a recorded reflection of reality

In a registration, data forms a picture of an event in reality. The purpose of the registration determines what we want to record of reality. Different organizations (or departments within an organization) can register different data about the same object. However, they can agree to use the same set of master data for that object. This is one application of the basic registrations when it comes to the registered objects: persons, companies, vehicles, addresses. The master data in the basic registrations forms referential data for the transactions of a service provider. (I will leave aside the fact that the basic registrations also contain transaction data; that deserves a separate discussion. You don't really want to mix those types of data.)

The importance of single registration lies in the desire to keep the master data of one object in one place only, so that this aspect of reality is the same for all customers (internal and external) and every customer knows that it concerns the same object. Customers may not change the master data or reference data, only use it; necessary adjustments may only be made by the source owner or manager of the master data. Incorrect data must be reported to the source owner, so that the data stays current and correct for all customers.

Duplicate master data or not?

In practice, single registration raises a few questions. Can you record reference data in your own information management system as a copy? Does that still constitute single capture? And if I do not register the data myself, can I then rely on the quality and availability of the source?

The basic principle is that master data must be current when you use it in a transaction, so that the transaction accurately represents reality. If you manage the master data yourself, you control how current it is. But what if you source it from someone else?

Do not duplicate but refer!

A well-known old NORA architecture principle read “Single capture multiple use.” This has since been replaced by “Inform at the source.” To this, NORA has attached the implication that it's better to refer to that source data instead of capturing a copy from that source in your own records.

That sounds logical, but in my opinion you should in many cases make a copy of the master data. You request the master data at a point in time to record it as part of a transaction. At that moment it must be current, and it should not change afterwards. If you use a reference that always points to the current data, you get into trouble when the master data changes later: your transaction then changes with it and no longer reflects reality as it was at the time. Think of people changing their last name, a car being repainted, etc.

Using a reference to the source rather than a local copy places very high demands on the source owner, who must include the history of each piece of data in the reference and keep it available indefinitely, at the level of availability the customer demands.
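To make that demand on the source concrete, here is a minimal Python sketch of what a source would have to offer if customers only hold references. The class `HistorizedSource`, its methods, and the example vehicle are hypothetical; a real source register will have its own interface and a far richer history model.

```python
from bisect import bisect_right
from datetime import date


class HistorizedSource:
    """Hypothetical source register that keeps every version of a value."""

    def __init__(self):
        # key -> list of (valid_from, value), kept sorted by valid_from
        self._versions = {}

    def register(self, key, valid_from, value):
        """Record that `value` applies to `key` from `valid_from` onwards."""
        versions = self._versions.setdefault(key, [])
        versions.append((valid_from, value))
        versions.sort(key=lambda version: version[0])

    def value_at(self, key, on_date):
        """Return the value valid on `on_date`, or None if none existed yet."""
        versions = self._versions.get(key, [])
        dates = [valid_from for valid_from, _ in versions]
        index = bisect_right(dates, on_date)
        return versions[index - 1][1] if index else None


registry = HistorizedSource()
registry.register("vehicle-42", date(2020, 1, 1), "red")
registry.register("vehicle-42", date(2023, 6, 15), "blue")  # repainted

# A transaction that only stores the reference "vehicle-42" plus its own date
# depends on the source to answer this question for as long as it exists.
print(registry.value_at("vehicle-42", date(2021, 5, 1)))
```

Every reference-only transaction needs such a point-in-time lookup each time it is read, which is exactly why the availability and retention requirements on the source become so heavy.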

It also means that the transaction data owner cannot maintain a stand-alone record and is constantly dependent on the source record. So how can you take responsibility for your own information housekeeping?

Master data should be copied when it becomes part of transaction data!
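A minimal Python sketch of this idea, under the assumption of a very simple in-memory source. The names `Person`, `Order`, and `source`, and the example values, are invented for illustration only.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Person:
    """Master data as held by the (hypothetical) source register."""
    person_id: str
    last_name: str


@dataclass(frozen=True)
class Order:
    """Transaction data: stores a frozen copy of the master data it used."""
    order_id: str
    order_date: date
    customer: Person  # snapshot taken when the order was recorded


# Hypothetical source register, keyed by a meaningless identifier.
source = {"p-001": Person("p-001", "Jansen")}

# Record the transaction: copy the current master data into it.
order = Order("o-100", date(2024, 3, 1), source["p-001"])

# Later the master data changes at the source (a new last name).
source["p-001"] = replace(source["p-001"], last_name="De Vries")

print(order.customer.last_name)   # name as it was when the order was recorded
print(source["p-001"].last_name)  # current name at the source
```

The transaction keeps the name as it was at the time of the order, while the shared identifier still makes clear that the order and the current source record concern the same person.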

Do not refer but duplicate!

If master data must be current, it is inadvisable to rely on a copy of your own that was retrieved earlier. For transactions, the principle of “inquire with the source (holder)” always applies to master data: at that moment you should synchronize with the record in which the master data is held. But there can be reasons to keep your own copy of master data, because using external master data creates a dependency. That dependency may be undesirable when the business process requires high performance or availability from a system.

The key then is to keep your own copy in sync with the source. Technically this can be done in many ways, and the timeliness requirements usually determine which way is best. For example, the source owner can push changed data to the copy as soon as it is registered. Or the copy can be synchronized periodically, at a frequency sufficient for the timeliness the process requires (once every 24 hours, for example).
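The two options can be sketched as follows in Python. This is only an illustration under simplifying assumptions: the classes `Source` and `LocalCopy` are hypothetical, everything runs in one process, and a real implementation would use messaging or an API and would need error handling.

```python
class Source:
    """Hypothetical source register that can push changes to subscribers."""

    def __init__(self):
        self._data = {}
        self._subscribers = []

    def subscribe(self, callback):
        """Register a callback that receives every change as (key, value)."""
        self._subscribers.append(callback)

    def update(self, key, value):
        """Register a change and push it to all subscribed copies."""
        self._data[key] = value
        for notify in self._subscribers:
            notify(key, value)

    def snapshot(self):
        """Return a full copy of the current data, for periodic pulls."""
        return dict(self._data)


class LocalCopy:
    """A customer's own copy of the master data."""

    def __init__(self):
        self.data = {}

    def on_change(self, key, value):
        """Push variant: apply a single change sent by the source."""
        self.data[key] = value

    def pull(self, source):
        """Periodic variant: replace the copy with the source's current state."""
        self.data = source.snapshot()


source = Source()
pushed = LocalCopy()
polled = LocalCopy()
source.subscribe(pushed.on_change)

source.update("NL", "Netherlands")
print(pushed.data)  # already contains the change
print(polled.data)  # still empty until the next scheduled pull

polled.pull(source)  # would run on a schedule, e.g. once every 24 hours
print(polled.data)
```

With the push variant the copy is current as soon as the source registers a change; with the periodic variant the copy can lag behind by at most one synchronization interval, which the process has to be able to tolerate.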


Finally, the source owner and the users must make watertight organizational and legal agreements about using each other's records and relying on the data quality (in the case of the basic registrations, these are even laid down in legislation and in twelve requirements). This sometimes takes more effort than registering master data yourself. In extreme cases, an organization will still opt for its own registration if the quality and availability of the reference data is insufficient for its own process.

Conclusion

Registering and managing master data in one place is indisputable as a starting point (provided the data has the required quality). But is it okay to duplicate master data? It depends! The information scientist must know what the process requires in terms of how current internal or external master data must be, and base the design choice on that. In addition to currency, performance (timeliness), availability, and legitimacy (accountability) also play a role, as do the costs and benefits. One central ICT facility that satisfies all customers is an architectural utopia and not possible in practice.

[1] Note that the term “authentic data”, as used in the system of basic registrations, does not appear in this classification. It is defined as data that is contained in a basic registration and designated as authentic by law. It can be master data, referential data, or transaction data! Referential data that comes from a source outside the system (such as zip codes) is called “non-authentic data” in the basic registrations.

Read the other information science principles here:

  1. Meaningless identity designation, read here.
  2. Decoupling points for complexity reduction and flexibility, maximizing independence of components, read here.
  3. Language consistency, read here.
  4. Clear distribution of responsibilities and functional separation for administration, read here.
  5. Delegating decision-making authority as low as possible, read here.
  6. Detaching authorization from identification/authentication, read here.
  7. Single registration of master data, read here.
  8. Separating data and metadata in storage and processing, read here.
  9. Applying standard patterns without deviations, read here.
  10. Separating application function from data storage, read here.
