Principles of construction for information science (1): Meaningless identity designation

By Rutger Gooszen

In this series of blog posts, I will focus on the enduring principles of information science that ensure better "information structures." These principles have sometimes been overlooked in the rapid advancement of technology, resulting in "information structures" that are unstable or hard to maintain and extend. First, I will explore the benefits of keeping an identity designation meaningless.

In an administration, you want to be able to uniquely identify information objects. This allows you to refer to them and to link characteristics to one and the same "thing" or "right." Think, for example, of objects such as "customer," "vehicle," "organization," "bank account," and "phone number," as well as documents like "certificate of ownership," "passport," and "driver's license." Each of these is therefore given a unique identity designation. Why is it essential to keep that identity designation meaningless?


During the introduction of SEPA (Single Euro Payments Area) payment traffic, it was necessary to make account numbers unique across countries and banks. The goal was to simplify cross-border payments. The chosen coding (the IBAN, built from a country code, a bank code, and an account number) therefore carries meaning. The goal of unique numbers was achieved, but another possibility was given up: portability of your account number. If we ever wanted to let you take your account number with you to another bank, as you can with a mobile phone number, this coding would be unsuitable, because the bank code is embedded in it. Whether this disadvantage for the consumer was overlooked intentionally or unintentionally, it is remarkable that there was never a proper discussion from an information science perspective and in the interest of consumers.
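To make the point concrete, here is a minimal Python sketch that splits a Dutch IBAN into its parts. It assumes the Dutch layout (two-letter country code, two check digits, four-letter bank code, ten-digit account number) and uses the example IBAN that commonly appears in documentation. The names `DutchIban` and `parse_dutch_iban` are my own, and the sketch does not validate the check digits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DutchIban:
    country_code: str    # e.g. "NL"
    check_digits: str    # two digits, not validated in this sketch
    bank_code: str       # four letters identifying the bank
    account_number: str  # ten digits


def parse_dutch_iban(iban: str) -> DutchIban:
    """Split a Dutch IBAN into the meaningful parts it is built from."""
    compact = iban.replace(" ", "").upper()
    if len(compact) != 18 or not compact.startswith("NL"):
        raise ValueError("expected an 18-character Dutch IBAN")
    return DutchIban(
        country_code=compact[0:2],
        check_digits=compact[2:4],
        bank_code=compact[4:8],
        account_number=compact[8:18],
    )


# Documentation-style example IBAN, not a real customer's account.
parts = parse_dutch_iban("NL91 ABNA 0417 1643 00")
print(parts.bank_code)
```

Because the bank code can be read straight out of the identifier, moving the account to another bank would make the identifier itself incorrect. That is exactly why this coding rules out portability.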

Another example is using a private email address as a unique designation for a "mailbox" or identity. The part after the @ symbol (the domain) often refers to the provider, and the part before it may reveal something about the person. This might be intentional for recognition, but you can't simply take that email address with you to another provider. Yet it is frequently the identity designation you use all over the web. If Google or Microsoft were to stop their email services tomorrow, many people would run into problems, or at the very least there would be numerous forwarding instructions. Once you realize that the domain after the @ symbol is only a link to the internet address of the email server, you could instead choose an address for which that link can be changed without changing the address itself, for example by using your own custom domain.

Of course, including meaning and information in an identity designation may have advantages, but in general you can also link that information to the object in another way. A wrong choice of coding can have significant consequences or limitations for other applications. You need to be aware of this when making decisions, especially if personal data sneaks into a code. Imagine if, at the time, my Citizen Service Number (BSN) had been encoded as NLADAM27121966-1234. My nationality, place of birth, and date of birth would already be known just from my BSN. It would still be a unique code, assuming that no more than 9999 children are born on a single day in Amsterdam, but from a privacy standpoint it could not have been used as widely as it is now.
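The contrast can be sketched in a few lines of Python. The first function decodes the imaginary code from the example above, assuming the layout nationality, place, day, month, year, and serial number. The second part shows the alternative: a random, meaningless identifier, with the same information stored as ordinary attributes of the object. All names here (`decode_meaningful_id`, `Person`, `new_person`) are hypothetical, and the random UUID merely stands in for any meaningless identifier. It is not how the real BSN is issued.

```python
import uuid
from dataclasses import dataclass
from datetime import date


def decode_meaningful_id(code: str) -> dict:
    """Anyone who knows the layout can read personal data out of the code."""
    prefix, serial = code.split("-")
    return {
        "nationality": prefix[0:2],
        "birthplace": prefix[2:6],
        "birth_date": date(int(prefix[10:14]), int(prefix[8:10]), int(prefix[6:8])),
        "serial": serial,
    }


@dataclass
class Person:
    person_id: str    # meaningless: random, reveals nothing about the person
    nationality: str  # the information lives in attributes instead
    birthplace: str
    birth_date: date


def new_person(nationality: str, birthplace: str, birth_date: date) -> Person:
    """Create a record whose identifier is independent of its attributes."""
    return Person(uuid.uuid4().hex, nationality, birthplace, birth_date)


leaked = decode_meaningful_id("NLADAM27121966-1234")
person = new_person("NL", "Amsterdam", date(1966, 12, 27))
print(leaked["birth_date"], person.person_id)
```

With the second approach, the identifier can be shared, or an attribute corrected, without the identifier ever exposing anything or needing to change.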

When the goal is unique identification without restricting the future use of that identity designation, the construction principle for the code is, therefore, that it should be meaningless.
