To clarify the principle, I use the metaphor of transportation. The goods are the data: that is ultimately what has value for the end user. The various transport systems are the processing functions, or application functions. If the goods were hard-coupled to one transport system (e.g. container transport), they would be ‘trapped’ in that container from production in the factory to the end user. In itself, this is not an impediment to transport, but it creates constraints. The goods cannot then be carried by any other transport system (by air or in a smaller vehicle). There would be container trucks driving and container ships sailing everywhere, and transferring the goods from the container to another means of transport would be impossible. This is inflexible from the point of view of both the goods and the end user.
In other words, if the data cannot be processed separately from the application function, then it is ‘trapped’ in that application function. The user can only access the data through that application function.
In freight transport, the ability to deliver goods through different transport systems has been created to suit the use. The goods can be ‘accessed’ by different systems to perform the desired function with them. If delivery has to be fast, it can go by air freight; in a city, it can go by Coolblue’s cargo bike.
If we separate application function and data storage, the data can be processed by more than that one application function! That provides all kinds of advantages.
Between the application and the data is an ‘API’ (an Application Programming Interface). An API is the modern decoupling component in computing that separates the data from the application. By standardizing the API, different applications can access a data store in the same way. But beware: this standardization is no mean feat! The structure of the data store and the meaning of the data do need to fit the API standard. Data management should ensure that data quality is appropriate and that it is clear which data store is the source for which data.
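As an illustration, the idea of two applications sharing one data store through a standardized API can be sketched in Python. This is only a minimal sketch: the names `Person`, `PersonApi`, `InMemoryPersonStore`, `register_move` and `residents_per_city` are hypothetical and chosen for this example, and the in-memory dictionary merely stands in for a real data store.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Person:
    person_id: str
    name: str
    city: str


class PersonApi(Protocol):
    """Hypothetical standardized interface between applications and the data."""

    def get(self, person_id: str) -> Person: ...
    def save(self, person: Person) -> None: ...
    def all(self) -> list[Person]: ...


class InMemoryPersonStore:
    """Stand-in data store; a real one would sit behind a network API."""

    def __init__(self) -> None:
        self._rows: dict[str, Person] = {}

    def get(self, person_id: str) -> Person:
        return self._rows[person_id]

    def save(self, person: Person) -> None:
        self._rows[person.person_id] = person

    def all(self) -> list[Person]:
        return list(self._rows.values())


def register_move(api: PersonApi, person_id: str, new_city: str) -> None:
    """Application 1: a registry function that records a change of address."""
    current = api.get(person_id)
    api.save(Person(current.person_id, current.name, new_city))


def residents_per_city(api: PersonApi) -> dict[str, int]:
    """Application 2: a reporting function that reads the same data."""
    counts: dict[str, int] = {}
    for person in api.all():
        counts[person.city] = counts.get(person.city, 0) + 1
    return counts


store = InMemoryPersonStore()
store.save(Person("p1", "Anna", "Utrecht"))
store.save(Person("p2", "Bram", "Utrecht"))

register_move(store, "p1", "Delft")   # the registry application writes
report = residents_per_city(store)    # the reporting application reads
```

Neither application function knows how the data is stored; both depend only on the agreed interface, which is why that interface has to match the structure and meaning of the data.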
Moreover, access to the data (which, in the case of unsegregated storage, runs via the application function) must now be controlled on the storage itself. This means that calls to the API must be authorized through a security mechanism, possibly in such a way that the end user accessing data via the API can be authenticated and authorized for that data.
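A minimal Python sketch of such a check at the API is shown here. It assumes that authentication has already happened elsewhere, so that the API receives a trusted `Caller` object; the names `Caller`, `SecuredRecordsApi`, `AccessDenied` and the scope string `records:read` are hypothetical and not taken from any particular standard or product.

```python
from dataclasses import dataclass, field


class AccessDenied(Exception):
    """Raised when a caller lacks the permission for the requested data."""


@dataclass
class Caller:
    # Assumed to be the result of an earlier authentication step.
    user_id: str
    scopes: set[str] = field(default_factory=set)


class SecuredRecordsApi:
    """Hypothetical API that enforces authorization at the data store."""

    def __init__(self, records: dict[str, dict[str, str]]) -> None:
        self._records = records

    def read(self, caller: Caller, record_id: str) -> dict[str, str]:
        # The check lives in the API, not in any single application.
        if "records:read" not in caller.scopes:
            raise AccessDenied(f"{caller.user_id} may not read records")
        return dict(self._records[record_id])  # return a copy, not the original


api = SecuredRecordsApi({"r1": {"name": "Anna"}})
clerk = Caller("clerk-01", {"records:read"})
guest = Caller("guest-07")

allowed = api.read(clerk, "r1")
try:
    api.read(guest, "r1")
    guest_was_denied = False
except AccessDenied:
    guest_was_denied = True
```

Because the check sits at the API, every application that uses the data store is subject to the same rules, whichever application function the end user happens to be working in.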
So it is certainly not easier to implement than an unsegregated application! But you get something in return in terms of vendor independence and maintainability.
One of the advantages of not being ‘trapped’ in the application function is that the user can avoid vendor lock-in. By imposing requirements on data storage, you can use the data outside the vendor’s application. In the past, this was often the reason why vendors did not apply this principle. Even now, the data portability offered by suppliers is limited, and it is difficult to renew an application without a major data migration because the data storage only works for that one application.
This principle also contributes to the FAIR (findability, accessibility, interoperability, and reusability) principles formulated in 2016 for scientific datasets. In particular, accessibility and reusability benefit from a dataset that can be accessed separately from the application.
The separation benefits the maintainability (modifiability) of both the application layer and the data storage. Functionality that is difficult to build into the existing registry application can be added as a separate application on the same data storage. The Mendix platform has exploited this advantage.
Conversely, you can replace an obsolete database with other technology (even a cloud database) while leaving the application itself untouched. Only the API needs to link to the new database location.
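The following Python sketch illustrates this swap under simple assumptions. The interface `NameApi`, the two backends and the function `greeting` are hypothetical names for this example; an in-memory SQLite database stands in for the replacement technology, and a plain dictionary for the obsolete one.

```python
import sqlite3
from typing import Protocol


class NameApi(Protocol):
    """Hypothetical API the application depends on."""

    def name_of(self, person_id: str) -> str: ...


class LegacyDictBackend:
    """Stand-in for the obsolete storage."""

    def __init__(self, rows: dict[str, str]) -> None:
        self._rows = rows

    def name_of(self, person_id: str) -> str:
        return self._rows[person_id]


class SqliteBackend:
    """Stand-in for the replacement storage, behind the same API."""

    def __init__(self, rows: dict[str, str]) -> None:
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE person (id TEXT PRIMARY KEY, name TEXT)"
        )
        self._conn.executemany(
            "INSERT INTO person VALUES (?, ?)", list(rows.items())
        )

    def name_of(self, person_id: str) -> str:
        row = self._conn.execute(
            "SELECT name FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        if row is None:
            raise KeyError(person_id)
        return row[0]


def greeting(api: NameApi, person_id: str) -> str:
    """The application function: unchanged whichever backend is used."""
    return f"Dear {api.name_of(person_id)},"


data = {"p1": "Anna"}
before = greeting(LegacyDictBackend(data), "p1")
after = greeting(SqliteBackend(data), "p1")
```

The application function `greeting` is identical in both calls; only the object behind the API changes. In practice the data would of course also have to be migrated to the new storage, which this sketch glosses over by loading the same rows into both backends.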
This principle is one of the central tenets within Common Ground. By separating the data from the application, the data can be shared between different applications without having to be stored multiple times. The goal here is to have all applications get the data from the same source. The flip side of this benefit is the aforementioned effort needed to ensure the quality and meaning of the data.
We see another application of the principle in WhatsApp, where the media received on the phone is also stored in the media store so that it can be accessed separately by other apps. Since the context is a person’s phone, the security for that separate media storage is already in place.
But there are still examples where the principle is not applied, such as email. Try accessing a .pst file from Microsoft Outlook (which contains all your email) separately from Outlook. Or try saving an email as a separate file so that you can edit it and forward it from another program as an email with attachments. I am sure there are handy conversion tools on the internet, but it is not easy. And that’s because no standard API has been defined for email messages.
Another example can be seen with the many drawing tools. Each of these has its own storage format, so a drawing can only be edited in the tool it was made with.
As an IT Architect, you have to weigh up this principle carefully. It is not always necessary: there must be a justification for the extra effort you have to put into developing or applying an API between the application layer and the data layer. But usually that justification is quickly found.
Principal Architect