The Self-descriptiveness pattern

The Self-Descriptiveness pattern can save your life many times in the course of any software project. The power behind it is to harness the computer, rather than yourself.

What is Self-Descriptiveness?

Self-Descriptiveness is a property of any system that is able to describe itself with no external help. Just ask it “describe yourself” and it will. In other words, it is a system where its documentation is embedded into itself.

This pattern is so essential, however it is not expressed often. I imagine it is too obvious for those that appreciate it, whereas other do not imagine its benefits and focus on how memory or bandwidth they save instead.

You can find this pattern in many areas:

Database

Database tables in many databases are self-descriptive, thanks to definitions or system tables. This enables to query the database for its own schema, without any other documentation or explanation. This also enables tools to browse a database just by connecting to it. Databases try to be user-friendly (SQL), so here Self-Descriptiveness also means affordability for users.

XML

XML is one of the most famous example of self-descriptive format. The use of named tags around the values to describe them is supposed to enable humans or even tools to read and extract their meaning even with no prior knowledge of the exact format. The ordering could vary, elements may be missing, it would remain readable. In this case Self-Descriptiveness means robustness against variations, perhaps to be more future-proof.

This key holder is self-descriptive
This key holder is self-descriptive

Spreadsheets, associative arrays

Spreadsheets with header columns, Map and associative arrays in general are other low-tech forms of self-descriptiveness we use all the time. You don’t have to remember what each columns represents (or each slot in a Map), the headers (or keys) tell you that directly. Here Self-Descriptiveness means convenience for the developers.

Reflection

Reflection built into languages such as Java enables the software to query its own structure. It becomes then possible to introspect each class to find out which fields and methods it has, and what class or interface its extends and implements. The metamodel of Java makes class definitions self-descriptive. This feature is essential to load at runtime code that was not known at compile time. Here Self-Descriptiveness means extensibility.

Analysis patterns

This leads to the Knowledge Level pattern (also known as Metamodel), which uses objects to describe the behavior of other objects. The Operation Level becomes self-descriptive thanks to its Knowledge Level. This is typically how databases and virtual machines implement their Self-Descriptiveness. The Knowledge Level pattern is key to achieve Self-Descriptiveness for systems.

The Quantity patterns is a simple yet very powerful form of Self-Descriptiveness for your ordinary project, especially in the Domain Layer. The idea is to keep the unit attached with the value. In the case of the Money pattern, the unit is the currency. If your application deals with percents, by all means introduce from the beginning a class Percentage that follows the Quantity pattern. This simple class has the potential to completely eliminate every percent-related bug! In the finance domain, interest rates, usually expressed in percents but that are actually in percent per year, deserve their own Quantity class as well. This class should be distinct from the usual percentage. Again this simple decision wil ensure that no one will ever confuse one for another. You could even go further and type each kind of interest rate by its destination, either a coupon rate, or a yield, and therefore have the computer verify they must not be confused.

Be explicit!

Self-Descriptiveness could be generalized into one principle: “Be explicit!”. This means that everything implicit must be identified and made explicit. Your application only deals with one currency? Make the currency explicit anyway. Your time periods are all an integer number of years? Make the time unit (years) explicit anyway. All your data come from Reuters? Add a field “Source” to track the source of the data anyway. Even if YAGNI, the consistency of your domain matters. And perhaps you will gonna need it!

This chair is self-descriptive, and explicit!
This chair is self-descriptive, and explicit!

Self-Descriptiveness is obviously very precious for debugging and maintenance. First you reduce the possibility of bugs: you can try to sum 0.05 +2% + 35bp (basis points are one-hundreds of percent) and still have a valid result (7.35%) if you use Quantity objects for them. Then debugging becomes a breeze, everything tells its story, especially if you have written judicious toString() methods.

Optimization on the other hand appears to dislike Self-Descriptiveness. XML is very verbose, a Quantity object consumes more memory than just the primitive value inside, etc. In most cases, it is not worth compromising Self-Descriptiveness for the sake of optimization. And even when you need real performance, there are solutions to retain the benefits of Self-Descriptiveness without its cost. XML compression works very well, associative maps can sometimes be implemented as random access in an array, etc. And extreme needs can be satisfied with extremely sophisticated solutions*.

To be fully honest, Self Descriptiveness can hardly be achieved in practice as explained by Mark Baker in his blog, but your project deserves as much Self Descriptiveness as you can possibly afford!

*For instance the FIX protocol, used for financial data feeds, is a very old key-value format. It is not really self-descriptive as the keys are numbers, not labels. However just like self-descriptive formats, the keys impede efficiency: they consume bandwidth for almost nothing. So instead of changing the format in favour of efficiency, the protocol FAST has been invented. This optimization done by the computer, not by developers, extracts in real time the values using a message template, compress them and send them in fixed order, in other words in a fully implicit way. On reception the message is automatically rebuilt against the template. More details about FIX and FAST can be found here and here. This is a fairly sophisticated solution.

Pictures taken at the Milano Furniture Fair 2009.

cyrille

Software development, Domain-Driven Design, patterns and agile principles enthusiast