TDD Vs. math formalism: friend or foe?

It is not uncommon to oppose the empirical process of TDD, together with its heavy use of unit tests, to the more mathematically based techniques, with the “formal methods” and formal verification at the other end of the spectrum. However I experienced again recently that the process of TDD can indeed help discover and draw upon math formalisms well-suited to the problem considered. We then benefit from the math formalism for an easier implementation and correctness.

It is quite frequent that maths structures, or more generally “established formalisms” as Eric Evans would say, are hidden everywhere in the business concepts we need to model in software.

Dates and how we take liberties with them for trading of financial instruments offer a good example of a business concept with an underlying math structure: traders of futures often use a notation like ‘U8’ to describe an expiry date like September 2018; ‘U’ means September, and the ‘8’ digit refers to 2018, but also to 2028, and 2038 etc. Notice that this notation only works for 10 years, and each code is recycled every decade.

The IMM trading floor in the early 70's (photo CME Group)

In the case of IMM contract codes, we only care about quarterly dates on:

  • March (H)
  • June (M)
  • September (U)
  • December (Z)

This yields only 4 possibilities for the month, combined with the 10 possible year digits, hence 40 different codes in total, over the range of 10 years.

How does that translate into source code?

As a software developer we are asked all the time to manage such IMM expiry codes:

  • Sort a given set of IMM contract codes
  • Find the next contract from the current “leading month” contract
  • Enumerate the next 11 codes from the current “leading month” contract, etc.

This is often done ad hoc with a gazillion of functions for each use-case, leading to thousands of lines of code hard to maintain because they involve parsing of the ‘U8’ format everytime we want to calculate something.

With TDD, we can now tackle this topic with more rigor, starting with tests to define what we want to achieve.

The funny thing is that in the process of doing TDD, the cyclic logic of the IMM codes struck me and strongly reminded me of the cyclic group Z/nZ. I had met this strange maths creature at school many years ago, I had a hard time with it by the way. But now on a real example it was definitely more interesting!

The source code (Java) for this post is on Github.

Draw on established formalisms

Thanks to Google it is easy to find something even with just a vague idea of how it’s named, and thanks to Wikipedia, it is easy to find out more about any established formalism like Cyclic Groups. In particular we find that:

Every finite cyclic group is isomorphic to the group { [0], [1], [2], …, [n ? 1] } of integers modulo n under addition

The Wikipedia page also mentions a concept of the product of cyclic groups in relation with their order (here the number of elements). Looks like this is the math-ish way to say that 4 possibilities for quarterly months combined with 10 possible year digits give 40 different codes in total.

So what? Sounds like we could identify the set of the 4 months to a cyclic group, the set of the 10 year digits to another, and that even the combination (product) of both also looks like a cyclic group of order 10 * 4 = 40 (even though the addition operation will not be called like that). So what?

Because we’ve just seen that there is an isomorphism between any finite cyclic group and the cyclic group of integer of the same order, we can just switch to the integer cyclic group logic (plain integers and the modulo operator) to simplify the implementation big time.

Basically the idea is to convert from the IMM code “Z3” to the corresponding ‘ordinal’ integer in the range 0..39, then do every operation on this ‘ordinal’ integer instead of the actual code. Then we can format back to a code “Z3” whenever we really need it.

Do I still need TDD when I have a complete formal solution?

I must insist that I did not came to this conclusion as easily. The process of TDD was indeed very helpful not to get lost in every possible direction along the way. Even when you have found a formal structure that could solve your problem in one go, even in a “formal proof-ish fashion”, then perhaps you need less tests to verify the correctness, but you sure still need tests to think on the specification part of your problem. This is your gentle reminder that TDD is not about unit tests.

Partial order in a cyclic group

Given a list of IMM codes we often need to sort them for display. The problem is that a cyclic group has no total order, the ordering depends on where you are in time.

Let’s take the example of the days of the week that also forms a cycle: MONDAY, TUESDAY, WEDNESDAY…SUNDAY, MONDAY etc.

If we only care about the future, is MONDAY before WEDNESDAY? Yes, except if we’re on TUESDAY. If we’re on TUESDAY, MONDAY means next MONDAY hence comes after WEDNESDAY, not before.

This is why we cannot unfortunately just implement Comparable to take care of the ordering. Because we need to consider a reference IMM code-aware partial order, we need to resort to a Comparator that takes the reference IMM code in its constructor.

Once we identify that situation to the cyclic group of integers, it becomes easy to shift both operands of the comparison to 0 before comparing them in a safe (total order-ish) way. Again, this trick is made possible by the freedom to experiment given by the TDD tests. As long as we’re still green, we can go ahead and try any funky approach.

Try it as a kata

This example is also a good coding kata that we’ve tried at work not long ago. Given a simple presentation of the format of an IMM contract code, you can choose to code the sort, find the next and previous code, and perhaps even optimize for memory (cache the instances, e.g. lazily) and speed (cache the toString() value, e.g. in the constructor) if you still have some time.

In closing

Maths structures are hidden behind many common business concepts. I developed an habit to look for them whenever I can, because they always help make us think, they help question our understanding of the domain problem (“is my domain problem really similar in some way to this structure?”), and of course because they often offer wonderful ready-made implementation hints!

The source code (Java) for this post is on Github.
Follow me on Twitter!
Photo: CME Group

Read More

Patterns as another kind of types

Patterns represent a couple (intent, solution), where the intent matters most. Based on that intents, that can be generic or specialized, I propose to consider patterns like types in languages with strong typing, for the compiler to enforce their constraints.

Declaring patterns: what for?

Consider the very simple Quantity pattern from Analysis Patterns (Fowler):

Represent dimensioned values with both their amount and their unit

By simply declaring this pattern, we immediately express many things:

  1. We wish to represent a quantity with its unit
  2. Quantity is a special kind of ValueObject (as in Fowler or DDD), hence:
    1. It is Immutable, therefore it has no setter, only a valued constructor
    2. Identity is based solely on its values, two value objects are equal if they share the same values
  3. In Java, the hashcode() and equals() methods must be implemented solely on the values of the immutable fields, and the hashcode can be cached internally; the toString() result can be cached as well
  4. In the spirit of Domain Driven Design, it also probably follows Closure of Operation and Side -Effect-Free Functions
  5. It must not depend on any heavyweight dependencies such as middleware, database or gui classes
  6. It must not depend on Entities and Services
Strangely typed furniture
Strangely typed furniture

If we had an explicit way to mark the use of the pattern in the code, and if the compiler was able to know about this pattern, then it could enforce all the above. Moreover, it could also generate corresponding unit tests to assert the dynamic and runtime aspects constrained by the pattern.

This approach can be considered similar to the typing used in programming languages, where the compiler can enforce many constraints according to the type declarations given by the developer.

In practice

In current languages there is no explicit way to declare the use of patterns in the source code. The reader of the source must pay attention to various hints and compare to his/her prior knowledge to decide whether or not a pattern is used. Typical hints include naming conventions, where the class or interface names ends with the name of the pattern participant: an interface TradeFactory is probably a Factory that creates Trade instances.

Since many years I have been using a custom Javadoc tag @pattern to explicitly mark the patterns I use in my source code, hoping that a tool could use it in the future. Over time I have noticed several benefits from doing that, the biggest one being that I am forced to be 100% clear about what I want. Many times I noticed that asking myself to explicitly declare in patterns what I was doing required me to think deeper. Though I know what I am doing, when trying to express that intent using patterns from the books I am forced to clarify my intent to decide which exact pattern I am applying.  This is fully in the spirit of the Pragmatic Programmer‘s “Programming by Coincidence“, and also in the spirit of Test Driven Development where the unit test is forcing you to clarify what you really want.

In other words, we could call that Pattern Driven Development. PDD, yet another acronym!

Yet another strangely typed object (the disco vaccum cleaner)
Yet another strangely typed object (the disco vaccum cleaner)

Discussion

It may not sound intuitive that there exists a documented pattern for each and every possible design decision, but I can confirm that the panel of existing patterns is very wide and already covers a lot of common design decisions. Such patterns are far from sophisticated solutions or tricks, they usually just name intents and known practices. But this is exactly what we need. Dirk Riehle, in a recent paper entitled Design Pattern Density Defined, found that in the case of open-source frameworks, and considering only classic design patterns, the density of such patterns was between 40 and 70%, where density is defined by “The design pattern density of an object-oriented framework is the percentage of its collaborations that are design pattern instances.

In addition, it becomes increasingly common for software teams to use custom patterns as a convenient way of documenting their common design practices. Actually this is how the pattern story began when Kent Beck and Ralph Johnson decided to document the design of HotDraw in the now classic paper: Patterns generate architectures. To illustrate that, the Drupal team is using custom patterns in addition to standard patterns to document their design in an efficient way, as described in the program of the DrupalCon Paris 2009 conference:

This session will cover the how and why of design patterns, and review the most common patterns used in Drupal as well as a number of common patterns we could be using. Some are Drupal-specific patterns and some more more general software design patterns.

Another major example of patterns dedicated for a particular project is the the Eclipse IDE. Erich Gamma, the main Eclipse designer -one of the Gang of Four- wrote the book Contributing to Eclipse: principles, patterns, and plug-ins (together with Kent Beck) where Eclipse-specific and standard patterns are used together as the primary mean of documenting the design.

Conclusion

Patterns are signs to denote intents, and I propose to promote the use of patterns (patterns already existing or to be defined in the context of a particular project) to trace these intents explicitly, and to enable tools to enforce pattern-specific constraints.

Pictures taken at la Biennale Internationale Design, Saint Etienne 2008

Read More