TDD Vs. math formalism: friend or foe?

It is not uncommon to oppose the empirical process of TDD, together with its heavy use of unit tests, to the more mathematically based techniques, with the “formal methods” and formal verification at the other end of the spectrum. However I experienced again recently that the process of TDD can indeed help discover and draw upon math formalisms well-suited to the problem considered. We then benefit from the math formalism for an easier implementation and correctness.

It is quite frequent that maths structures, or more generally “established formalisms” as Eric Evans would say, are hidden everywhere in the business concepts we need to model in software.

Dates and how we take liberties with them for trading of financial instruments offer a good example of a business concept with an underlying math structure: traders of futures often use a notation like ‘U8’ to describe an expiry date like September 2018; ‘U’ means September, and the ‘8’ digit refers to 2018, but also to 2028, and 2038 etc. Notice that this notation only works for 10 years, and each code is recycled every decade.

The IMM trading floor in the early 70's (photo CME Group)

In the case of IMM contract codes, we only care about quarterly dates on:

  • March (H)
  • June (M)
  • September (U)
  • December (Z)

This yields only 4 possibilities for the month, combined with the 10 possible year digits, hence 40 different codes in total, over the range of 10 years.

How does that translate into source code?

As a software developer we are asked all the time to manage such IMM expiry codes:

  • Sort a given set of IMM contract codes
  • Find the next contract from the current “leading month” contract
  • Enumerate the next 11 codes from the current “leading month” contract, etc.

This is often done ad hoc with a gazillion of functions for each use-case, leading to thousands of lines of code hard to maintain because they involve parsing of the ‘U8’ format everytime we want to calculate something.

With TDD, we can now tackle this topic with more rigor, starting with tests to define what we want to achieve.

The funny thing is that in the process of doing TDD, the cyclic logic of the IMM codes struck me and strongly reminded me of the cyclic group Z/nZ. I had met this strange maths creature at school many years ago, I had a hard time with it by the way. But now on a real example it was definitely more interesting!

The source code (Java) for this post is on Github.

Draw on established formalisms

Thanks to Google it is easy to find something even with just a vague idea of how it’s named, and thanks to Wikipedia, it is easy to find out more about any established formalism like Cyclic Groups. In particular we find that:

Every finite cyclic group is isomorphic to the group { [0], [1], [2], …, [n ? 1] } of integers modulo n under addition

The Wikipedia page also mentions a concept of the product of cyclic groups in relation with their order (here the number of elements). Looks like this is the math-ish way to say that 4 possibilities for quarterly months combined with 10 possible year digits give 40 different codes in total.

So what? Sounds like we could identify the set of the 4 months to a cyclic group, the set of the 10 year digits to another, and that even the combination (product) of both also looks like a cyclic group of order 10 * 4 = 40 (even though the addition operation will not be called like that). So what?

Because we’ve just seen that there is an isomorphism between any finite cyclic group and the cyclic group of integer of the same order, we can just switch to the integer cyclic group logic (plain integers and the modulo operator) to simplify the implementation big time.

Basically the idea is to convert from the IMM code “Z3” to the corresponding ‘ordinal’ integer in the range 0..39, then do every operation on this ‘ordinal’ integer instead of the actual code. Then we can format back to a code “Z3” whenever we really need it.

Do I still need TDD when I have a complete formal solution?

I must insist that I did not came to this conclusion as easily. The process of TDD was indeed very helpful not to get lost in every possible direction along the way. Even when you have found a formal structure that could solve your problem in one go, even in a “formal proof-ish fashion”, then perhaps you need less tests to verify the correctness, but you sure still need tests to think on the specification part of your problem. This is your gentle reminder that TDD is not about unit tests.

Partial order in a cyclic group

Given a list of IMM codes we often need to sort them for display. The problem is that a cyclic group has no total order, the ordering depends on where you are in time.

Let’s take the example of the days of the week that also forms a cycle: MONDAY, TUESDAY, WEDNESDAY…SUNDAY, MONDAY etc.

If we only care about the future, is MONDAY before WEDNESDAY? Yes, except if we’re on TUESDAY. If we’re on TUESDAY, MONDAY means next MONDAY hence comes after WEDNESDAY, not before.

This is why we cannot unfortunately just implement Comparable to take care of the ordering. Because we need to consider a reference IMM code-aware partial order, we need to resort to a Comparator that takes the reference IMM code in its constructor.

Once we identify that situation to the cyclic group of integers, it becomes easy to shift both operands of the comparison to 0 before comparing them in a safe (total order-ish) way. Again, this trick is made possible by the freedom to experiment given by the TDD tests. As long as we’re still green, we can go ahead and try any funky approach.

Try it as a kata

This example is also a good coding kata that we’ve tried at work not long ago. Given a simple presentation of the format of an IMM contract code, you can choose to code the sort, find the next and previous code, and perhaps even optimize for memory (cache the instances, e.g. lazily) and speed (cache the toString() value, e.g. in the constructor) if you still have some time.

In closing

Maths structures are hidden behind many common business concepts. I developed an habit to look for them whenever I can, because they always help make us think, they help question our understanding of the domain problem (“is my domain problem really similar in some way to this structure?”), and of course because they often offer wonderful ready-made implementation hints!

The source code (Java) for this post is on Github.
Follow me on Twitter!
Photo: CME Group

Read More

Java Enums: You have grace, elegance and power and this is what I Love!

While Java 8 is coming, are you sure you know well the enums that were introduced in Java 5? Java enums are still underestimated, and it’s a pity since they are more useful than you might think, they’re not just for your usual enumerated constants!

Java enum is polymorphic

Java enums are real classes that can have behavior and even data.

Let’s represent the Rock-Paper-Scissors game using an enum with a single method. Here are the unit tests to define the behavior:

@Test
public void paper_beats_rock() {
	assertThat(PAPER.beats(ROCK)).isTrue();
	assertThat(ROCK.beats(PAPER)).isFalse();
}
@Test
public void scissors_beats_paper() {
	assertThat(SCISSORS.beats(PAPER)).isTrue();
	assertThat(PAPER.beats(SCISSORS)).isFalse();
}
@Test
public void rock_beats_scissors() {
	assertThat(ROCK.beats(SCISSORS)).isTrue();
	assertThat(SCISSORS.beats(ROCK)).isFalse();
}
And here is the implementation of the enum, that primarily relies on the ordinal integer of each enum constant, such as the item N+1 wins over the item N. This equivalence between the enum constants and the integers is quite handy in many cases.
/** Enums have behavior! */
public enum Gesture {
	ROCK() {
		// Enums are polymorphic, that's really handy!
		@Override
		public boolean beats(Gesture other) {
			return other == SCISSORS;
		}
	},
	PAPER, SCISSORS;

	// we can implement with the integer representation
	public boolean beats(Gesture other) {
		return ordinal() - other.ordinal() == 1;
	}
}

Notice that there is not a single IF statement anywhere, all the business logic is handled by the integer logic and by the polymorphism, where we override the method for the ROCK case. If the ordering between the items was not cyclic we could implement it just using the natural ordering of the enum, here the polymorphism helps deal with the cycle.


You can do it without any IF statement! Yes you can!

This Java enum is also a perfect example that you can have your cake (offer a nice object-oriented API with intent-revealing names), and eat it too (implement with simple and efficient integer logic like in the good ol’ days).

Over my last projects I’ve used a lot enums as a substitute for classes: they are guaranted to be singleton, have ordering, hashcode, equals and serialization to and from text all built-in, without any clutter in the source code.

If you’re looking for Value Objects and if you can represent a part of your domain with a limited set of instances, then the enum is what you need! It’s a bit like the Sealed Case Class in Scala, except it’s totally restricted to a set of instances all defined at compile time. The bounded set of instances at compile-time is a real limitation, but now with continuous delivery, you can probably wait for the next release if you really need one extra case.

Well-suited for the Strategy pattern

Let’s move to to a system for the (in-)famous Eurovision song contest; we want to be able to configure the behavior on when to notify (or not) users of any new Eurovision event. It’s important. Let’s do that with an enum:

/** The policy on how to notify the user of any Eurovision song contest event */
public enum EurovisionNotification {

	/** I love Eurovision, don't want to miss it, never! */
	ALWAYS() {
		@Override
		public boolean mustNotify(String eventCity, String userCity) {
			return true;
		}
	},

	/**
	 * I only want to know about Eurovision if it takes place in my city, so
	 * that I can take holidays elsewhere at the same time
	 */
	ONLY_IF_IN_MY_CITY() {
		// a case of flyweight pattern since we pass all the extrinsi data as
		// arguments instead of storing them as member data
		@Override
		public boolean mustNotify(String eventCity, String userCity) {
			return eventCity.equalsIgnoreCase(userCity);
		}
	},

	/** I don't care, I don't want to know */
	NEVER() {
		@Override
		public boolean mustNotify(String eventCity, String userCity) {
			return false;
		}
	};

	// no default behavior
	public abstract boolean mustNotify(String eventCity, String userCity);

}

And a unit test for the non trivial case ONLY_IF_IN_MY_CITY:

@Test
public void notify_users_in_Baku_only() {
	assertThat(ONLY_IF_IN_MY_CITY.mustNotify("Baku", "BAKU")).isTrue();
	assertThat(ONLY_IF_IN_MY_CITY.mustNotify("Baku", Paris")).isFalse();
}

Here we define the method abstract, only to implement it for each case. An alternative would be to implement a default behavior and only override it for each case when it makes sense, just like in the Rock-Paper-Scissors game.

Again we don’t need the switch on enum to choose the behavior, we rely on polymorphism instead. You probably don’t need the switch on enum much, except for dependency reasons. For example when the enum is part of a message sent to the outside world as in Data Transfer Objects (DTO), you do not want any dependency to your internal code in the enum or its signature.

For the Eurovision strategy, using TDD we could start with a simple boolean for the cases ALWAYS and NEVER. It would then be promoted into the enum as soon as we introduce the third strategy ONLY_IF_IN_MY_CITY. Promoting primitives is also in the spirit of the 7th rule “Wrap all primitives” from the Object Calisthenics, and an enum is the perfect way to wrap a boolean or an integer with a bounded set of possible values.

Because the strategy pattern is often controlled by configuration, the built-in serialization to and from String is also very convenient to store your settings.

Perfect match for the State pattern

Just like the Strategy pattern, the Java enum is very well-suited for finite state machines, where by definition the set of possible states is finite.

A baby as a finite state machine (picture from www.alongcamebaby.ca)

Let’s take the example of a baby simplified as a state machine, and make it an enum:

/**
 * The primary baby states (simplified)
 */
public enum BabyState {

	POOP(null), SLEEP(POOP), EAT(SLEEP), CRY(EAT);

	private final BabyState next;

	private BabyState(BabyState next) {
		this.next = next;
	}

	public BabyState next(boolean discomfort) {
		if (discomfort) {
			return CRY;
		}
		return next == null ? EAT : next;
	}
}

And of course some unit tests to drive the behavior:

@Test
public void eat_then_sleep_then_poop_and_repeat() {
	assertThat(EAT.next(NO_DISCOMFORT)).isEqualTo(SLEEP);
	assertThat(SLEEP.next(NO_DISCOMFORT)).isEqualTo(POOP);
	assertThat(POOP.next(NO_DISCOMFORT)).isEqualTo(EAT);
}

@Test
public void if_discomfort_then_cry_then_eat() {
	assertThat(SLEEP.next(DISCOMFORT)).isEqualTo(CRY);
	assertThat(CRY.next(NO_DISCOMFORT)).isEqualTo(EAT);
}

Yes we can reference enum constants between them, with the restriction that only constants defined before can be referenced. Here we have a cycle between the states EAT -> SLEEP -> POOP -> EAT etc. so we need to open the cycle and close it with a workaround at runtime.

We indeed have a graph with the CRY state that can be accessed from any state.

I’ve already used enums to represent simple trees by categories simply by referencing in each node its elements, all with enum constants.

Enum-optimized collections

Enums also have the benefits of coming with their dedicated implementations for Map and Set: EnumMap and EnumSet.

These collections have the same interface and behave just like your regular collections, but internally they exploit the integer nature of the enums, as an optimization. In short you have old C-style data structures and idioms (bit masking and the like) hidden behind an elegant interface. This also demonstrate how you don’t have to compromise your API’s for the sake of efficiency!

To illustrate the use of these dedicated collections, let’s represent the 7 cards in Jurgen Appelo’s Delegation Poker:

public enum AuthorityLevel {

	/** make decision as the manager */
	TELL,

	/** convince people about decision */
	SELL,

	/** get input from team before decision */
	CONSULT,

	/** make decision together with team */
	AGREE,

	/** influence decision made by the team */
	ADVISE,

	/** ask feedback after decision by team */
	INQUIRE,

	/** no influence, let team work it out */
	DELEGATE;

There are 7 cards, the first 3 are more control-oriented, the middle card is balanced, and the 3 last cards are more delegation-oriented (I made that interpretation up, please refer to his book for explanations). In the Delegation Poker, every player selects a card for a given situation, and earns as many points as the card value (from 1 to 7), except the players in the “highest minority”.

It’s trivial to compute the number of points using the ordinal value + 1. It is also straightforward to select the control oriented cards by their ordinal value, or we can use a Set built from a range like we do below to select the delegation-oriented cards:

	public int numberOfPoints() {
		return ordinal() + 1;
	}

	// It's ok to use the internal ordinal integer for the implementation
	public boolean isControlOriented() {
		return ordinal() < AGREE.ordinal();
	}

	// EnumSet is a Set implementation that benefits from the integer-like
	// nature of the enums
	public static Set DELEGATION_LEVELS = EnumSet.range(ADVISE, DELEGATE);

	// enums are comparable hence the usual benefits
	public static AuthorityLevel highest(List levels) {
		return Collections.max(levels);
	}
}

EnumSet offers convenient static factory methods like range(from, to), to create a set that includes every enum constant starting between ADVISE and DELEGATE in our example, in the declaration order.

To compute the highest minority we start with the highest card, which is nothing but finding the max, something trivial since the enum is always comparable.

Whenever we need to use this enum as a key in a Map, we should use the EnumMap, as illustrated in the test below:

// Using an EnumMap to represent the votes by authority level
@Test
public void votes_with_a_clear_majority() {
	final Map<AuthorityLevel, Integer> votes = new EnumMap(AuthorityLevel.class);
	votes.put(SELL, 1);
	votes.put(ADVISE, 3);
	votes.put(INQUIRE, 2);
	assertThat(votes.get(ADVISE)).isEqualTo(3);
}

Java enums are good, eat them!

I love Java enums: they’re just perfect for Value Objects in the Domain-Driven Design sense where the set of every possible values is bounded. In a recent project I deliberatly managed to have a majority of value types expressed as enums. You get a lot of awesomeness for free, and especially with almost no technical noise. This helps improve my signal-to-noise ratio between the words from the domain and the technical jargon.

Or course I make sure each enum constant is also immutable, and I get the correct equals, hashcode, toString, String or integer serialization, singleton-ness and very efficient collections on them for free, all that with very little code.

(picture from sys-con.com – Jim Barnabee article)”]
The power of polymorphism

The enum polymorphism is very handy, and I never use instanceof on enums and I hardly need to switch on the enum either.

I’d love that the Java enum is completed by a similar construct just like the case class in Scala, for when the set of possible values cannot be bounded. And a way to enforce immutability of any class would be nice too. Am I asking too much?

Also <troll>don’t even try to compare the Java enum with the C# enum…</troll>

Read More