It is not uncommon to contrast the empirical process of TDD, together with its heavy use of unit tests, with more mathematically based techniques, with “formal methods” and formal verification at the other end of the spectrum. However, I recently experienced again that the process of TDD can indeed help discover and draw upon math formalisms well suited to the problem considered. We then benefit from the math formalism for an easier implementation and for correctness.
It is quite frequent that maths structures, or more generally “established formalisms” as Eric Evans would say, are hidden everywhere in the business concepts we need to model in software.
Dates and how we take liberties with them for trading of financial instruments offer a good example of a business concept with an underlying math structure: traders of futures often use a notation like ‘U8’ to describe an expiry date like September 2018; ‘U’ means September, and the ‘8’ digit refers to 2018, but also to 2028, and 2038 etc. Notice that this notation only works for 10 years, and each code is recycled every decade.
The IMM trading floor in the early 70's (photo CME Group)
In the case of IMM contract codes, we only care about quarterly dates on:
March (H)
June (M)
September (U)
December (Z)
This yields only 4 possibilities for the month, combined with the 10 possible year digits, hence 40 different codes in total, over the range of 10 years.
How does that translate into source code?
As software developers, we are asked all the time to manage such IMM expiry codes:
Sort a given set of IMM contract codes
Find the next contract from the current “leading month” contract
Enumerate the next 11 codes from the current “leading month” contract, etc.
This is often done ad hoc with a gazillion functions for each use case, leading to thousands of lines of code that are hard to maintain because they involve parsing the ‘U8’ format every time we want to calculate something.
With TDD, we can now tackle this topic with more rigor, starting with tests to define what we want to achieve.
The funny thing is that in the process of doing TDD, the cyclic logic of the IMM codes struck me and strongly reminded me of the cyclic group Z/nZ. I had met this strange maths creature at school many years ago, and I had a hard time with it, by the way. But now, on a real example, it was definitely more interesting!
Thanks to Google it is easy to find something even with just a vague idea of how it’s named, and thanks to Wikipedia, it is easy to find out more about any established formalism like Cyclic Groups. In particular we find that:
Every finite cyclic group is isomorphic to the group { [0], [1], [2], …, [n − 1] } of integers modulo n under addition
The Wikipedia page also mentions a concept of the product of cyclic groups in relation with their order (here the number of elements). Looks like this is the math-ish way to say that 4 possibilities for quarterly months combined with 10 possible year digits give 40 different codes in total.
So what? It sounds like we could identify the set of the 4 months with a cyclic group, the set of the 10 year digits with another, and that even the combination of both (4 × 10 codes, with the year digit advancing each time the months wrap around) also looks like a cyclic group of order 10 * 4 = 40 (even though the addition operation will not be called like that). So what?
Because we’ve just seen that there is an isomorphism between any finite cyclic group and the cyclic group of integers of the same order, we can just switch to the integer cyclic group logic (plain integers and the modulo operator) to simplify the implementation big time.
Basically the idea is to convert from the IMM code “Z3” to the corresponding ‘ordinal’ integer in the range 0..39, then do every operation on this ‘ordinal’ integer instead of the actual code. Then we can format back to a code “Z3” whenever we really need it.
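As a minimal sketch of that conversion, the helper below maps a code to an ordinal and back. The class name `ImmCodes`, its method names and the chosen encoding (year digit times 4, plus the quarter index) are assumptions made for illustration, not the code written during the original TDD sessions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: converts IMM codes such as "Z3" to ordinals in 0..39 and back. */
final class ImmCodes {
    private static final String MONTHS = "HMUZ"; // March, June, September, December
    static final int CYCLE = 40;                  // 4 quarterly months x 10 year digits

    private ImmCodes() {
    }

    /** Assumed encoding: ordinal = yearDigit * 4 + quarterIndex, so "Z3" gives 3 * 4 + 3. */
    static int toOrdinal(String code) {
        if (code == null || code.length() != 2) {
            throw new IllegalArgumentException("Not an IMM code: " + code);
        }
        int month = MONTHS.indexOf(code.charAt(0));
        int year = Character.digit(code.charAt(1), 10);
        if (month < 0 || year < 0) {
            throw new IllegalArgumentException("Not an IMM code: " + code);
        }
        return year * MONTHS.length() + month;
    }

    /** Formats any integer back to a code, wrapping around modulo 40. */
    static String toCode(int ordinal) {
        int wrapped = Math.floorMod(ordinal, CYCLE);
        return "" + MONTHS.charAt(wrapped % MONTHS.length()) + (wrapped / MONTHS.length());
    }

    /** The next contract is simply "ordinal + 1" in the cyclic group of integers. */
    static String next(String code) {
        return toCode(toOrdinal(code) + 1);
    }

    /** Enumerates the given number of codes that follow the given one. */
    static List<String> following(String code, int count) {
        List<String> codes = new ArrayList<String>();
        int start = toOrdinal(code);
        for (int step = 1; step <= count; step++) {
            codes.add(toCode(start + step));
        }
        return codes;
    }
}
```

With this encoding, `ImmCodes.next("Z9")` wraps around to `"H0"`, and finding the next 11 codes becomes a plain loop over integers.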
Do I still need TDD when I have a complete formal solution?
I must insist that I did not come to this conclusion that easily. The process of TDD was indeed very helpful not to get lost in every possible direction along the way. Even when you have found a formal structure that could solve your problem in one go, even in a “formal proof-ish fashion”, then perhaps you need fewer tests to verify the correctness, but you sure still need tests to think about the specification part of your problem. This is your gentle reminder that TDD is not about unit tests.
Partial order in a cyclic group
Given a list of IMM codes we often need to sort them for display. The problem is that a cyclic group has no total order: the ordering depends on where you are in time.
Let’s take the example of the days of the week that also forms a cycle: MONDAY, TUESDAY, WEDNESDAY…SUNDAY, MONDAY etc.
If we only care about the future, is MONDAY before WEDNESDAY? Yes, except if we’re on TUESDAY. If we’re on TUESDAY, MONDAY means next MONDAY hence comes after WEDNESDAY, not before.
This is why we unfortunately cannot just implement Comparable to take care of the ordering. Because the order depends on a reference IMM code, we need to resort to a Comparator that takes that reference IMM code in its constructor.
Once we map that situation onto the cyclic group of integers, it becomes easy to shift both operands of the comparison so that the reference code sits at 0, before comparing them in a safe (total order-ish) way. Again, this trick is made possible by the freedom to experiment given by the TDD tests. As long as we’re still green, we can go ahead and try any funky approach.
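Here is one possible sketch of such a reference-aware Comparator. The class name `ImmCodeComparator` and the ordinal encoding (year digit times 4, plus the quarter index) are assumptions for illustration, and the sketch assumes well-formed codes rather than validating them.

```java
import java.util.Comparator;

/** Hypothetical comparator: orders IMM codes as seen from a reference code. */
final class ImmCodeComparator implements Comparator<String> {
    private static final String MONTHS = "HMUZ"; // March, June, September, December
    private static final int CYCLE = 40;

    private final int referenceOrdinal;

    ImmCodeComparator(String referenceCode) {
        this.referenceOrdinal = ordinal(referenceCode);
    }

    /** Assumed encoding: yearDigit * 4 + quarterIndex; input is expected to be valid. */
    private static int ordinal(String code) {
        return Character.digit(code.charAt(1), 10) * MONTHS.length() + MONTHS.indexOf(code.charAt(0));
    }

    /** Shifts the code so that the reference sits at 0; the result is always in 0..39. */
    private int distanceFromReference(String code) {
        return Math.floorMod(ordinal(code) - referenceOrdinal, CYCLE);
    }

    @Override
    public int compare(String left, String right) {
        // After the shift, plain integer comparison is safe
        return Integer.compare(distanceFromReference(left), distanceFromReference(right));
    }
}
```

Sorting the codes H9, Z8, M8 and U8 with `new ImmCodeComparator("U8")` gives U8, Z8, H9, M8: seen from September 2018, M8 means June 2028 and therefore comes last.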
Try it as a kata
This example is also a good coding kata that we’ve tried at work not long ago. Given a simple presentation of the format of an IMM contract code, you can choose to code the sort, find the next and previous code, and perhaps even optimize for memory (cache the instances, e.g. lazily) and speed (cache the toString() value, e.g. in the constructor) if you still have some time.
In closing
Maths structures are hidden behind many common business concepts. I developed a habit of looking for them whenever I can, because they always help make us think, they help question our understanding of the domain problem (“is my domain problem really similar in some way to this structure?”), and of course because they often offer wonderful ready-made implementation hints!
There’s a metaphor I had in mind for a long time when thinking about software design: because I’m proudly lazy, in order to make the code smaller and easier to learn, I must do my best to reduce the “surface-area over volume ratio” of the software.
Surface-area over volume ratio?
I like the Surface-area over volume ratio as a metaphor to express how to make software cheaper to discover and learn, and smaller to maintain as well.
For a given object, the surface-area over volume ratio is the amount of surface area per unit volume. For buildings and for animals, the smaller this ratio, the less the heat loss during the winter, hence a better thermal efficiency.
Have you ever noticed that huge warehouses were always cool even during the summer when it’s hot? This is just because in our real 3D world the surface-area over volume ratio is much smaller when the absolute size of the building increases.
The theory also mentions that the sphere is the optimal shape with respect to this ratio. In fact, the more “compact” the object, the lower the ratio; or the other way round, we could define the compactness of an object directly by its surface-area-over-volume ratio.
A dodecahedron, a volume that approximates a sphere with just 2D facets (Wikipedia picture)
What about software design?
Let’s consider that each method signature of each interface is part of the surface-area of the software, because this is what I have to learn primarily when I join the project. The larger the surface-area, the more time I’ll need to learn, provided I can even remember all of it.
Larger surface is not good for the developers.
On the other hand, the implementation is part of what I would call the volume of the software, i.e. this is where the code is really doing its stuff. The more volume, the more powerful and richer the software. And of course the point of Object Orientation is that you don’t have to learn all the implementation in order to work on the project, as opposed to the interfaces and their method signatures.
Larger volume is good for the users (or the value brought by the software in general)
As a consequence we should try to minimize the surface-area over volume ratio, just like we’re trying to reduce it when designing a green building.
Can we extrapolate that we should design software to be more “compact” and more “sphere”-like?
Facets-like interfaces
Reusing the same interface as much as possible is obviously a way to reduce the surface-area of the software. Adhering to interfaces from the JDK or Google Guava, interfaces that are already well-known, helps even more: in our metaphor, an interface that we don’t have to learn comes for free, like a perfectly insulated wall in a building. We can almost omit it in our ratio.
To further reduce the ratio, we can find out every opportunity to use as much as possible the minimum set of common interfaces, even over unrelated concepts. At the extreme of this approach we get duck typing in dynamic languages. In our usual languages like Java or C# we must introduce additional small interfaces, usually with one single method.
For example in a trading system, every class with an isInCurrency(Currency) method can implement a common interface CurrencySpecific. As a result, a lot of processing (filtering etc.) on stuff that is related to currencies in some way can be done on all these classes without any knowledge about them, except their currency-specificity.
In this example, the currency-specificity we extracted into one interface is like a single facet over a larger volume made of several implementations. It makes our design more compact and easier to learn, while offering a rich set of behaviors.
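To make the facet idea concrete, here is a small sketch. The classes `Bond` and `CashAccount` and the helper `CurrencyFilter` are hypothetical examples, and `Currency` is assumed here to be `java.util.Currency`; only `CurrencySpecific` and `isInCurrency(Currency)` come from the text above.

```java
import java.util.ArrayList;
import java.util.Currency;
import java.util.List;

/** The single facet shared by otherwise unrelated classes. */
interface CurrencySpecific {
    boolean isInCurrency(Currency currency);
}

/** Hypothetical domain class. */
final class Bond implements CurrencySpecific {
    private final Currency currency;

    Bond(Currency currency) {
        this.currency = currency;
    }

    @Override
    public boolean isInCurrency(Currency other) {
        return currency.equals(other);
    }
}

/** Another hypothetical class, unrelated to Bond except for its currency-specificity. */
final class CashAccount implements CurrencySpecific {
    private final Currency currency;

    CashAccount(Currency currency) {
        this.currency = currency;
    }

    @Override
    public boolean isInCurrency(Currency other) {
        return currency.equals(other);
    }
}

/** Generic processing that only knows about the facet. */
final class CurrencyFilter {
    private CurrencyFilter() {
    }

    static <T extends CurrencySpecific> List<T> inCurrency(Iterable<T> items, Currency currency) {
        List<T> selection = new ArrayList<T>();
        for (T item : items) {
            if (item.isInCurrency(currency)) {
                selection.add(item);
            }
        }
        return selection;
    }
}
```

The filter is written once and works for bonds, accounts, or any future class that exposes the same facet.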
The limit for this approach of putting a lot of implementation code under the same interface is that sometimes it really makes no domain sense. Since code is primarily meant to describe the domain without causing confusion, we must be careful not to go too far. We must also take great care when sharing interfaces between bounded contexts: there’s a high risk of excessive coupling.
Faceted artwork (picture from http://reinierdejong.wordpress.com)
Yet another metric?
This metric could be measured by a tool, however the primary value is not in checking the figures, but in the thinking and taking care of making the design easy to learn (less surface-area), while delivering a lot of valuable behaviors (more volume).
The Composite pattern is a very powerful design pattern that you use regularly to manipulate a group of things through the very same interface as a single thing. By doing so you don’t have to discriminate between the singular and plural cases, which often simplifies your design.
Yet there are cases where you are tempted to use the Composite pattern but the interface of your objects does not quite fit. Fear not, some simple refactorings of the method signatures can make your interfaces Composite-friendly, and it’s worth it.
Always start with examples
Imagine an interface for a financial instrument with a getter on its currency:
public interface Instrument {
    Currency getCurrency();
}
This interface is alright for a single instrument, however it does not scale for a group of instruments (Composite pattern), because the corresponding getter in the composite class would look like (notice that return type is now a collection):
public class CompositeInstrument {
    // list of instruments...
    public Set<Currency> getCurrencies() {...}
}
We must admit that each instrument in a composite instrument may have a different currency, hence the composite may be multi-currency, hence the collection return type. This breaks the goal of the Composite pattern which is to unify the interfaces for single and multiple elements. If we stop there, we now have to discriminate between a single Instrument and a CompositeInstrument, and we have to discriminate that on every call site. I’m not happy with that.
The composite pattern applied to a lamp: same plug for one or several lamps
The brutal approach
The brutal approach is to generalize the initial interface so that it works for the composite case:
public interface Instrument {
    Set<Currency> getCurrencies();
}
This interface now works for both the single case and the composite case, but at the cost of always having to deal with a collection as return value. In fact I’m not that sure that we’ve simplified our design with this approach: if the composite case is not used that often, we have even complicated the design for little benefit, because the returned collection type always gets in our way, requiring a loop every time it is called.
The trick to improve that is just to investigate what our interface is really used for. The getter on the initial interface only reveals that we did not think about the actual use before; in other words, it shows a design decision “by default”, or the lack of one.
Turn it into a boolean method
Very often this kind of getter is mostly used to test whether the instrument (single or composite) has something to do with a given currency, for example to check if an instrument is acceptable for a screen in USD or tradable by a trader who is only granted the right to trade in EUR.
In this case, you can revamp the method into another intention-revealing method that accepts a parameter and returns a boolean:
public interface Instrument {
    boolean isInCurrency(Currency currency);
}
This interface remains simple, is closer to our needs, and in addition it now scales for use with a Composite, because the result for a Composite instrument can be derived from each result on each single instrument and the AND operator:
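import java.util.ArrayList;
import java.util.List;

public class CompositeInstrument implements Instrument {
    // list of instruments...
    private final List<Instrument> instruments = new ArrayList<Instrument>();

    public boolean isInCurrency(Currency currency) {
        boolean result = true; // neutral element for AND
        for (Instrument instrument : instruments) {
            result &= instrument.isInCurrency(currency);
        }
        return result;
    }
}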
Something to do with Fold
As shown above the problem is all about the return value. Generalizing on boolean and their boolean logic from the previous example (‘&=’), the overall trick for a Composite-friendly interface is to define methods that return a type that is easy to fold over successive executions. For example the trick is to merge (“fold”) the boolean result of several calls into one single boolean result. You typically do that with AND or OR on boolean types.
If the return type is a collection, then you could perhaps merge the results using addAll(…) if it makes sense for the operation.
Technically, this is easily done when the return type is closed under an operation (magma), i.e. when the result of some operation is of the same type as the operands, just like ‘boolean1 AND boolean2‘ is also a boolean.
This is obviously the case for booleans and their boolean logic, but also for numbers and their arithmetic, collections and their set operations, strings and their concatenation, and many other types including your own classes, as Eric Evans suggests when he recommends favouring “Closure of Operations” in his book Domain-Driven Design.
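The folding idea can be sketched in a few lines: start from a neutral value, then merge each result with an operation that stays within the same type. The `Folds` class and its method names are hypothetical, written only to illustrate the point.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helpers showing how results of several calls fold into one result. */
final class Folds {
    private Folds() {
    }

    /** AND fold: starts from true, the neutral element of AND. */
    static boolean all(Iterable<Boolean> results) {
        boolean folded = true;
        for (boolean result : results) {
            folded &= result;
        }
        return folded;
    }

    /** OR fold: starts from false, the neutral element of OR. */
    static boolean any(Iterable<Boolean> results) {
        boolean folded = false;
        for (boolean result : results) {
            folded |= result;
        }
        return folded;
    }

    /** Union fold: starts from the empty set and merges with addAll. */
    static <T> Set<T> union(Iterable<? extends Set<T>> results) {
        Set<T> folded = new HashSet<T>();
        for (Set<T> result : results) {
            folded.addAll(result);
        }
        return folded;
    }
}
```

In each case the folded value has the same type as every individual result, which is what lets a composite expose the same method signature as a single element.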
Fire hydrants: from one pipe to multiple pipes (composite)
Turn it into a void method
Though not possible in our previous example, void methods work very well with the Composite pattern: with nothing to return, there is no need to unify or fold anything:
public class CompositeFunction {
    List functions = ...;

    public void apply(...) {
        // for each function, function.apply(...);
    }
}
Continuation-passing style
The last trick to help with the Composite pattern is to adopt the continuation passing style by passing a continuation object as a parameter to the method. The method then sets its result into it instead of using its return value.
As an example, to perform search on every node of a tree, you may use a continuation like this:
public class SearchResults {
    private final List<Node> results = new ArrayList<Node>();

    public void addResult(Node node) {
        results.add(node); // append to list of results
    }

    public List<Node> getResults() {
        return results; // return list of results
    }
}

public class Node {
    List<Node> children = ...;

    public void search(SearchResults sr) {
        //...
        if (found) {
            sr.addResult(this);
        }
        // for each child, child.search(sr);
        for (Node child : children) {
            child.search(sr);
        }
    }
}
By passing a continuation as argument to the method, the continuation takes care of the multiplicity, and the method is now well suited for the Composite pattern. You may consider that the continuation indeed encapsulates into one object the process of folding the result of each call, and of course the continuation is mutable.
This style does complicate the interface of the method a little, but also offers the advantage of a single allocation of one instance of the continuation across every call.
That's continuation passing style (CC Some rights reserved by 2011 BUICK REGAL)
One word on exceptions
Methods that can throw exceptions (even unchecked exceptions) can complicate the use in a composite. To deal with exceptions within the loop that calls each child, you can just throw the first exception encountered, at the expense of giving up the loop. An alternative is to collect every caught exception into a Collection, then throw a composite exception around the Collection when you’re done with the loop. In some other cases the composite loop may also be a convenient place to do the actual exception handling, such as full logging, in one central place.
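The "collect, then throw" alternative could look like the sketch below. `CompositeException` and `CompositeTask` are hypothetical names, and the children are modelled as plain `Runnable` objects to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical exception wrapping every failure collected during the composite loop. */
final class CompositeException extends RuntimeException {
    private final List<Exception> causes;

    CompositeException(List<Exception> causes) {
        super(causes.size() + " child call(s) failed");
        this.causes = Collections.unmodifiableList(new ArrayList<Exception>(causes));
    }

    List<Exception> getCauses() {
        return causes;
    }
}

/** Hypothetical composite that calls every child even when some of them fail. */
final class CompositeTask implements Runnable {
    private final List<Runnable> children;

    CompositeTask(List<Runnable> children) {
        this.children = new ArrayList<Runnable>(children);
    }

    @Override
    public void run() {
        List<Exception> failures = new ArrayList<Exception>();
        for (Runnable child : children) {
            try {
                child.run();
            } catch (RuntimeException e) {
                failures.add(e); // remember the failure and keep looping
            }
        }
        if (!failures.isEmpty()) {
            throw new CompositeException(failures);
        }
    }
}
```

The catch block is also the natural place for central logging if that is the preferred handling.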
In closing
We’ve seen some tricks to adjust the signature of your methods so that they work well with the Composite pattern, typically by folding the return type in some way. In return, you don’t have to discriminate manually between the single and the multiple, and one single interface can be used much more often; it is with these kinds of details that you can keep your design simple and ready for any new challenge.
In the first part of this article we introduced predicates, which bring some of the benefits of functional programming to object-oriented languages such as Java, through a simple interface with one single method that returns true or false. In this second and last part, we’ll cover some more advanced notions to get the best out of your predicates.
Testing
One obvious case where predicates shine is testing. Whenever you need to test a method that mixes walking a data structure and some conditional logic, by using predicates you can test each half in isolation, walking the data structure first, then the conditional logic.
In a first step, you simply pass either the always-true or always-false predicate to the method to get rid of the conditional logic and to focus just on the correct walking on the data structure:
// check with the always-true predicate
final Iterable<PurchaseOrder> all = orders.selectOrders(Predicates.<PurchaseOrder> alwaysTrue());
assertEquals(2, Iterables.size(all));
// check with the always-false predicate
assertTrue(Iterables.isEmpty(orders.selectOrders(Predicates.<PurchaseOrder> alwaysFalse())));
In a second step, you just test each possible predicate separately.
final CustomerPredicate isForCustomer1 = new CustomerPredicate(CUSTOMER_1);
assertTrue(isForCustomer1.apply(ORDER_1)); // ORDER_1 is for CUSTOMER_1
assertFalse(isForCustomer1.apply(ORDER_2)); // ORDER_2 is for CUSTOMER_2
This example is simple but you get the idea. To test more complex logic, if testing each half of the feature is not enough you may create mock predicates, for example a predicate that returns true once, then always false later on. Forcing the predicate like that may considerably simplify your test set-up, thanks to the strict separation of concerns.
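A mock predicate of that kind might be sketched as follows. The local `Predicate` interface is a stand-in that mirrors the single-method shape of Guava's interface so that the example compiles on its own, and `TrueOncePredicate` is a hypothetical name.

```java
/** Stand-in for Guava's Predicate, declared locally to keep the sketch self-contained. */
interface Predicate<T> {
    boolean apply(T input);
}

/** Hypothetical mock predicate: accepts the first element it sees, then rejects everything. */
final class TrueOncePredicate<T> implements Predicate<T> {
    private boolean alreadyFired;

    @Override
    public boolean apply(T input) {
        if (alreadyFired) {
            return false;
        }
        alreadyFired = true;
        return true;
    }
}
```

Because it carries internal state, such a mock is meant for a single test and should not be shared.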
Predicates work so well for testing that if you tend to do some TDD, I mean if the way you can test influences the way you design, then as soon as you know predicates they will surely find their way into your design.
Explaining to the team
In the projects I’ve worked on, the team was not familiar with predicates at first. However this concept is easy and fun enough for everyone to get it quickly. In fact I’ve been surprised by how the idea of predicates spread naturally from the code I had written to the code of my colleagues, without much evangelism from me. I guess that the benefits of predicates speak for themselves. Having mature APIs from big names like Apache or Google also helps convince people that it is serious stuff. And now with the functional programming hype, it should be even easier to sell!
Simple optimizations
This engine is so big, no optimization is required (Chicago Auto Show).
The usual optimizations are to make predicates immutable and stateless as much as possible to enable their sharing with no consideration of threading. This enables using one single instance for the whole process (as a singleton, e.g. as static final constants). Most frequently used predicates that cannot be enumerated at compilation time may be cached at runtime if required. As usual, do it only if your profiler report really calls for it.
When possible a predicate object can pre-compute some of the calculations involved in its evaluation in its constructor (naturally thread-safe) or lazily.
A predicate is expected to be side-effect-free, in other words “read-only”: its execution should not cause any observable change to the system state. Some predicates must have some internal state, like a counter-based predicate used for paging, but they still must not change any state in the system they apply on. With internal state, they also cannot be shared, however they may be reused within their thread if they support reset between each successive use.
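As an illustration, a counter-based paging predicate with a reset could be sketched like this. The `PagePredicate` name and its zero-based page numbering are assumptions, and the local `Predicate` interface only mirrors the shape of Guava's interface to keep the example self-contained.

```java
/** Stand-in for Guava's Predicate, declared locally to keep the sketch self-contained. */
interface Predicate<T> {
    boolean apply(T input);
}

/** Hypothetical paging predicate: accepts only the elements of one page (pages start at 0). */
final class PagePredicate<T> implements Predicate<T> {
    private final int firstIndex;
    private final int pageSize;
    private int seen; // internal state only: nothing in the evaluated system is modified

    PagePredicate(int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("Invalid page: " + pageNumber + "/" + pageSize);
        }
        this.firstIndex = pageNumber * pageSize;
        this.pageSize = pageSize;
    }

    @Override
    public boolean apply(T input) {
        int index = seen++;
        return index >= firstIndex && index < firstIndex + pageSize;
    }

    /** Allows reuse within the same thread between two successive traversals. */
    void reset() {
        seen = 0;
    }
}
```

The counter makes the instance unsafe to share across threads, which is exactly the trade-off described above.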
Fine-grained interfaces: a larger audience for your predicates
In large applications you find yourself writing very similar predicates for types totally different but that share a common property like being related to a Customer. For example in the administration page, you may want to filter logs by customer; in the CRM page you want to filter complaints by customer.
For each such type X you’d need yet another CustomerXPredicate to filter it by customer. But since each X is related to a customer in some way, we can factor that out (Extract Interface in Eclipse) into an interface CustomerSpecific with one method:
public interface CustomerSpecific {
    Customer getCustomer();
}
This fine-grained interface reminds me of traits in some languages, except it has no reusable implementation. It could also be seen as a way to introduce a touch of dynamic typing within statically typed languages, as it enables calling indifferently any object with a getCustomer() method. Of course our class PurchaseOrder now implements this interface.
Once we have this interface CustomerSpecific, we can define predicates on it rather than on each particular type as we did before. This helps leverage just a few predicates throughout a large project. In this case, the predicate CustomerPredicate is co-located with the interface CustomerSpecific it operates on, and it has a generic type CustomerSpecific:
public final class CustomerPredicate implements Predicate<CustomerSpecific>, CustomerSpecific {
    private final Customer customer;
    // valued constructor omitted for clarity

    public Customer getCustomer() {
        return customer;
    }

    public boolean apply(CustomerSpecific specific) {
        return specific.getCustomer().equals(customer);
    }
}
Notice that the predicate can itself implement the interface CustomerSpecific, hence could even evaluate itself!
When using trait-like interfaces like that, you must take care of the generics and slightly change the method that expects a Predicate<PurchaseOrder> in the class PurchaseOrders, so that it also accepts any predicate on a supertype of PurchaseOrder:
public Iterable<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
    return Iterables.filter(orders, condition);
}
Specification in Domain-Driven Design
Eric Evans and Martin Fowler wrote together the pattern Specification, which is clearly a predicate. Actually the word “predicate” is the word used in logic programming, and the pattern Specification was written to explain how we can borrow some of the power of logic programming into our object-oriented languages.
In the book Domain-Driven Design, Eric Evans details this pattern and gives several examples of Specifications which all express parts of the domain. Just like this book describes a Policy pattern that is nothing but the Strategy pattern when applied to the domain, in some sense the Specification pattern may be considered a version of predicate dedicated to the domain aspects, with the additional intent to clearly mark and identify the business rules.
As a remark, the method name suggested in the Specification pattern is: isSatisfiedBy(T): boolean, which emphasises a focus on the domain constraints. As we’ve seen before with predicates, atoms of business logic encapsulated into Specification objects can be recombined using boolean logic (or, and, not, any, all), as in the Interpreter pattern.
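A minimal sketch of such recombinable Specifications is shown below. Only the method name `isSatisfiedBy` comes from the pattern; the combinators written as Java 8 default methods, the `Order` class and the `OrderSpecifications` holder are assumptions made for illustration.

```java
/** Specification with boolean combinators (a sketch using Java 8 default methods). */
interface Specification<T> {
    boolean isSatisfiedBy(T candidate);

    default Specification<T> and(Specification<? super T> other) {
        return candidate -> isSatisfiedBy(candidate) && other.isSatisfiedBy(candidate);
    }

    default Specification<T> or(Specification<? super T> other) {
        return candidate -> isSatisfiedBy(candidate) || other.isSatisfiedBy(candidate);
    }

    default Specification<T> not() {
        return candidate -> !isSatisfiedBy(candidate);
    }
}

/** Hypothetical domain object. */
final class Order {
    final String state;
    final int amount;

    Order(String state, int amount) {
        this.state = state;
        this.amount = amount;
    }
}

/** Hypothetical atoms of business logic, each one a named Specification. */
final class OrderSpecifications {
    static final Specification<Order> PENDING = order -> "PENDING".equals(order.state);

    private OrderSpecifications() {
    }

    static Specification<Order> amountAbove(int threshold) {
        return order -> order.amount > threshold;
    }
}
```

A rule such as "pending orders that are not large" then reads `OrderSpecifications.PENDING.and(OrderSpecifications.amountAbove(1000).not())`.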
The book also describes some more advanced techniques such as optimization when querying a database or a repository, and subsumption.
Optimisations when querying
The following are optimization tricks, and I’m not sure you will ever need them. But it is true that predicates are quite dumb when it comes to filtering datasets: they must be evaluated on every single element in a set, which may cause performance problems for huge sets. If the elements are stored in a database, retrieving every one of them just to filter them one after another through the predicate does not sound like a good idea for large sets…
When you hit performance issues, you start the profiler and find the bottlenecks. Now if calling a predicate very often to filter elements out of a data structure is a bottleneck, then how do you fix that?
One way is to get rid of the full predicate thing, and to go back to hard-coded, more error-prone, repetitive and less-testable code. I always resist this approach as long as I can find better alternatives to optimize the predicates, and there are many.
First, have a deeper look at how the code is being used. In the spirit of Domain-Driven Design, looking at the domain for insights should be systematic whenever a question occurs.
Very often there are clear patterns of use in a system. Though statistical, they offer great opportunities for optimisation. For example in our PurchaseOrders class, retrieving every PENDING order may be used much more frequently than every other case, because that’s how it makes sense from a business perspective, in our imaginary example.
Friend Complicity
Weird complicity (Maeght foundation)
Based on the usage pattern you may code alternate implementations that are specifically optimised for it. In our example of pending orders being frequently queried, we would code an alternate implementation FastPurchaseOrder, that makes use of some pre-computed data structure to keep the pending orders ready for quick access.
Now, in order to benefit from this alternate implementation, you may be tempted to change its interface to add a dedicated method, e.g. selectPendingOrders(). Remember that before you only had a generic selectOrders(Predicate) method. Adding the extra method may be alright in some cases, but may raise several concerns: you must implement this extra method in every other implementation too, and the extra method may be too specific for a particular use-case hence may not fit well on the interface.
A trick for using the internal optimization through the exact same method that only expects predicates is just to make the implementation recognize the predicate it is related to. I call that “Friend Complicity“, in reference to the friend keyword in C++.
/** Optimization method: pre-computed list of pending orders */
private Iterable<PurchaseOrder> selectPendingOrders() {
    // ... optimized stuff...
}

public Iterable<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
    // internal complicity here: recognize friend class to enable optimization
    if (condition instanceof PendingOrderPredicate) {
        return selectPendingOrders(); // faster way
    }
    // otherwise, back to the usual case
    return Iterables.filter(orders, condition);
}
It’s clear that it increases the coupling between two implementation classes that should otherwise ignore each other. Also it only helps with performance if given the “friend” predicate directly, with no decorator or composite around.
What’s really important with Friend Complicity is to make sure that the behaviour of the method is never compromised: the contract of the interface must be met at all times, with or without the optimisation. It’s just that the performance improvement may happen, or not. Also keep in mind that you may want to switch back to the unoptimized implementation one day.
SQL-compromised
If the orders are actually stored in a database, then SQL can be used to query them quickly. By the way, you’ve probably noticed that the very concept of predicate is exactly what you put after the WHERE clause in a SQL query.
Ron Arad designed a chair that encompasses another chair: this is subsumption
A first and simple way to still use predicates yet improve performance is for some predicates to implement an additional interface SqlAware, with a method asSQL(): String that returns the exact SQL query corresponding to the evaluation of the predicate itself. When the predicate is used against a database-backed repository, the repository would call this method instead of the predicate’s usual evaluate(Object) or apply(T) method, and would then query the database with the returned query.
I call that approach SQL-compromised as the predicate is now polluted with database-specific details it should ignore more often than not.
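The dispatch between the two paths might look like the sketch below. `SqlAware` and `asSQL()` come from the description above; the `purchase_order` table, its `state` column, the `OrderRepository` class and its `queryFor` method are hypothetical, and the sketch only returns the query text rather than running it against a real database.

```java
/** Stand-in for Guava's Predicate, declared locally to keep the sketch self-contained. */
interface Predicate<T> {
    boolean apply(T input);
}

/** Optional extra facet: the predicate knows its SQL equivalent. */
interface SqlAware {
    String asSQL();
}

/** Hypothetical minimal domain class. */
final class PurchaseOrder {
    private final String state;

    PurchaseOrder(String state) {
        this.state = state;
    }

    String getState() {
        return state;
    }
}

/** The in-memory logic and the SQL text must express the same condition. */
final class PendingOrderPredicate implements Predicate<PurchaseOrder>, SqlAware {
    @Override
    public boolean apply(PurchaseOrder order) {
        return "PENDING".equals(order.getState());
    }

    @Override
    public String asSQL() {
        // Table and column names are assumptions made for this sketch
        return "SELECT * FROM purchase_order WHERE state = 'PENDING'";
    }
}

/** Hypothetical database-backed repository: picks the query to run for a condition. */
final class OrderRepository {
    String queryFor(Predicate<? super PurchaseOrder> condition) {
        if (condition instanceof SqlAware) {
            return ((SqlAware) condition).asSQL(); // let the database do the filtering
        }
        // Fallback: load everything, then filter in memory with condition.apply(...)
        return "SELECT * FROM purchase_order";
    }
}
```

Keeping `apply` alongside `asSQL` means the same predicate still works unchanged against an in-memory collection.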
Alternatives to using SQL directly include the use of stored procedures or named queries: the predicate has to provide the name of the query and all its parameters. Double-dispatch between the repository and the predicate passed to it is also an alternative: the repository calls the predicate on its additional method selectElements(this) that itself calls back the right pre-selection method findByState(state): Collection on the repository; the predicate then applies its own filtering on the returned set and returns the final filtered set.
Subsumption
Subsumption is a logic concept to express a relation of one concept that encompasses another, such as “red, green, and yellow are subsumed under the term color” (Merriam-Webster). Subsumption between predicates can be a very powerful concept to implement in your code.
Let’s take the example of an application that broadcasts stock quotes. When registering we must declare which quotes we are interested in observing. We can do that by simply passing a predicate on stocks that only evaluates true for the stocks we’re interested in:
public final class StockPredicate implements Predicate<String> {
    private final Set<String> tickers;
    // Constructors omitted for clarity

    public boolean apply(String ticker) {
        return tickers.contains(ticker);
    }
}
Now we assume that the application already broadcasts standard sets of popular tickers on messaging topics, and each topic has its own predicates; if it could detect that the predicate we want to use is “included”, or subsumed in one of the standard predicates, we could just subscribe to it and save computation. In our case this subsumption is rather easy, by just adding an additional method on our predicates:
public boolean encompasses(StockPredicate predicate) {
    return tickers.containsAll(predicate.tickers);
}
Subsumption is all about evaluating another predicate for "containment". This is easy when your predicates are based on sets, as in the example, or when they are based on intervals of numbers or dates. Otherwise you may have to resort to tricks similar to Friend Complicity, i.e. recognizing the other predicate to decide if it is subsumed or not, in a case-by-case fashion.
Overall, remember that subsumption is hard to implement in the general case, but even partial subsumption can be very valuable, so it is an important tool in your toolbox.
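For interval-based predicates, subsumption boils down to comparing bounds. The sketch below assumes a hypothetical `PriceRangePredicate` with inclusive integer bounds; the local `Predicate` interface only mirrors the shape of Guava's interface to keep the example self-contained.

```java
/** Stand-in for Guava's Predicate, declared locally to keep the sketch self-contained. */
interface Predicate<T> {
    boolean apply(T input);
}

/** Hypothetical interval-based predicate with inclusive bounds. */
final class PriceRangePredicate implements Predicate<Integer> {
    private final int min;
    private final int max;

    PriceRangePredicate(int min, int max) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        this.min = min;
        this.max = max;
    }

    @Override
    public boolean apply(Integer price) {
        return price >= min && price <= max;
    }

    /** True when every price accepted by the other predicate is also accepted by this one. */
    boolean encompasses(PriceRangePredicate other) {
        return min <= other.min && other.max <= max;
    }
}
```

Two ranges that merely overlap do not subsume each other, so `encompasses` returns false in both directions for them.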
Conclusion
Predicates are fun, and can enhance both your code and the way you think about it!
You keep hearing that functional programming is going to take over the world, and you are still stuck with plain Java? Fear not, since you can already add a touch of functional style to your daily Java. In addition, it’s fun, saves you many lines of code and leads to fewer bugs.
What is a predicate?
I actually fell in love with predicates when I first discovered Apache Commons Collections, long ago when I was coding in Java 1.4. A predicate in this API is nothing but a Java interface with only one method:
evaluate(Object object): boolean
That’s it, it just takes some object and returns true or false. A more recent equivalent of Apache Commons Collections is Google Guava, with an Apache License 2.0. It defines a Predicate interface with one single method using a generic parameter:
apply(T input): boolean
It is that simple. To use predicates in your application you just have to implement this interface with your own logic in its single method apply(something).
A simple example
As an early example, imagine you have a list orders of PurchaseOrder objects, each with a date, a Customer and a state. The various use-cases will probably require that you find every order for a given customer, or every pending, shipped or delivered order, or every order placed within the last hour. Of course you can do that with foreach loops and an if inside, in this fashion:
//List<PurchaseOrder> orders...
public List<PurchaseOrder> listOrdersByCustomer(Customer customer) {
    final List<PurchaseOrder> selection = new ArrayList<PurchaseOrder>();
    for (PurchaseOrder order : orders) {
        if (order.getCustomer().equals(customer)) {
            selection.add(order);
        }
    }
    return selection;
}
And again for each case:
public List<PurchaseOrder> listRecentOrders(Date fromDate) {
    final List<PurchaseOrder> selection = new ArrayList<PurchaseOrder>();
    for (PurchaseOrder order : orders) {
        if (order.getDate().after(fromDate)) {
            selection.add(order);
        }
    }
    return selection;
}
The repetition is quite obvious: each method is the same except for the condition inside the if clause. The idea of using predicates is simply to replace the hard-coded condition inside the if clause with a call to a predicate, which then becomes a parameter. This means you can write only one method, taking a predicate as a parameter, and still cover all your use-cases, even those you do not know about yet:
public List<PurchaseOrder> listOrders(Predicate<PurchaseOrder> condition) {
    final List<PurchaseOrder> selection = new ArrayList<PurchaseOrder>();
    for (PurchaseOrder order : orders) {
        if (condition.apply(order)) {
            selection.add(order);
        }
    }
    return selection;
}
Each particular predicate can be defined as a standalone class, if used at several places, or as an anonymous class:
final Customer customer = new Customer("BruceWaineCorp");
final Predicate<PurchaseOrder> condition = new Predicate<PurchaseOrder>() {
    public boolean apply(PurchaseOrder order) {
        return order.getCustomer().equals(customer);
    }
};
Your friends that use real functional programming languages (Scala, Clojure, Haskell etc.) will comment that the code above is awfully verbose to do something very common, and I have to agree. However we are used to that verbosity in the Java syntax and we have powerful tools (auto-completion, refactoring) to accommodate it. And our projects probably cannot switch to another syntax overnight anyway.
Predicates are collections’ best friends
Coming back to our example, we wrote a foreach loop only once to cover every use-case, and we were happy with that factoring out. However your friends doing functional programming “for real” can still laugh at this loop you had to write yourself. Luckily, the APIs from both Apache and Google also provide all the goodies you may expect, in particular a class similar to java.util.Collections, hence named Collections2 (not a very original name).
This class provides a method filter() that does something similar to what we had written before, so we can now rewrite our method with no loop at all:
public Collection<PurchaseOrder> selectOrders(Predicate<PurchaseOrder> condition) {
    return Collections2.filter(orders, condition);
}
In fact, this method returns a filtered view:
The returned collection is a live view of unfiltered (the input collection); changes to one affect the other.
This also means that less memory is used, since there is no actual copy from the initial collection unfiltered to the actual returned collection filtered.
In a similar fashion, given an iterator, you can ask for a filtered iterator on top of it (Decorator pattern) that only gives you the elements selected by your predicate; in Guava this is what Iterators.filter() does.
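To make the Decorator idea concrete, here is a hand-rolled sketch of such a filtered iterator. It assumes a minimal Predicate interface of our own as a stand-in for the library one, and FilteringIterator is a hypothetical name; in practice you would rather rely on what the library already provides.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Minimal stand-in for a library Predicate interface (assumption).
interface Predicate<T> {
    boolean apply(T input);
}

// Decorator over an iterator that only yields the elements accepted by the predicate.
final class FilteringIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private final Predicate<? super T> condition;
    private T lookahead;          // next accepted element, valid only if hasLookahead
    private boolean hasLookahead; // whether an accepted element is waiting

    FilteringIterator(Iterator<T> delegate, Predicate<? super T> condition) {
        this.delegate = delegate;
        this.condition = condition;
        advance();
    }

    // Moves to the next element of the delegate that satisfies the predicate, if any.
    private void advance() {
        hasLookahead = false;
        while (delegate.hasNext()) {
            T candidate = delegate.next();
            if (condition.apply(candidate)) {
                lookahead = candidate;
                hasLookahead = true;
                return;
            }
        }
    }

    public boolean hasNext() {
        return hasLookahead;
    }

    public T next() {
        if (!hasLookahead) {
            throw new NoSuchElementException();
        }
        T result = lookahead;
        advance();
        return result;
    }

    public void remove() {
        throw new UnsupportedOperationException("read-only view");
    }
}
```

No element is copied: the decorator simply skips over the elements that the predicate rejects while the underlying iterator is being consumed.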
Since Java 5 the Iterable interface comes in very handy for use with the foreach loop, so we would indeed prefer to use the following expression:
public Iterable<PurchaseOrder> selectOrders(Predicate<PurchaseOrder> condition) {
    return Iterables.filter(orders, condition);
}
// you can directly use it in a foreach loop, and it reads well:
for (PurchaseOrder order : orders.selectOrders(condition)) {
    //...
}
Ready-made predicates
To use predicates, you could simply define your own interface Predicate, or one for each type parameter you need in your application. This is possible, however the good thing in using a standard Predicate interface from an API such as Guava or Commons Collections is that the API brings plenty of excellent building blocks to combine with your own predicate implementations.
First you may not even have to implement your own predicate at all. If all you need is a condition on whether an object is equal to another, or is not-null, then you can simply ask for the predicate:
// gives you a predicate that checks if an integer is zero
Predicate<Integer> isZero = Predicates.equalTo(0);
// gives a predicate that checks for non null objects
Predicate<String> isNotNull = Predicates.notNull();
// gives a predicate that checks for objects that are instanceof the given Class
Predicate<Object> isString = Predicates.instanceOf(String.class);
Given a predicate, you can invert it (true becomes false and the other way round):
Predicates.not(predicate);
Combine several predicates using boolean operators AND, OR:
Predicates.and(predicate1, predicate2);
Predicates.or(predicate1, predicate2);
// gives you a predicate that checks for either zero or null
Predicate<Integer> isNullOrZero = Predicates.or(isZero, Predicates.isNull());
Of course you also have the special predicates that always return true or false (Predicates.alwaysTrue() and Predicates.alwaysFalse() in Guava), which are really, really useful, as we’ll see later for testing.
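To give an idea of what such constant predicates look like and why they are handy, here is a small sketch that does not depend on any library. The Predicate interface is a minimal stand-in and SimplePredicates is a hypothetical helper class; the point is that "always true" is the natural starting value when combining any number of conditions with AND.

```java
import java.util.List;

// Minimal stand-in for a library Predicate interface (assumption).
interface Predicate<T> {
    boolean apply(T input);
}

// Hypothetical helper class, for illustration only.
final class SimplePredicates {
    private SimplePredicates() {
    }

    static <T> Predicate<T> alwaysTrue() {
        return new Predicate<T>() {
            public boolean apply(T input) {
                return true;
            }
        };
    }

    static <T> Predicate<T> alwaysFalse() {
        return new Predicate<T>() {
            public boolean apply(T input) {
                return false;
            }
        };
    }

    static <T> Predicate<T> and(final Predicate<T> first, final Predicate<T> second) {
        return new Predicate<T>() {
            public boolean apply(T input) {
                return first.apply(input) && second.apply(input);
            }
        };
    }

    // Combines all conditions with AND; with no condition at all, everything is accepted.
    static <T> Predicate<T> allOf(List<Predicate<T>> conditions) {
        Predicate<T> result = alwaysTrue();
        for (Predicate<T> condition : conditions) {
            result = and(result, condition);
        }
        return result;
    }
}
```

In a test, passing an always-true or always-false predicate is also a cheap way to exercise the code that takes a predicate without caring about the actual condition.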
At first I often wrote anonymous predicates; however, they usually ended up being reused, so they were often promoted to actual classes, nested or not.
Because predicates manipulate objects of a certain type, I like to co-locate them with the type they take as a parameter. For example, the classes CustomerOrderPredicate, PendingOrderPredicate and RecentOrderPredicate should reside in the same package as the class PurchaseOrder that they evaluate, or in a sub-package if you have a lot of them. Another option would be to define them nested within the type itself. Obviously, the predicates are quite coupled to the objects they operate on.
In the next part, we’ll have a look at how predicates simplify testing, how they relate to Specifications in Domain-Driven Design, and some additional stuff to get the best out of your predicates.
As suggested by its name, Domain-Driven Design is not only about Event Sourcing and CQRS. It all starts with the domain and a lot of key insights that are too easy to overlook at first. Even if you’ve read the “blue book” already, I suggest you read it again as the book is at the same time wide and deep.
You got talent
A new natural language that makes heavy use of your thumbs
Behind the basics of Domain-Driven Design, one important idea is to harness the huge talent we all have: the ability to speak, and this talent of natural language can help us reason about the considered domain.
Just like multi-touch and tangible interfaces aim at reusing our natural strength in using our fingers, Eric Evans suggests that we use our language ability as an actual tool to try out modelling concepts out loud, and to check whether they pass the simple test of being useful in sentences about the domain.
This is a simple idea, yet powerful. No need for any extra framework or tool: one of the most powerful tools we can imagine is already there, wired in our brain. The trick is to find a middle way between natural language in all its fuzziness, and an expressive model that we can discuss without ambiguity, and this is exactly what the Ubiquitous Language addresses.
One model to rule them all
Another key insight in Domain-Driven Design is to identify (that is, equate) the implementation model with the analysis model, so that there is only one model across every aspect of the software process, from requirements and analysis to code.
This does not mean you must have only one domain model in your application; in fact you will probably get more than one model across the various areas* of the application. But this means that in each area there must be only one model shared by developers and domain experts. This is clearly opposed to some early methodologies that advocated distinct analysis modelling followed by separate, more detailed implementation modelling. This also leads naturally to the Ubiquitous Language, a common language between domain experts and the technical team.
The key driver is that the knowledge gained through analysis can be directly used in the implementation, with no gap, mismatch or translation. This assumes of course that the underlying programming language is modelling-oriented, which object oriented languages obviously are.
What form for the model?
Text is supplemented by pictures
Shall the model be expressed in UML? Eric Evans is again quite pragmatic: nothing beats natural language to express the two essential aspects of a model: the meaning of its concepts, and their behaviour. Text, in English or any other spoken language, is therefore the best choice to express a model, while diagrams, standard or not, and even pictures, can supplement it to express a particular structure or perspective.
If you try to express the entirety of the model using UML, then you’re just using UML as a programming language. Using only a programming language such as Java to represent a model exhibits by the way the same shortcoming: it is hard to get the big picture and to grasp the large scale structure. Simple text documents along with some diagrams and pictures, if really used and therefore kept up-to-date, help explain what’s important about the model, otherwise expressed in code.
A final remark
The beauty of Domain-Driven Design is that it is not just a set of independent good ideas on why and how to build domain models; it is itself a complete system of inter-related ideas, each useful on its own, but which also complement each other. For example, the idea of using natural language as a modelling tool and the idea of sharing one and the same model for analysis and implementation both lead to the Ubiquitous Language.
* Areas would typically be different Bounded Contexts
If you happen to create your own annotations, for instance to use with Java 6 Pluggable Annotation Processors, here are some patterns that I collected over time. Nothing new, nothing fancy, just putting everything into one place, with some proposed names.
Local-name annotation
Have your tools accept any annotation as long as its simple name (without the fully-qualified prefix) is the expected one. For example com.acme.NotNull and net.companyname.NotNull would be considered the same. This enables you to use your own annotations rather than the ones packaged with the tools, in order not to depend on them.
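As a rough sketch of how a tool could do this, here is a runtime version based on reflection (an annotation processor would do the equivalent on the compile-time model instead). LocalNames, Customer and this particular NotNull are hypothetical names for illustration.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

final class LocalNames {
    private LocalNames() {
    }

    // Names of the fields carrying any annotation with the given simple name, whatever its package.
    static List<String> fieldsAnnotatedWith(Class<?> type, String simpleName) {
        List<String> names = new ArrayList<String>();
        for (Field field : type.getDeclaredFields()) {
            for (Annotation annotation : field.getAnnotations()) {
                if (annotation.annotationType().getSimpleName().equals(simpleName)) {
                    names.add(field.getName());
                    break;
                }
            }
        }
        return names;
    }
}

// Our own NotNull, declared in our own code rather than imported from a tool.
@Retention(RetentionPolicy.RUNTIME)
@interface NotNull {
}

final class Customer {
    @NotNull
    String name;

    String nickname;
}
```

Here LocalNames.fieldsAnnotatedWith(Customer.class, "NotNull") only looks at the simple name, so the tool never needs to import the annotation type itself.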
Annotations can have annotations as values. This allows for some complex and tree-like configurations, such as mappings from one format to another (from/to XML, JSON, RDBMS).
Java does not allow the same annotation to be used several times on a given target.
To work around that limitation, you can create a special annotation that expects a collection of values of the desired annotation type. For example, if you’d like to apply the annotation @Advantage several times, you create the Multiplicity Wrapper annotation: @Advantages(advantages = {@Advantage}).
Typically the multiplicity wrapper is named after the plural form of its enclosed elements.
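Spelled out, the @Advantage example could look like the following sketch. The value attribute, the Product class and the AdvantageReader helper are assumptions added to make the example complete.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

@Retention(RetentionPolicy.RUNTIME)
@interface Advantage {
    String value();
}

// Multiplicity wrapper: named after the plural form of the annotation it encloses.
@Retention(RetentionPolicy.RUNTIME)
@interface Advantages {
    Advantage[] advantages();
}

@Advantages(advantages = { @Advantage("fast"), @Advantage("cheap") })
final class Product {
}

final class AdvantageReader {
    private AdvantageReader() {
    }

    // Unwraps the enclosed annotations; returns an empty array when the wrapper is absent.
    static String[] advantagesOf(Class<?> type) {
        Advantages wrapper = type.getAnnotation(Advantages.class);
        if (wrapper == null) {
            return new String[0];
        }
        Advantage[] all = wrapper.advantages();
        String[] values = new String[all.length];
        for (int i = 0; i < all.length; i++) {
            values[i] = all[i].value();
        }
        return values;
    }
}
```

Note that since Java 8 the language supports repeating annotations through @Repeatable, which relies on this very same idea of a container annotation.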
It is not possible in Java for annotations to derive from each other. To work around that, the idea is simply to annotate your new annotation with the “super” annotation, which becomes a meta-annotation.
Whenever you use your own annotation with a meta-annotation, the tools will actually consider it as if it were the meta-annotation.
This kind of meta-inheritance helps centralize the coupling to the external annotation in one place, while making the semantics of your own annotation more precise and meaningful.
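On the tool side, supporting this kind of meta-inheritance mostly means looking one level up. A minimal sketch using reflection, where Managed, PricingService, DiscountCalculator and MetaAnnotations are all hypothetical names:

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// The "super" annotation that the tool already understands.
@Retention(RetentionPolicy.RUNTIME)
@Target({ ElementType.TYPE, ElementType.ANNOTATION_TYPE })
@interface Managed {
}

// Our own, more meaningful annotation, "inheriting" from Managed by being annotated with it.
@Managed
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface PricingService {
}

@PricingService
final class DiscountCalculator {
}

final class PlainClass {
}

final class MetaAnnotations {
    private MetaAnnotations() {
    }

    // True when the type carries the annotation directly, or through one level of meta-annotation.
    static boolean isAnnotatedWith(Class<?> type, Class<? extends Annotation> expected) {
        if (type.isAnnotationPresent(expected)) {
            return true;
        }
        for (Annotation annotation : type.getAnnotations()) {
            if (annotation.annotationType().isAnnotationPresent(expected)) {
                return true;
            }
        }
        return false;
    }
}
```

This sketch only looks one level up; a more complete tool would probably walk the meta-annotations recursively.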
Here is an example with Spring, using the annotation @Component (this also works with the annotation @Qualifier):
Create your own custom stereotype annotation that is itself annotated with @Component:
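@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME) // kept at runtime so that the framework can read it
@Component
public @interface MyComponent {
    String value() default "";
}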
@BindingAnnotation
@Target({ FIELD, PARAMETER, METHOD })
@Retention(RUNTIME)
public @interface PayPal {}
// Then use it
public class RealBillingService implements BillingService {
    @Inject
    public RealBillingService(@PayPal CreditCardProcessor processor,
            TransactionLog transactionLog) {
        ...
    }
}
Refactoring-proof values
Prefer values that are robust to refactorings rather than String literals. MyClass.class is better than “com.acme.MyClass”, and enums are also encouraged.
Convention over Configuration and Sensible Defaults are two existing patterns that make a lot of sense with respect to using annotations as part of a configuration strategy. Having no need to annotate is way better than having to annotate for little value.
Annotations are by nature embedded in the code, hence they are not well-suited for every case of configuration, in particular when it comes to deployment-specific configuration. The solution is of course to mix annotations with other mechanisms and use each of them where it is most appropriate.
The following approach, based on a precedence rule where each mechanism overrides the previous one, appears to work well:
Default value < Annotation < XML < programmatic configuration
For example, the default values could be suited for unit testing, while the annotations define all the stable configuration, leaving the other options to configure deployments at the various stages, like production or QA environments.
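The precedence rule itself is simple to express in code. The sketch below assumes a hypothetical LayeredConfiguration class where each source has already been read into a plain map; how the annotation and XML layers actually get populated is left out.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

// Hypothetical configuration holder, for illustration only.
final class LayeredConfiguration {
    // Sources declared from lowest to highest precedence.
    enum Source { DEFAULT, ANNOTATION, XML, PROGRAMMATIC }

    private final Map<Source, Map<String, String>> layers =
            new EnumMap<Source, Map<String, String>>(Source.class);

    LayeredConfiguration() {
        for (Source source : Source.values()) {
            layers.put(source, new HashMap<String, String>());
        }
    }

    void set(Source source, String key, String value) {
        layers.get(source).put(key, value);
    }

    // Returns the value from the highest-precedence source that defines the key, or null if none does.
    String resolve(String key) {
        Source[] sources = Source.values();
        for (int i = sources.length - 1; i >= 0; i--) {
            String value = layers.get(sources[i]).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

Relying on the declaration order of the enum keeps the rule in one place: changing the precedence is a matter of reordering the constants.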
This principle is common (Spring, Java EE 6 among others), for example in JPA:
This post is mostly a notepad of various patterns on how to use annotations, for instance when creating tools that process annotations, such as the Annotation Processing Tool (apt) in Java 5 and the Pluggable Annotation Processors in Java 6.
Don’t hesitate to contribute better pattern names, additional patterns and other examples of use.
EDIT: A related previous post, with a focus on how annotations can lead to coupling hence dependencies.
Pictures: Creative Commons from Flickr, by ninaksimon and Iwan Gabovitch.
Domain-Driven Design encourages us to analyse the domain deeply in a process called Supple Design. In his book (the blue book) and in his talks Eric Evans gives some examples of this process, and in this blog I suggest some sources of inspiration and some recommendations drawn from my practice in order to help with this process.
When a common formalism fits the domain well, you can factor it out and adapt its rules to the domain.
A known formalism can be reused as a ready-made, well understood model.
Obvious sources of inspiration
Analysis patterns
As is quite obvious from the book, DDD clearly builds on top of Martin Fowler’s analysis patterns. The patterns Knowledge Level (aka Meta-Model) and Specification (a Strategy used as a predicate) are from Fowler, and Eric Evans mentions using and drawing insight from analysis patterns many times in the book.
Reading analysis patterns helps to appreciate good design; when you’ve read enough analysis patterns, you don’t even have to remember them to be able to improve your modelling skills. In my own experience, I have learnt to look for specific design qualities such as explicitness and traceability in my design as a result of getting used to analysis patterns such as Phenomenon or Observation.
Design patterns
Design patterns are another source of inspiration, but usually less relevant to domain modelling. Evans mentions the Strategy pattern, also named Policy (I rather like using an alternative name to make it clear that we are talking about the domain, not about a technical concern), and the pattern Composite. Evans suggests considering other patterns, not just the GoF patterns, and seeing whether they make sense at the domain level.
Programming paradigms
Eric Evans also mentions that sometimes the domain is naturally well-suited for particular approaches (or paradigms) such as state machines, predicate logic and rules engines. Now the DDD community has already expanded to include event-driven as a favourite paradigm, with the Event-Sourcing and CQRS approaches in particular.
On paradigms, my design style has also been strongly influenced by elements of functional programming, which I originally learnt from using Apache Commons Collections, together with an increasingly pronounced taste for immutability.
Maths
It is in fact the core job of mathematicians to factor out formal models of everything we experience in the world. As a result it is no surprise we can draw on their work to build deeper models.
Graph theory
The great benefit of any mathematical model is that it is highly formal, ready with plenty of useful theorems that depend on the set of axioms you can assume. In short, all the body of maths is just work already done for you, ready for you to reuse. To start with a well-known example, used extensively by Eric Evans, let’s consider a bit of graph theory.
If you recognize that your problem is similar (mathematicians would say isomorphic or something like that) to a graph, then you can jump into graph theory and reuse plenty of exciting results, such as how to compute a shortest path using Dijkstra’s or the A* algorithm. Going further, the more you know or read about your theory, the more you can reuse: in other words the lazier you can be!
In his classic example of modelling cargo shipping using Legs or using Stops, Eric Evans could also refer to the concept of Line Graph (aka edge-to-vertex dual), which comes with interesting results such as how to convert a graph into its edge-to-vertex dual.
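As an illustration of that conversion, here is a small sketch for a directed graph of stops connected by legs: in the dual, each leg becomes a vertex, and a leg is linked to every leg that departs from the stop where it arrives. The Leg and LineGraph classes are hypothetical and are not taken from the book.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical leg between two stops, for illustration only.
final class Leg {
    final String from;
    final String to;

    Leg(String from, String to) {
        this.from = from;
        this.to = to;
    }

    public String toString() {
        return from + "->" + to;
    }
}

final class LineGraph {
    private LineGraph() {
    }

    // For each leg, lists the legs that can directly follow it (they depart where it arrives).
    static Map<Leg, List<Leg>> of(List<Leg> legs) {
        Map<Leg, List<Leg>> dual = new LinkedHashMap<Leg, List<Leg>>();
        for (Leg leg : legs) {
            List<Leg> successors = new ArrayList<Leg>();
            for (Leg candidate : legs) {
                if (leg.to.equals(candidate.from)) {
                    successors.add(candidate);
                }
            }
            dual.put(leg, successors);
        }
        return dual;
    }
}
```

This naive version compares every pair of legs, which is fine for a sketch; indexing the legs by their stop of departure would avoid the quadratic cost.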
Trees and nested sets
Other common maths concepts include trees and DAGs, which come with useful concepts such as the topological sort. Hierarchy containment is another useful concept that appears for instance in every e-commerce catalog. Again, if you recognize the mathematical concept hidden behind your domain, then you can search for prior knowledge and techniques already devised to manipulate the concept in an easy and correct way, such as how to store that kind of hierarchy into a SQL database.
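The topological sort is a good example of such ready-made knowledge. Below is a sketch of one well-known way to do it (Kahn's algorithm), assuming the DAG is given as a map from each node to the nodes that must come before it; the class and parameter names are my own.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TopologicalSort {
    private TopologicalSort() {
    }

    // 'dependencies' maps each node to the nodes that must come before it.
    // Every node must appear as a key, possibly with an empty list.
    static List<String> sort(Map<String, List<String>> dependencies) {
        Map<String, Integer> remaining = new LinkedHashMap<String, Integer>();
        Map<String, List<String>> dependents = new LinkedHashMap<String, List<String>>();
        for (Map.Entry<String, List<String>> entry : dependencies.entrySet()) {
            remaining.put(entry.getKey(), entry.getValue().size());
            dependents.put(entry.getKey(), new ArrayList<String>());
        }
        for (Map.Entry<String, List<String>> entry : dependencies.entrySet()) {
            for (String prerequisite : entry.getValue()) {
                if (!dependents.containsKey(prerequisite)) {
                    throw new IllegalArgumentException("unknown node: " + prerequisite);
                }
                dependents.get(prerequisite).add(entry.getKey());
            }
        }
        // Start with the nodes that have no prerequisite at all.
        Deque<String> ready = new ArrayDeque<String>();
        for (Map.Entry<String, Integer> entry : remaining.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> sorted = new ArrayList<String>();
        while (!ready.isEmpty()) {
            String node = ready.remove();
            sorted.add(node);
            // Releasing a node may make some of its dependents ready in turn.
            for (String dependent : dependents.get(node)) {
                int left = remaining.get(dependent) - 1;
                remaining.put(dependent, left);
                if (left == 0) {
                    ready.add(dependent);
                }
            }
        }
        if (sorted.size() != dependencies.size()) {
            throw new IllegalArgumentException("cycle detected");
        }
        return sorted;
    }
}
```

A cycle leaves some nodes that never become ready, which is how the sketch detects that the input is not a DAG.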
Consider a domain, for example an online bookshop project that we call BuyCheapBooks. The Ubiquitous Language for this domain would talk about Book, Category, Popularity, ShoppingCart etc.
Business Domains
From scratch, coding this domain can be quite fast, and we can play with the fully unit-tested domain layer quickly. However if we want to ship, we will have to spend several times more effort because of all the extra cross-cutting concerns we must deal with: persistence, user preferences, transactions, concurrency and logging (see non-functional requirements). They are not part of the domain, but developers often spend a large amount of their time on them, and by the way, middleware and Java EE almost exclusively focus on these concerns through JPA, JTA, JMX and many others.
To a first approximation, our application is made of a domain and of several cross-cutting concerns. However, when it is time to implement the cross-cutting concerns, they each become the core domain (a technical one) of another dedicated project in its own right. These technical projects are managed by someone else, somewhere outside your team, and you would usually use these specific technical projects to address your cross-cutting concerns, rather than doing it yourself from scratch with code.
Technical Domains
For example, persistence is precisely the core domain of an ORM like Hibernate. The Ubiquitous Language for such a project would talk about Data Mapper, Caching, Fetching Strategy (Lazy Load etc.), Inheritance Mapping (Single Table Inheritance, Class Table Inheritance, Concrete Table Inheritance) etc. These kinds of projects also deal with their own cross-cutting concerns such as logging and administration, among others.
Logging is the core domain of Log4j, and it must itself deal with cross-cutting concerns such as configuration.
In this perspective, the cross-cutting concerns of a project are the core domains of other satellite projects, which focus on technical domains.
Hence we see that the very idea of core domain vs. cross-cutting concerns is essentially relative to the project considered.
Note, for the sake of it, that there may even be cycles between the core domains and the required cross-cutting concerns of several projects. For example there is a cycle between a (hypothetical) project Conf4J that focuses on configuration (its core domain) and that requires logging (as a cross-cutting concern), and another project Log4J that focuses on logging (its core domain) and that requires configuration (as a cross-cutting concern).
Conclusion
There is no clear and definite answer as to whether a concept is part of the domain or whether it is just a cross-cutting concern: it depends on the purpose of the project. There is almost always a project whose domain addresses the cross-cutting concern of another.
For projects that target end-users, we usually tend to reuse the code that deals with cross-cutting concerns through middleware and APIs, in order to focus on the usually business-oriented domain, the one that our users care about. But when our end-users are developers, the domain may well be technical.
Small details matter because you deal with them often. Any enhancement you make thus yields a benefit often, hence a bigger overall benefit. In other words: invest small care, get big return. This is an irresistible proposal!
Every single step matters
Examples of small design-level details that I care about because I have experienced great payback from them:
All these details emphasize that code is written once then used many times. The extra care at time of writing pays back at time of using, each time, again and again. Each enhancement that minimises brain effort at time of use is welcome, because software design is a matter of economy.
Other kinds of “details” that I care about involve the human aspects of crafting software: being on site, face-to-face communication rather than electronic media, respect and consideration at all times, always celebrate achievements, etc. Because ultimately, it also boils down to people that feel like building something together.