The untold art of Composite-friendly interfaces

The Composite pattern is a powerful design pattern that you use regularly to manipulate a group of things through the very same interface as a single thing. By doing so you don’t have to discriminate between the singular and plural cases, which often simplifies your design.

Yet there are cases where you are tempted to use the Composite pattern but the interface of your objects does not quite fit. Fear not: some simple refactorings of the method signatures can make your interfaces Composite-friendly, and it’s worth it.

Always start with examples

Imagine an interface for a financial instrument with a getter on its currency:

public interface Instrument {
  Currency getCurrency();
}

This interface is fine for a single instrument; however, it does not scale to a group of instruments (Composite pattern), because the corresponding getter in the composite class would have to look like this (notice that the return type is now a collection):

public class CompositeInstrument {
  // list of instruments...

  public Set<Currency> getCurrencies() {...}
}

We must admit that each instrument in a composite instrument may have a different currency, hence the composite may be multi-currency, hence the collection return type. This defeats the goal of the Composite pattern, which is to unify the interfaces for single and multiple elements. If we stop there, we have to discriminate between a single Instrument and a CompositeInstrument, and we have to do so at every call site. I’m not happy with that.

The composite pattern applied to a lamp: same plug for one or several lamps

The brutal approach

The brutal approach is to generalize the initial interface so that it works for the composite case:

public interface Instrument {
  Set<Currency> getCurrencies();
}

This interface now works for both the single case and the composite case, but at the cost of always having to deal with a collection as the return value. In fact I’m not so sure we’ve simplified our design with this approach: if the composite case is not used that often, we may even have complicated the design for little benefit, because the returned collection type always gets in our way, requiring a loop every time it is called.

The trick to improve that is simply to investigate what our interface is really used for. The getter on the initial interface only reveals that we did not think about the actual use beforehand; in other words, it shows a design decision made “by default”, or the lack of one.

Turn it into a boolean method

Very often this kind of getter is mostly used to test whether the instrument (single or composite) has something to do with a given currency, for example to check if an instrument is acceptable for a screen in USD or tradable by a trader who is only granted the right to trade in EUR.

In this case, you can revamp the method into another intention-revealing method that accepts a parameter and returns a boolean:

public interface Instrument {
  boolean isInCurrency(Currency currency);
}

This interface remains simple, is closer to our needs, and in addition it now scales for use with a Composite, because the result for a composite instrument can be derived by combining the result of each single instrument with the AND operator:

public class CompositeInstrument {
  private final List<Instrument> instruments = ...; // list of instruments

  public boolean isInCurrency(Currency currency) {
     boolean result = true;
     for (Instrument instrument : instruments) {
        result &= instrument.isInCurrency(currency);
     }
     return result;
  }
}

Something to do with Fold

As shown above, the problem is all about the return value. Generalizing from the booleans and their boolean logic in the previous example (‘&=’), the overall trick for a Composite-friendly interface is to define methods that return a type that is easy to fold over successive calls. The trick is to merge (“fold”) the result of several calls into one single result; for booleans, you typically do that with AND or OR.

If the return type is a collection, then you could perhaps merge the results using addAll(…) if it makes sense for the operation.
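
For instance, here is a minimal sketch of that folding, assuming the getCurrencies() signature from the brutal approach above:

import java.util.*;

public class CompositeInstrument implements Instrument {
  private final List<Instrument> instruments = new ArrayList<Instrument>();

  public Set<Currency> getCurrencies() {
     Set<Currency> result = new HashSet<Currency>();
     for (Instrument instrument : instruments) {
        // fold each child's currencies into one single set
        result.addAll(instrument.getCurrencies());
     }
     return result;
  }
}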

Technically, this is easily done when the return type is closed under an operation (a magma), i.e. when the result of the operation is of the same type as its operands, just like ‘boolean1 AND boolean2’ is also a boolean.

This is obviously the case for booleans and their boolean logic, but also for numbers and their arithmetic, collections and their set operations, strings and their concatenation, and many other types including your own classes; indeed, Eric Evans suggests you favour “Closure of Operations” in his book Domain-Driven Design.

Fire hydrants: from one pipe to multiple pipes (composite)

Turn it into a void method

Though not possible in our previous example, void methods work very well with the Composite pattern: with nothing to return, there is no need to unify or fold anything:

public class CompositeFunction {
  List functions = ...;

  public void apply(...) {
     // for each function, function.apply(...);
  }
}

Continuation-passing style

The last trick to help with the Composite pattern is to adopt the continuation passing style by passing a continuation object as a parameter to the method. The method then sets its result into it instead of using its return value.

As an example, to perform a search on every node of a tree, you may use a continuation like this:

public class SearchResults {
   private final List<Node> results = new ArrayList<Node>();

   public void addResult(Node node) {
      results.add(node); // append to the list of results
   }

   public List<Node> getResults() {
      return results;
   }
}

public class Node {
  List<Node> children = ...;

  public void search(SearchResults sr) {
     //...
     if (found) {
         sr.addResult(this);
     }
     for (Node child : children) {
        child.search(sr);
     }
  }
}

By passing a continuation as an argument to the method, the continuation takes care of the multiplicity, and the method is now well suited to the Composite pattern. You may consider that the continuation encapsulates into one object the process of folding the result of each call; of course, the continuation has to be mutable.

This style does complicate the interface of the method a little, but it also offers the advantage of allocating a single instance of the continuation across every call.
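
For example, here is a minimal usage sketch (rootNode is a hypothetical variable referencing the root of the tree):

// one single continuation instance shared across the whole traversal
SearchResults results = new SearchResults();
rootNode.search(results);
for (Node match : results.getResults()) {
   // process each node found...
}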

That's continuation passing style (CC Some rights reserved by 2011 BUICK REGAL)

One word on exceptions

Methods that can throw exceptions (even unchecked exceptions) can complicate their use in a composite. To deal with exceptions within the loop that calls each child, you can simply throw the first exception encountered, at the expense of abandoning the loop. An alternative is to collect every caught exception into a Collection, then throw a composite exception wrapping that Collection once you’re done with the loop. In other cases the composite loop may also be a convenient place to do the actual exception handling, such as full logging, in one central place.
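
As an illustrative sketch of the “collect then throw” alternative, reusing the list of instruments from the composite above and assuming a hypothetical validate() operation on Instrument and a hypothetical unchecked CompositeException wrapping the collected causes:

public void validate() {
   List<RuntimeException> failures = new ArrayList<RuntimeException>();
   for (Instrument instrument : instruments) {
      try {
         instrument.validate();
      } catch (RuntimeException e) {
         // collect the failure instead of giving up the loop at the first one
         failures.add(e);
      }
   }
   if (!failures.isEmpty()) {
      // hypothetical unchecked exception wrapping every collected cause
      throw new CompositeException(failures);
   }
}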

In closing

We’ve seen some tricks to adjust the signature of your methods so that they work well with the Composite pattern, typically by folding the return type in some way. In return, you don’t have to discriminate manually between the single and the multiple, and one single interface can be used much more often; it is with these kinds of details that you keep your design simple and ready for any new challenge.

Follow me on Twitter! Credits: Pictures from myself, except the assembly line by BUICK REGAL (Flickr)


What next language will you choose?

Now that enterprises have chosen stable platforms (JVM and .Net), on top of which we can choose from many available syntaxes, which language will you pick as your next favourite?

New languages to have a look at (my own selection)

Based on what I read every day in various blogs, I arbitrarily reduced the list of candidates to just a few languages that I believe are rising and promising:

  • Scala
  • F#
  • Clojure
  • Ruby
  • Groovy

Of course each language has its own advantages, and we should definitely take a look at several of them, not just one. But the popularity of a language also matters if you want a chance of using it in your daily work.

Growth rates stats

In order to get some facts about the trends of these new programming languages in the real world, Google Trends is your friend. Here is the graph of growth rates (NOT absolute numbers), worldwide:

I’m impressed by how sharply Clojure is taking off… In absolute terms, Ruby comes first, Groovy second.

So much for the search queries on Google. What about job posts? Again, these are growth rates, not absolute numbers, this time with a focus on the US.

Again, the growth of Clojure is impressive, even though it remains very small in absolute terms, where Ruby comes first, followed by Groovy.

Popularity index

So far the charts only showed how new languages are progressing compared to each other.

To get an indication of the actual present popularity of each language, the usual place to go is the TIOBE index (last published June 2010):

The TIOBE Programming Community index gives an indication of the popularity of programming languages. The index is updated once a month. The ratings are based on the number of skilled engineers world-wide, courses and third party vendors. The popular search engines Google, MSN, Yahoo!, Wikipedia and YouTube are used to calculate the ratings.

In short:

  • Ruby is ranked 12th; Ruby is kinda mainstream already
  • LISP/Scheme/Clojure are ranked together 16th, almost the same rank as in 2005, when Clojure did not exist yet.
  • Scala, Groovy and F#/Caml are ranked 43rd, 44th and 45th respectively

Conclusion

Except for Ruby, the new languages Scala, Groovy, F# and Clojure are not yet well established, but they are progressing quickly, especially Clojure, then Scala.

In absolute terms, and within my selection of languages, Groovy is the most popular after Ruby, followed by Scala. Clojure and F# are still far behind.

I have a strong feeling that the time has come for developers to mix alternative syntaxes with their legacy language (Java or C#), still on top of the same platform, something also called polyglot programming. It’s also time for the ideas of functional programming to become more popular.

In practice, these new languages are more likely to be introduced initially for automated testing and build systems, before being used for production code, especially with high productivity web frameworks that leverage their distinct strengths.

So which one to choose? If you’re already on the JVM, why not choose Groovy or Scala first, then Clojure to get a grasp on something different. If you’re on .Net, why not choose Scala and then F#. And if you were to choose Ruby, I guess you would have done it already.


The String Obsession anti-pattern

This anti-pattern is about using String instances all the time instead of more powerful objects. It is a special case of Primitive Obsession.

Examples

Here are some flavours of this anti-pattern:

  • Strings as keys in a Map: I have seen calls to toString() on objects just to use their String representation as a key in a Map: map.put(myPerson.toString(), "someValue").
  • String as an enum (“In Progress”, “Done”, “Error”), with tests using equals() to trigger the corresponding logic.
  • Array or Collection of Strings to mimic a data structure, in which case the ordering matters: {“firstname”, “lastname”}.
  • String with a special format that makes it parsable: “value1_value2”, but passed around as is, with parsing code all over the place.
  • String in XML format used everywhere as is, rather than a corresponding tree of objects (a DOM); again there is a lot of parsing everywhere.

The problem

Why not use “real” Java objects instead of Strings? Value Objects would be much easier to use, and would be more efficient in many cases, whereas String obsession comes with serious trouble:

  • Repeated parsing is CPU-intensive and allocates a lot of new objects each time
  • The code to parse and interpret a String must be spread all over the code base, leading to clutter and plenty of quasi-duplication
  • When used to represent meaningful domain concepts, Strings have no methods dedicated to the business. The domain operations must be managed elsewhere and selected with conditionals; on the contrary, using value objects makes your life easier every time you use them.
  • String values also need to be parsed to be validated. String values do not catch potential issues early, they will only be discovered when really used; on the other hand, value objects can be validated and can catch issues as soon as they are constructed

Solution

Use Value Objects that are immutable, with efficient and safe equals() and hashCode(), a convenient toString(), and every useful member method that makes them easy to use. On input, convert the incoming format (e.g. String) into the corresponding full-featured object as soon as possible, and use the object as much as possible. In practice the objects can be used everywhere, every time, as long as they remain within the same VM.
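
As a minimal sketch (the TradeReference name and its fields are invented for the example), the parsable “value1_value2” String from the examples above could become a real value object, parsed once at the boundary:

public final class TradeReference {
  private final String portfolio;
  private final String dealId;

  public TradeReference(String portfolio, String dealId) {
    this.portfolio = portfolio;
    this.dealId = dealId;
  }

  // parse the incoming "value1_value2" format once, at the boundary
  public static TradeReference parse(String raw) {
    String[] parts = raw.split("_");
    if (parts.length != 2) {
      throw new IllegalArgumentException("Expected portfolio_dealId but got: " + raw);
    }
    return new TradeReference(parts[0], parts[1]);
  }

  public String getPortfolio() { return portfolio; }
  public String getDealId() { return dealId; }

  @Override
  public String toString() { return portfolio + "_" + dealId; }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof TradeReference)) return false;
    TradeReference other = (TradeReference) o;
    return portfolio.equals(other.portfolio) && dealId.equals(other.dealId);
  }

  @Override
  public int hashCode() {
    return 31 * portfolio.hashCode() + dealId.hashCode();
  }
}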

Why the String obsession?

Sure, there are some reasons why some people favour Strings rather than creating additional objects:

  • Strings are already available in I/O, hence you may consider it memory-efficient to just keep them
  • Strings are immutable, behave well as keys, and are easy to inspect when debugging
  • You may assume that I/O represents the majority of cases in the application
  • You may consider that creating additional value objects means more code, therefore more effort and more maintenance

I strongly believe all these reasons are short-sighted, if not wrong. The burden of having to interpret the business meaning of a String each time is definitely a loss compared to creating an object with a meaning as early as possible in I/O and using the object and its methods thereafter. Even for persistence and on-screen display, having an object with methods and accessors makes our life easier than just using Strings!


Design your value objects carefully and your life will get better!

Many concepts look obvious because they are used often, not because they are really simple.

Small quantities that we encounter all the time in most projects, such as date, time, price, quantity of items, a min and a max… are hardly a subject of interest for developers, “as it is obvious that we can represent them with primitives”.

Yes you can, but you should not.

I have experienced through different projects that every decision to introduce a new value object or to improve an existing one, even a very small and trivial one, was probably the design decision that yielded the most benefit in every further development, maintenance or evolution. This comes from the simple fact that using something simpler “all the time” just makes our life simpler all the time.

We often don’t pay attention to the usual chair; however, it was carefully designed.

As a reminder, value objects are objects with an identity solely defined by their data. Using value objects brings many typical benefits of Object Orientation, such as:

  1. One value object often gathers several primitives together, and one thing is always simpler than a few things
  2. The value object can be fully unit-tested: this gives a tremendous return, because once it is trusted there will simply be no more bugs in it
  3. Methods that encapsulate even the simplest expression behind a good name tell exactly what they do, instead of having to interpret the expression every time (think of isEmpty() instead of “size() == 0”; it saves some time each time)
  4. The value object is a place to put documentation such as the corresponding unit and other conventions, and to have them enforced. Static creators make creation easy and self-explanatory, can enforce validations and help select the right unit (percent, meter, currency)
  5. The method toString() can just tell, in a pretty-formatted fashion, what the object is, and this is precious!
  6. The value object can also define various abilities, such as being Comparable, being immutable, being reused in a Flyweight fashion, etc.
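
Here is a minimal sketch of such a value object (the Percentage class and its conventions are purely illustrative), touching on the static creators, the documented unit, toString(), Comparable and immutability mentioned above:

public final class Percentage implements Comparable<Percentage> {
   // the value is stored as a percentage: 4.5 means 4.5%
   private final double value;

   private Percentage(double value) {
      this.value = value;
   }

   // static creators make the unit explicit and self-explanatory at the call site
   public static Percentage ofPercent(double percent) {
      return new Percentage(percent);
   }

   public static Percentage ofFraction(double fraction) {
      return new Percentage(fraction * 100.0);
   }

   public boolean isZero() {
      return value == 0.0;
   }

   @Override
   public int compareTo(Percentage other) {
      return Double.compare(value, other.value);
   }

   @Override
   public String toString() {
      return value + " %";
   }

   @Override
   public boolean equals(Object o) {
      return o instanceof Percentage && Double.compare(value, ((Percentage) o).value) == 0;
   }

   @Override
   public int hashCode() {
      return Double.valueOf(value).hashCode();
   }
}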

And of course, when needed it can be changed to evolve, without breaking the code using it. I could not imagine a team claiming to be really agile without the ability at least to evolve its most basic concepts in an easy way.

The first time, it takes some extra effort to build a value object as opposed to jumping to a few primitives; but being lazy actually means doing less in the long run, not just right now.


Toward smarter dependency constraints (patterns to the rescue)

Low coupling between objects is a key principle to help you win the battle against software entropy. Making sure your dependencies are under control matters. Several tools can enforce dependency restrictions, such as JDepend. However, in a real project with many classes, packages and modules, the real issue is how to decide and configure the allowed and forbidden dependencies. Per class? Per package? Per module? Based on gut feeling? Is there a theory for that?

Of course, in a layered architecture, the layers specify the dependencies. This is not bad, but I am sure we can do better.

Smarter dependencies

To go further, I suggest expanding our vocabulary of concepts. In OO languages such as Java, everything is a class (or an interface), grouped into packages. Such a classification is not really helpful. Fortunately, several books provide ready-to-use vocabularies in the form of pattern languages (not only design patterns, but patterns in general). Some of these patterns provide foundations on which rules to manage dependencies can be proposed.

Disclaimer: the dependency rules suggested below are hypotheses to be debated and verified against a corpus of actual projects; I would be happy to be given counter-examples and counter-arguments.

The child really depends upon the mother

Domain Driven Design

The book Domain Driven Design by Eric Evans defines a rich vocabulary of concepts used in every application, and we can leverage that vocabulary to propose some dependency principles between them:

  • ValueObject never depends upon Entity nor Services
  • Entities should not depend upon Services (maybe not a hard rule)
  • Generic SubDomain should not depend upon Core Domain
  • Core Domain should not depend upon Cohesive Mechanism (the “What” should not depend upon the “How”)
  • Domain Layer should not depend on any infrastructure code
  • Abstract Core module never depends on its specialized or implementation modules

Analysis Patterns

The book Analysis Patterns by Martin Fowler also provides patterns as a richer vocabulary, from which we could propose:

  • Elements from a Knowledge Level should not depend upon elements from the corresponding Operation Level

I did not find that rule written in the book, but every example appears to support it. Considering that classes and subclasses in usual OOP are a special case of a Knowledge Level built into the language, this would lead to:

  • Abstractions never depend upon their Implementations

which is similar to the second part of the Dependency inversion principle by Robert C. Martin:

Abstractions should not depend upon details. Details should depend upon abstractions.

Since many analysis patterns in the Analysis Patterns book involve the Knowledge Level pattern, this single dependency rule already applies to many analysis patterns: Party Type Generalizations, Measurement, Observation, Protocol, Associated Observation, Measurement Protocol etc. The pattern Quantity can be seen as a specialized ValueObject (see Domain Driven Design above) hence should also not depend on any Entity nor Service.

Design Patterns

The book Design Patterns: Elements of Reusable Object-Oriented Software by Erich Gamma et al. presents the classic design patterns. These patterns define named participants. In the pattern participant ignorance principle I discussed the concepts of ignorant vs. dedicated participants within a pattern, and their consequences for dependencies:

  • Ignorant pattern participants should never depend on dedicated participants
  • Pattern participants never depend on the “Client” participant
  • For each ConcreteX participant, the corresponding abstract X never depends on it (Abstractions never depend upon their Implementations)

In practice, this means:

  • In the Adapter pattern, the Adaptee should not depend upon the Adapter, and the Target should depend upon nothing
  • In the Facade pattern, the sub systems should not depend upon the Facade
  • In the Iterator pattern, the Aggregate should not depend upon the Iterator; however every Java collection is a counter example as they contain their own ConcreteIterator.
  • In creational patterns (Abstract Factory, Prototype, Builder etc.), the Product and ConcreteProduct should not depend on the dedicated participant that does the allocation (the Factory etc.)
  • And so on for other patterns, some of which are already discussed in the pattern participant ignorance principle.

In short, if we look at the design patterns as a set of types with inheritance/implementation, invocation/delegation and creation relationships between them, the dependencies should not flow in the reverse direction of the relationships; in other words, using UML arrows, the dependencies should only be allowed in the direction of the arrows.

Addiction to sugar is a kind of dependency

Patterns of Enterprise Architecture

In the book Patterns of Enterprise Application Architecture by Martin Fowler, the Separated Interface Pattern proposes a way to manage dependencies by defining the interface between packages in a separate package from its implementation. This is similar to the Dependency inversion principle, as discussed here, which states:

A. High-level modules should not depend upon low-level modules. Both should depend upon abstractions.

By the way this is also very similar to the recommendation in Domain Driven Design:

Abstract Core module never depends on its specialized or implementation modules.

Finally, in the spirit of UML stereotypes that we sometimes put on packages to express their intent:

  • Utils never depends on anything but other Utils

What for?

If we manage to make every use of the above patterns explicit in the source code, for instance using Java annotations or simply Javadoc tags, then it becomes possible for a tool to deduce dependency constraints and enforce them automatically.

Imagine: just add @pattern ValueObject in your Javadoc comment, and voilà! A tool is now able to deduce that if you happen to import anything but basic java.* types, you must be warned.
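
As a hedged sketch, the same declaration could also rely on a hypothetical annotation instead of a Javadoc tag, so that a checking tool can read it from the classes directly:

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// hypothetical annotation carrying the pattern role of a type
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.CLASS)
@interface Pattern {
   String value();
}

// the dependency checker could now deduce that this type should
// depend on nothing but basic java.* types (and perhaps some utils)
@Pattern("ValueObject")
class Quantity {
   // ...
}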

Of course the fine tuning of the default behavior can take some time (do we accept that ValueObjects may depend upon low level utils like StringUtils? Probably yes), but the result will at least be stable regardless of the refactorings.

Given the existing variety of patterns out there, I am confident that just about any class or interface within a project can be declared as a participant in at least one pattern, and therefore have its dependency constraints deduced at the same time.


After Technical Debt, technical Options

In finance, options are powerful tools for traders, and many design practices, including design patterns, can be seen as options. Options can (perhaps) yield a great benefit for a certain and immediate cost. If this cost is cheap enough, they can be quite attractive.

An option to buy a stock is a right to buy the stock at a predefined price (let’s say EUR25) before a given date in the future. Until this date, you can freely exercise the option (i.e. buy the stock at EUR25), or not (do nothing). Of course, if the stock price happens to be greater than EUR25 (let’s say the stock is now EUR35), exercising the option earns you money (EUR10), and for that reason the option is worth something (EUR10 here, or even more if we expect the price to go even higher in the future).

In finance, the price of the option is directly linked to the expected benefit it can bring, even though this benefit is not certain. To be more precise, the more the stock price moves up and down, the bigger the option value: this instability of the price is called volatility, and you can think of it like a standard deviation in statistics. Options traders actually trade volatility: the higher the volatility (i.e. the less stable the prices), the more valuable the option.

Many design practices can be seen as options, and some are indeed cheap, almost free options. A free option is something that a reasonable investor cannot refuse to get, since it costs nothing while it can perhaps yield a benefit: this is actually an uncertain version of a “free lunch”. A design practice that costs nothing is hard to resist, since it yields a potential return in exchange for nothing.

In the spirit of Technical Debt, we will focus on the cost of maintaining the code over time. Any change to the code must be weighed against the burden it will bring compared to its benefit.

Plenty of options here

Technical options in the code

In practice, everything that improves the flexibility of the software is de facto like an option: it creates additional opportunities that we can use or not in the future. Typically, it can enable the designer to change his mind at a later stage at no cost, or to defer a design decision.

The design principle to code against abstractions rather than implementations comes immediately to mind: coding against interfaces is probably the number one technical option. Therefore the question is: given this piece of concrete code, should I introduce an interface and reference the code through the interface, or should I leave it like that? Just like in finance, the answer is: it depends on the upfront cost of the option (adding the extra interface) compared to the anticipated stability of the feature the code implements (is it likely to change in the future?).
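
As a deliberately simplified sketch (the Pricer and Trade names are invented for the example), the extra interface is the upfront cost of the option, and swapping the implementation later is its potential payoff:

// the type names below are invented for the sketch
class Trade { /* ... */ }

// the extra interface bought upfront: client code references this abstraction only
interface Pricer {
  double presentValue(Trade trade);
}

// today's arbitrary implementation choice, replaceable later at no cost to the callers
class BlackScholesPricer implements Pricer {
  public double presentValue(Trade trade) {
    // concrete pricing logic would go here
    return 0.0;
  }
}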

In this case, the upfront cost depends on the tools (with modern IDEs with wizards and refactoring capabilities it is easier to add or even extract an interface automatically), and perhaps on your personal preferences (if you hate having too many classes, then adding an interface will cost you more, subjectively). Some commercial source control systems that do not cope well with widespread refactoring also make the cost of future changes much higher, and as such they encourage building flexibility upfront.

Going further, many design patterns can be considered as options to improve flexibility for a minimal cost. Out of the GoF patterns, the patterns Abstract Factory, Strategy, State, Visitor, Iterator, Bridge and Builder directly plan for easy change of behaviour or extension in the future (typically through additional interfaces as well), and thus are simple yet efficient ways to buy technical options.

Various elements combine together for bigger value added

Technical options Vs. YAGNI

Knowing the future is hard, and YAGNI is there to remind us of that. If there are practices that cost little to introduce and that can bring benefits later, and if we are so poor at anticipating change, why not just introduce them later? This is right: in most cases YAGNI applies, unless we can anticipate a need.

The funny thing is that while the introduction of interfaces (directly or through the patterns mentioned before) creates opportunities to vary the implementations, it also creates opportunities for other opportunities. The new interface enables the addition of non-intrusive structural patterns that revolve around an existing interface, such as Decorator, Composite, Proxy, Adapter, Null Object, and every combination of them as suggested by the Interpreter pattern. (The introduction of such patterns, by the way, is precisely what most IoC containers, application servers and AOP frameworks do to add functionality.)

How these patterns combine together under the common “umbrella” interface is kind of fascinating to me, and it always reminds me of the Network Effect: the more we use an interface, with implementations or opportunistic patterns, the more useful it becomes. I believe that a net of patterns like that can become so useful that it becomes irresistible. You ARE gonna need it.

Of course there is nothing really new in all that, nothing more than a new metaphor to help explain to managers. Even if they do not know what an option is, I am sure they can understand the concept very quickly.

I have been pushing this approach, which I called “Think options, not solutions”, in my former team to emphasize that we can build the system even if some of its business behaviours are not completely known yet, as long as we can define their contours and start with an arbitrary solution. The goal is to be confident that we can change it later with no pain. This was justified from a business point of view because we did not know what the market expected, as the product (an interest rate swaps multilateral trading facility) was so novel. In that environment, change was definitely not something we feared; rather the opposite: it was stimulating.

Just like traders speculate on volatility using options, developers can speculate on the stability of the business features by using (or not) design practices that increase flexibility. The little extra effort to introduce these practices must be weighed against the expectation that they will be useful. And just for the fun, anyone interested in computing the value of a technical option using the Black-Scholes pricing model?


Systematic combination of subpatterns generates design patterns

Composite patterns, such as the Bureaucracy pattern, are patterns built by the composition of other “smaller” patterns. However, even usual design patterns can be considered composite patterns made of smaller subpatterns.

The goal is therefore to find out which main subpatterns make it possible to reconstruct as many design patterns as possible.

The subpatterns

From an early analysis I believe that four subpatterns form the foundation of many GoF design patterns:

  1. Type Hierarchy and its variants (interface, abstract class, non final class…)
  2. Type Set, a set of types with no particular relationship between them
  3. Delegation and its variants (one-to-one, one-to-many)
  4. Allocation (creation of instances) and its variants

The experiment

To validate this proposal, I have coded a small experiment to generate every combination of two subpatterns out of (1) and (2) plus one relationship between them out of (3) and (4), or every combination of one subpattern out of (1) and (2) plus a relationship from (3) or (4) to itself or, in the case of (1), between a supertype and a subtype or the other way round.
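
Here is a rough sketch of that enumeration (illustrative only; the actual experiment lives in CombinationTest and also handles the supertype/subtype variants):

import java.util.Arrays;
import java.util.List;

public class CombinationSketch {
   private static final List<String> STRUCTURES = Arrays.asList("TypeHierarchy", "TypeSet");
   private static final List<String> RELATIONSHIPS = Arrays.asList("Delegation", "Allocation");

   public static void main(String[] args) {
      // one structure related to another structure (or to itself) through one relationship
      for (String source : STRUCTURES) {
         for (String target : STRUCTURES) {
            for (String relationship : RELATIONSHIPS) {
               System.out.println(source + " --" + relationship + "--> " + target);
            }
         }
      }
   }
}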

For each generated combination I have automatically generated its corresponding class diagram, and grouped them together into one SVG document, converted into PDF using Inkscape. The resulting picture is available here, and you can see a partial preview below:

(partial preview of the generated combinations of subpatterns)

Results

The interesting finding is that this simple combinatorial method re-discovers 11 well-known design patterns (in the reading order of the picture):

  • Template Method (delegation to self, expecting a subtype to provide the service)
  • Composite (Considering a one-to-many delegation) or Decorator / Proxy (one-to-one delegation)
  • Prototype (creates instance of this type)
  • Bridge (one Abstraction hierarchy delegates to another Implementor hierarchy)
  • Abstract Factory (one Factory hierarchy creates instances of another (or several others) Products hierarchy)
  • … skip 4 next …
  • Strategy or State or Builder (some client delegates to some hierarchy)

Note that there are several cases where we could see the Factory Method, Adapter or Facade patterns. This would lead to some 13 design patterns out of the 23 in the GoF book.

Discussion

It is obvious anyway that the four subpatterns used here are not enough to rebuild the full set of the 23 design patterns. The Command pattern is unique in its use of a separate invoker, and the Chain of Responsibility is clearly about how handlers self-assign a given task without any centralized management. In these two examples, however, the unique characteristic is not in their structure but rather in their dynamics. Sequence diagrams would be much more relevant to convey that kind of information.

The full test runs in one go in a fraction of a second. The source code is available here: patternitygraphic_src; the code of the experiment is in CombinationTest.

EDIT: the plain SVG file that was generated is also available here for download:


Documenting the design of software using patterns?

I have experimented with an approach that considers every design pattern as a recursive composition of smaller patterns. This led to a prototype tool that illustrates the benefits of the approach by generating design-level documentation from annotated source code.

Eat your own dog food

The source code of the tool itself was used as the code base to apply the tool on, in order to generate its own design documentation. The Java Doclet API was used to retrieve the Javadoc tags in the class Javadoc comments, e.g.:

@pattern KnowledgeLevel OperationLevel=MyOtherClass some free text comment…

The pattern catalogue

A subset of design patterns and some other patterns such as the Knowledge Level (Fowler) has been defined in small definition files, then loaded at tool startup. We call this set of pattern definitions the pattern catalogue.

Here is an example of the definition file for the Builder pattern; notice the declaration of the member roles at the bottom, which also declares the expected pattern kind of each role:

Encapsulates the creation of a complex object from a source object.
@extends DesignPattern
@category DesignPattern
@category Creational
@author GoF
@book [GoF 95]  E. Gamma, R. Helm. R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley 1995.
@style ObjectOrientedProgramming
@reference [http://en.wikipedia.org/wiki/Builder_pattern ]
@defaultrole Builder
@role Director=TypeHierarchy
@role Builder=TypeHierarchy
@role Product=TypeSet
@role Factory=AbstractFactory

Parsing the annotations to build the pattern occurrences graph

Every time a Javadoc tag “@pattern Pattern” is encountered in the Javadoc part of a class, an occurrence of the corresponding pattern is created in memory, linked to the corresponding class in the Java programming language metamodel.
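
A minimal sketch of that parsing step with the classic com.sun.javadoc Doclet API (simplified compared to the actual prototype):

import com.sun.javadoc.ClassDoc;
import com.sun.javadoc.RootDoc;
import com.sun.javadoc.Tag;

public class PatternDoclet {
  public static boolean start(RootDoc root) {
    for (ClassDoc classDoc : root.classes()) {
      // e.g. "@pattern KnowledgeLevel OperationLevel=MyOtherClass some free text comment..."
      for (Tag tag : classDoc.tags("@pattern")) {
        String declaration = tag.text();
        // here the actual tool would create a pattern occurrence linked to classDoc
        System.out.println(classDoc.qualifiedName() + " declares: " + declaration);
      }
    }
    return true;
  }
}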

Whenever such a pattern uses a Type Hierarchy (this is declared in the definition of the pattern), the tool looks for every subtype of the declared (super) type in order to fill in the pattern occurrence as completely as possible.

Even patterns that are usually considered atomic can be explained as the composition of smaller “elemental” patterns. For instance, a Builder [GoF] usually defines an AbstractBuilder role and a ConcreteBuilder role that implements it. We can express this part of the Builder pattern as a TypeHierarchy elemental pattern in the role Builder; the supertype of the TypeHierarchy then represents the AbstractBuilder role, while every other concrete type of this hierarchy represents a ConcreteBuilder role.

By analogy with the members of a class, we call member patterns and member occurrences the patterns and occurrences (respectively) that a pattern or occurrence is composed of. The tool totally ignores fields and methods; it only considers types (interfaces and classes).

The benefits of this approach are to reuse a bigger part of the pattern definitions through composition and to help address the variant problem.

Once the pattern occurrences graph is completely built, it is rendered into a UML class diagram using Graphviz dot. Here is an example of an overview for the full project:

Example of overview diagram for the Core module of the prototype source code

In the above diagram we can see how the patterns naturally sit on top of the several type hierarchies while connecting them together. The relationships between patterns also become very obvious, such as the Interpreter that references the Builder and the Composite sub-patterns. This suggests that such diagrams are very valuable to document the design of a code base.

Topological sort

The approach of representing every pattern as the composition of other patterns yields directed graphs, at the pattern level (the “meta” level) and at the pattern occurrence level. One consequence is that we can apply a topological ordering to such graphs. In the current prototype we need to do a topological sort on the pattern definitions so that we know in advance in which order to process pattern occurrence declarations: the patterns used must be built before the patterns using them.
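
Here is a minimal sketch of such an ordering, as a depth-first topological sort over a map from each pattern name to the names of its member patterns (the names and structure are illustrative, not the prototype’s actual code):

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PatternOrdering {
  // members: pattern name -> names of the member patterns it is composed of
  public static List<String> topologicalOrder(Map<String, List<String>> members) {
    List<String> ordered = new ArrayList<String>();
    Set<String> visited = new HashSet<String>();
    for (String pattern : members.keySet()) {
      visit(pattern, members, visited, ordered);
    }
    return ordered; // member patterns come before the patterns using them
  }

  private static void visit(String pattern, Map<String, List<String>> members,
                            Set<String> visited, List<String> ordered) {
    if (!visited.add(pattern)) {
      return; // already processed (cycles are not expected in pattern definitions)
    }
    for (String member : members.getOrDefault(pattern, Collections.<String>emptyList())) {
      visit(member, members, visited, ordered);
    }
    ordered.add(pattern); // added after all its members: dependencies first
  }
}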

The pattern occurrences graph can also be navigated using the usual Depth-First search or Breadth-First search. DFS is convenient to generate the Overview diagrams such as the diagram shown above, whereas BFS is convenient to drill down the design from the top to the bottom, one level at a time. This can therefore generate the following table of contents:

Patterns graph navigation index

Natural language generation alternative

As an alternative to visual diagrams, a tool can also generate natural-language text to communicate the exact same information. I have tried this approach using simple text templates in each pattern definition. Each pattern therefore defines its own design description, using variable names to be filled in later with the data from the pattern occurrences graph.

The biggest part of the work in this approach is dealing with plurals, the enumeration of collections (a comma between each item but the last, etc.), truncating collections that are too long, and the complete omission of a section when a pattern member is totally missing.

Here is a sample of the text generation for the prototype tool itself. Notice how the structure is rigid for each pattern occurrence: pattern name, short description, then the template text evaluated with the actual type names, followed by the comment that was put after the @pattern declaration:

Design of the module Core
This module is essentially made of 9 pattern occurences: Strategy, Visitor, KnowledgeLevel, Interpreter, Builder and AbstractFactory.

Visitor
The visitor design pattern is a way of separating an algorithm from an object structure. A practical result of this separation is the ability to add new operations to existing object structures without modifying those structures
This module defines a Visitor OccurrenceVisitor. comment

Strategy
A Strategy object encapsulates an algorithms that can be selected on-the-fly at runtime.
This module defines the interface of a Strategy: Plugin. It enables plugins to manipulate the Patternity metamodel.

AbstractFactory
Provides one or more method(s) that encapsulate(s) the creation of object.
The type OccurrenceFactory defines the interface of an AbstractFactory that creates instances of Occurrence and TypeOccurrence.

Interpreter
Define a representation for a language along with an interpreter that uses the representation to interpret sentences in the language.
The type(s) Occurrence and TypeOccurrence define(s) an object representation of the considered language. The considered language describes a graph of patterns occurrences. It uses the Visitor Visitor to add new operations to the object representation without modifying this structure.

Strategy
A Strategy object encapsulates an algorithms that can be selected on-the-fly at runtime.
This module defines the interface of a Strategy: Loader. Encapsulates how to load every pattern definition into the pattern.repository.

[...]

KnowledgeLevel
A Knowledge Level is a group of objects that describes how another group of objects should behave.
This module defines a KnowledgeLevel (also known as metamodel). It is made of two levels of objects: objects in the knowledge level, i.e. the instances of Pattern define how objects in the operational level can behave, i.e. instances of Occurrence. Typically each type in the operation level maintains a reference to its corresponding type in the knowledge level in order to know how to behave.

Conclusion

Later, at work, when asked to produce design documentation targeted at other developers, I experimented with doing a similar effort manually, using nothing but patterns. I am still eager to receive feedback from the people who will read the resulting documentation, in order to confirm or invalidate this approach.


Group together things that go together

Information hiding is one of the essential principles of object orientation. If you don’t know it well, I suggest you take a look at it on the Web, e.g. at UncleBob.

But information hiding is much more than just making data private behind accessors to protect it; above all, it is a great tool to manage complexity.

Every time you put together data that go together, you hide these data behind the object that contains them, and as a result, instead of dealing with 2 or 5 pieces of data, you only deal with one, so you win the complexity battle. If you simply do that several times, you can end up managing hundreds of pieces of data with little effort.

This very little principle can really save your life. One extremely common application is the Quantity pattern (Fowler): since a value and its unit always go together (or they should), they are ideally grouped together into a quantity object, like Money, Percent, Period… Just putting together a value and its unit already yields a surprisingly high return (and of course you become unit-safe).
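
A minimal sketch of such a quantity object, using java.util.Currency as the unit (equals() and hashCode() omitted for brevity):

import java.math.BigDecimal;
import java.util.Currency;

public final class Money {
   private final BigDecimal amount;
   private final Currency currency;

   public Money(BigDecimal amount, Currency currency) {
      this.amount = amount;
      this.currency = currency;
   }

   // unit-safe: adding amounts in different currencies is rejected
   public Money add(Money other) {
      if (!currency.equals(other.currency)) {
         throw new IllegalArgumentException("Cannot add " + other.currency + " to " + currency);
      }
      return new Money(amount.add(other.amount), currency);
   }

   @Override
   public String toString() {
      return amount + " " + currency.getCurrencyCode();
   }
}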

Let’s illustrate this post with our current project: we replaced some terrible 25-field legacy data objects with objects of 2 to 5 fields, themselves again made of objects of 2 to 5 fields, and so forth. The actual object structure is a tree, but most of the time we only care about one level at a time: we always have a simple object, which is easy to deal with.

This way, we get many benefits: each object is simple and easy to create; each object that groups together some fields has a name, it is a unit of concept, we can talk about it, and this very point is really valuable; we can very often reuse the objects as-is (the same instance) from different pieces of the application, with no more need to copy fields to fields; they can often be made immutable; and of course we can unit-test such small objects (yes, we can have bugs even in simple things).

And if you claim that having several such objects instead of just dealing with “flat” primitive fields is costly, then you are a victim of the Primitive Obsession syndrome!

Initially published on Jroller on Wednesday October 19, 2005
