Principles for using annotations

Deciding where and how to place the annotations is not innocent. The last thing we want is to create extra maintenance effort because of the annotations. In other words, we want annotations that are stable, or that change for the same reasons and at the same time than the elements they annotate. This article suggests some good practices on how to design annotations.

Annotations are location-based

annotations
A special kind of wall annotation

Language annotations or even good-old xDoclet tags enable to augment program elements with additional semantics, which can be used to configure tools, frameworks or containers.

Configuration is now increasingly done through annotations spread all over the project elements. The key advantage is that the location of the annotation directly references the program element (interface, class etc.), as opposed to configuration files that must reference program elements using awkward and error-prone qualified names: “com.mycompany.somepackage.MyClass”, that are also fragile to refactoring.

For example, we can annotate an entity to declare how it must be persisted, we can annotate a class to declare how it must be instantiated by a Dependency Injection framework, and we can annotate test methods to declare their purpose.

If not placed and thought carefully, annotations can make your code harder to maintain. This happens when annotations are placed at the “wrong” place, or when they introduce undesirable coupling, as we will see.

Dependencies still matter

The question of coupling between elements of the code base is also relevant for annotations. That the coupling is done via an annotation rather than plain code does not make it more acceptable.

We want to group together things that change together. As a consequence, put your annotations on the elements that change with the annotations.

In particular, when the annotation is used to declare a dependency:

Only annotate the dependent element, not the element depended on

If you use Dependency Injection and you want the class MyServiceImpl to be injected everywhere the interface MyService is used, then Guice offers the annotation @ImplementedBy:

@ImplementedBy(MyServiceImpl.class)
interface MyService {... }

This annotation is a direct violation of the advice above, since it makes a pure abstraction (an interface) aware of an implementation, whereas the regular dependency should be the other way round: only the implementation must depend on the interface.

I must however acknowledge that the annotation @ImplementedBy is quite convenient for unit tests anyway, to declare a default implementation for the interface. And it was done just for that, as described in the Guice documentation along with a warning:

Use @ImplementedBy carefully; it adds a compile-time dependency from the interface to its implementation.

Favor intrinsic annotations

annotation2
Another annotation on the wall in Paris

If you want to declare that a service is stateless, you cannot get it wrong: just put the annotation @Stateless on its interface. This is straightforward because being stateless is a truly intrinsic property. It also makes perfect sense to annotate a method argument with the @Nullable annotation, as the capability to expect null or not is really intrinsic to the method.

On the other hand, a service interface does not really care about how it is called. It can be called by another object (local call) or remotely, through some remote proxy. The object is not intrinsically local or remote in itself.

The point is that the decision to consume the service locally or remotely does not belong to the service, in itself, but depends on each client. It actually belongs to each use-case considered.

Said another way, specifying @Remotable or @Local directly on the service would require the developer of the service to guess how it will be used!

Intrinsic properties really belong to the element and therefore are stable, as opposed to use-case-specific properties that vary with the particular case of use. Hence, if we want stable annotations:

Only annotate an element about its intrinsic properties, and avoid annotating about use-case-specific properties.

Annotations as pointcuts

Let’s consider an example of  an accounting service in a bank. Only selected categories of staff can access this service. We can use annotations to declare its security configuration:

@RolesAllowed({"auditor", "bankmanager", "admin"})

The problem with that approach is that it couples the service directly to the user roles defined elsewhere; as a consequence, if we want to change the user roles (we now need to add the user role “externalauditor”), we will have to review every security annotation and change them. On the other hand, if we want to change the access policy (which happen every time a new senior management comes into place), we will also have to change annotations all over the code. How can we improve that?

We can improve the situation by going back to the business analysis on the topic and separate what’s intrinsic and what’s not. In other words, we want to find out how did a BA came up with the security roles for the service.

Rather than specifying the need for security in terms of allowed user roles, we can instead declare the facts: the service is “sensitive” and is about “accounting”:

@Domain(Accounting)
@Confidentiality(Sensitive)
And now a beautiful car annotation
And now a beautiful car annotation

Then we can define expressions that use the declared annotations (which are now stable because they are intrinsic) to select elements (here services) and associate them to allowed user roles. These rules should be defined outside of the elements they apply to, typically in configuration files.

Thanks to the annotations that already define half of the security knowledge, expressing the rules becomes much simpler that doing it method by method. So next time the senior management changes and decides that from now on, “every service that is both Confidentiality(Sensitive) and Domain(Accounting) is only allowed to corporate-officer roles”, you just have to update a few rules expressed in terms of domain and confidentiality level, rather than by listing many method.

The mindset is very similar to AOP where we first define pointcuts, and then attach advices to them. Here we use annotations as an alternative way to declare the pointcuts.

Conclusion

Annotations are very efficient to declare properties about program elements directly on the elements. They are robust versus refactoring and are easier to use than specifying long qualified names in XML files.

To get the best of annotations, we still need to consider the coupling they can introduce, in particular with respect to dependencies. If a class should not know about another, its annotations should not either.

Annotations are much more stable (less likely to change) when they only relate to intrinsic properties of the elements they are located on. When we need to configure cross-cutting concerns (security, transactions etc.) annotations can be used to declare the half of the knowledge that is really intrinsic to the elements, in the same spirit than pointcuts in AOP.

All that leads to the acknowledgement that even though annotations can be of huge value, in practice there is still a case for configuration files to complement them. In this approach, annotations on elements declare what belongs to the elements, while each use-case-specific configuration file makes use of the annotations and as a result is much simpler.

Read More

Pattern grammar for the variant problem

For tools to be aware of patterns, the patterns must be formalized, at least partially. At this point I must quote Gregor Hohpe to clarify my thoughts, as I strongly agree with his skipticism:

Typically, when people ask me about “codifying” or “toolifying” patterns my first reaction is one of skepticism. Patterns are meant to be a human-to-human communication mechanism, not a human-to-machine mechanism. After all, I have pointed many people to the fact that a pattern is not just the piece of code in the example section. It’s the context-problem-forces-solution combination that makes patterns so useful.

Patterns link together a problem part to a solution part. This is expressed within the limits a stated context outside of which it is no more applicable. Patterns also emphasize the forces involved, that you must consider to decide how and whether or not to apply the pattern.

Patterns litterature usually describes examples of application of patterns. In your project, you will have to do more work to adapt the pattern solution to your exact need. A pattern solution may be stretched a lot, but the pattern remains as long as humans still recognize its presence. Every different way of applying a pattern is called a pattern variant.

Addressing the variant problem

Formal descriptions of anything human is too restrictive, and this is especially true for patterns which are the product of human analysis, in that they resist simple formalization. However if we focus on sub-parts of patterns, it becomes easier to formalize them, at least for their solution part.

For example a design pattern that uses (in its solution part) some form of inheritance admits several variants. At first, the pattern solution seems to resist against its formalization. However if we now focus on the inheritance part only, we can enumerate every possible alternative for it. For example we can use:

  • interface and classes that implement it
  • abstract class and classes that extend it
  • concrete class provided it is not final (assume we’re in Java), so that it can be extended

Notice that each alternative is a solution to the same problem “How to realize some

Tree structure in volume (Milano International Fair 2009)
Tree structure in volume (Milano International Fair 2009)

form of inheritance”. We can say that each alternative is indeed a pattern, and by the way Mark Grand already described them in Patterns in Java Volume 1. These patterns are easy to formalize, as they can be precisely described in terms of programming language elements.

How can we split a pattern into parts? The idea is to identify the areas that are fragile with respect to the variant problem in the solution part of a pattern, and to consider them as lower-level problems embedded inside the bigger pattern.

In the example before, the problem was to achieve “some form of inheritance”, and we listed three patterns that address this problem.

Provided it can be split into sub-parts (hence into smaller problems), any pattern solution can be formalized by recursively formalizing its sub-parts. If a sub-part cannot be easily made formal, then it can be split again into sub-parts, and so on until each sub-part can.

Given a pattern solution that we want to formalize:

  • If it can be described formally directly, then we are done (terminal)
  • If it cannot be fully described formally, then extract the problematic sub-parts into sub-problems, then find every pattern that addresses each sub-problem, and formalize their solution part

We can then represent a pattern as a tree of smaller patterns, where the solution part of each patter is connected to the problem part of another pattern.

From pattern language to pattern grammar

In the car, some parts can be replaced by other alternate parts that play the same contract (e.g. the wheels)
In the car, some parts can be replaced by other alternate parts that play the same contract (e.g. the wheels)

When the pattern is applied into the code, at each node in the tree there is actually a selection of which variant of the sub pattern to use. As such, each selected pattern represents an atom of decision in the design process.

It then appears that we have a form of grammar for patterns, where there are terminal patterns solutions T (easier to formalize in term of programming language elements), non-terminal parts of patterns solutions N (that cannot be easily formalized but that can formally ask for help to solve their sub-problems), and where the production rules P are nothing but the patterns themselves:

Patterns are production rules that link:
elements of N (the problems) to elements of (N Union T)

Conclusion

I have suggested quickly a way to formalize patterns solutions in spite of their fragility with respect to their variants. This approach identifies patterns as production rules in some grammar over the set of patterns considered.

This perspective is well-suited for tools to work on patterns in real-world projects, where the patterns are indeed applied in many variant forms. The problem of this approach is that every known pattern that is variant-fragile must be reconsidered and have its solution split into a formal part and one or several sub problems to be addressed by specific, lower-level patterns, which themselves must be formalised in the same way.

It is essential that for every problem (“intent”) we can enumerate every pattern that addresses it. Intents can be also classified as a taxonomy, where some intents are specialized versions of more generic intents.

This approach does not claim to formalize the full potential of patterns, it only aims at enabling tools to understand patterns that are already there so that to assist the developers for various tasks.

Read More

Systematic combination of subpatterns generates design patterns

Composite patterns, such a the Bureaucracy pattern, are patterns built by the composition of other “smaller” patterns. However even usual design patterns can be considered composite patterns made of smaller subpatterns.

The goal is therefore to find out which are the main subpatterns that enable to reconstruct as many design patterns as possible.

The subpatterns

From an early analysis I believe that four subpatterns form the foundation of many GoF design patterns:

  1. Type Hierarchy and its variants (interface, abstract class, non final class…)
  2. Type Set, a set of types with no particular relationship between them
  3. Delegation and its variants (one-to-one, one-to-many)
  4. Allocation (creation of instances) and its variants

The experiment

To validate this proposal, I have coded a small experiment to generate every combination of two subpatterns out of (1) and (2) plus one relationship between them out of (3) and (4), or every combination of one subpattern out of (1) and (2) plus a relationship of (3) and (4) to either itself, or in the case of (1), between a super type and a subtype or the other way round.

For each generated combination I have automatically generated its corresponding class diagram, and grouped them together into one SVG document, converted into PDF using Inkscape. The resulting picture is available here, and you can see a partial preview below:

combinations_thumb

Results

The interesting finding is that this simple combinatorial method re-discovers 11 well-known design patterns (in the reading order of the picture):

  • Template Method (delegation to self, expecting a subtype to provide the service)
  • Composite (Considering a one-to-many delegation) or Decorator / Proxy (one-to-one delegation)
  • Prototype (creates instance of this type)
  • Bridge (one Abstraction hierarchy delegates to another Implementor hierarchy)
  • Abstract Factory (one Factory hierarchy creates instances of another (or several others) Products hierarchy)
  • … skip 4 next …
  • Strategy or State or Builder (some client delegates to some hierarchy)

Note that there are several cases where we could see the Factory Method, Adapter or Facade patterns. This would lead to some 13 design patterns out of the 23 in the GoF book.

Discussion

It is obvious anyway that the four subpatterns used here are not enough to rebuild the full set of the 23 design patterns. The Command pattern is unique in its use of a separate invoker, and the Chain of Responsibility is clearly about how handlers self-assign a given task without any centralized management. In these two examples however, the unique characteristics is not in their structure but rather in their dynamics. Sequence diagrams would be much more relevant to convey that kind of information.

The full test runs in one go in a fraction of second. The source code is available here: patternitygraphic_src, the code of the experiment is in CombinationTest.

EDIT: the plain SVG file that was generated is also available here for download:

Read More

New: Java API for UML diagrams

As part of the Patternity effort, I spent some time creating a simple Java API to generate UML diagrams programmatically from Java, in SVG format.

This small API called for now Patternity Graphic is working and available here: patternitygraphic_src as a source Zip (alpha release of course).

It can render small class diagrams with hierarchic, flow and radial layouts, and arbitrary sequence diagrams with unlimited nesting of method calls and embedded comments. My focus was to support the subset of diagram elements and capabilities required to display patterns occurrences.

Here is a sample sequence diagram with a nice call stack:

sequence1

And here is a class diagram for a simple dummy hierarchy:

hierarchy

Boxes and links have various styles, defined in a template.svg template file, here is a random display of the boxes styles:

boxstyles

Apart from unit tests there is no documentation. If you are interested to reuse that please contact me for help. The Zip was exported in Eclipse with the project Export… function, and the project requires Commons Collections. Svg diagrams can be converted into images using Batik.

Other projects you might consider to generate UML diagrams: UMLGraph, modsl, umlspeed, jsigner, MetaUML, and of course GraphViz.

Any feedback appreciated!

Read More

Documenting the design of software using patterns?

I have experimented an approach that considers every design pattern as the recursive composition of smaller patterns. This led to a prototype tool to illustrate its benefits by generating design-level documentation of annotated source code.

Eat your own dog food

The source code of this tool itself was used as the code base to apply the tool on, in order to generate its own design documentation. The Java Doclet API was used to retrieve the Javadoc tags in the class Javadoc comments, e.g. :

@pattern KnowledgeLevel OperationLevel=MyOtherClass some free text comment…

The pattern catalogue

A subset of design patterns and some other patterns such as the Knowledge Level (Fowler) has been defined in small definitions files then loaded at tool startup. We call this set of patterns definitions the pattern catalogue.

Here is an example of the definition file for the Builder pattern, notice the declaration of the member roles at the bottom, that also declares the expected pattern kind:

Encapsulates the creation of a complex object from a source object.
@extends DesignPattern
@category DesignPattern
@category Creational
@author GoF
@book [GoF 95]  E. Gamma, R. Helm. R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley 1995.
@style ObjectOrientedProgramming
@reference [http://en.wikipedia.org/wiki/Builder_pattern ]
@defaultrole Builder
@role Director=TypeHierarchy
@role Builder=TypeHierarchy
@role Product=TypeSet
@role Factory=AbstractFactory

Parsing the annotations to build the pattern occurrences graph

Every time a Javadoc tag “@pattern Pattern” is met in the Javadoc part of a class, an occurrence of the corresponding pattern is created in memory, linking to the corresponding class in the Java programming language metamodel.

Whenever such pattern uses a Type Hierarchy (this is declared in the definition of the pattern), then the tool will look for every subtype of the declared (super) type in order to complete the pattern occurrence as well as possible.

Even patterns that are usually considered as atomic patterns can be explained as the composition of smaller “elemental” patterns. For instance, A Builder [GoF] usually defines an AbstractBuilder role and a ConcreteBuilder role that implements it. We can express this part of the Builder pattern as a TypeHierarchy elemental pattern in the role Builder; the supertype of the TypeHierarchy therefore represents the AbstractBuilder role, while every other concrete type of this hierarchy represents ConcreteBuilder roles.

By analogy with the members of a class, we call members patterns and member occurrences the patterns and occurrences (respectively) a pattern or occurrence is composed of. The tool totally ignore fields and methods, it only considers types (interfaces and classes).

The benefits of this approach are to reuse a bigger part of the pattern definitions through composition and to help address the variant problem.

Once the pattern occurrences graph is completely built, it is rendered into an UML class diagram using Graphviz dot. Here is an example of overview for the full project:

coremodulediagram_final
Example of overview diagram for the Core module of the prototype source code

In the above diagram we can see how the patterns naturally sit on top of the several type hierarchies while connecting them together. Also the relationships between patterns become very obvious, such as the Interpreter that references the Builder and the Composite sub-patterns. This suggests that such diagrams are very valuable to document the design of a code base.

Topological sort

The approach of representing every pattern as the composition of other patterns yields to directed graphes, at the pattern level (the “meta” level) and at the pattern occurrences level. One consequence is that we can apply a topological ordering on such graphs. In the current prototype we need to do a topological sort on the patterns definitions so that we can know in advance in what order to process pattern occurrences declarations: the patterns used must be built before the patterns using them.

The pattern occurrences graph can also be navigated using the usual Depth-First search or Breadth-First search. DFS is convenient to generate the Overview diagrams such as the diagram shown above, whereas BFS is convenient to drill down the design from the top to the bottom, one level at a time. This can therefore generate the following table of contents:

Patterns graph navigation index
Patterns graph navigation index

Natural language generation alternative

As an alternative to visual diagrams, a tool can also generate natural language text to communicate the exact same information. I have tried this approach using simple text templates in each pattern definitions. Each pattern therefore defines its own design description, but using variable names to be filled later with the data from the pattern occurrences graph.

The biggest work in this approach is to deal with the plural, the enumeration of collections (comma between each item but the last etc.), truncating collections that are too long, and the complete omission of a section if a pattern member is totally missing.

Here is a sample of text generation for the prototype tool itself. Notice how the structure is rigid for each pattern occurrence: Pattern name, short description, then template text evaluated with the actual types names, followed by the comment that was put after the @pattern declaration:

Design of the module Core
This module is essentially made of 9 pattern occurences: Strategy, Visitor, KnowledgeLevel, Interpreter, Builder and AbstractFactory.

Visitor
The visitor design pattern is a way of separating an algorithm from an object structure. A practical result of this separation is the ability to add new operations to existing object structures without modifying those structures
This module defines a Visitor OccurrenceVisitor. comment

Strategy
A Strategy object encapsulates an algorithms that can be selected on-the-fly at runtime.
This module defines the interface of a Strategy: Plugin. It enables plugins to manipulate the Patternity metamodel.

AbstractFactory
Provides one or more method(s) that encapsulate(s) the creation of object.
The type OccurrenceFactory defines the interface of an AbstractFactory that creates instances of Occurrence and TypeOccurrence.

Interpreter
Define a representation for a language along with an interpreter that uses the representation to interpret sentences in the language.
The type(s) Occurrence and TypeOccurrence define(s) an object representation of the considered language. The considered language describes a graph of patterns occurrences. It uses the Visitor Visitor to add new operations to the object representation without modifying this structure.

Strategy
A Strategy object encapsulates an algorithms that can be selected on-the-fly at runtime.
This module defines the interface of a Strategy: Loader. Encapsulates how to load every pattern definition into the pattern.repository.

[...]

KnowledgeLevel
A Knowledge Level is a group of objects that describes how another group of objects should behave.
This module defines a KnowledgeLevel (also known as metamodel). It is made of two levels of objects: objects in the knowledge level, i.e. the instances of Pattern define how objects in the operational level can behave, i.e. instances of Occurrence. Typically each type in the operation level maintains a reference to its corresponding type in the knowledge level in order to know how to behave.

Conclusion

Later, at work, when asked to produce a design documentation targeted at other developers, I have experimented doing a similar effort manually, using nothing but patterns. I am still impatient to receive some feedback from the people that will read the resulting documentation in order to validate or infirm this approach.

Read More