Patterns for using custom annotations

If you happen to create your own annotations, for instance to use with Java 6 Pluggable Annotation Processors, here are some patterns that I collected over time. Nothing new, nothing fancy, just putting everything into one place, with some proposed names.


Local-name annotation

Have your tools accept any annotation as long as its simple name (without the package prefix) is the expected one. For example, com.acme.NotNull and net.companyname.NotNull would be considered the same. This lets you use your own annotations rather than the ones packaged with the tools, so that you do not depend on them.

Example in the Guice documentation:

Guice recognizes any @Nullable annotation, like edu.umd.cs.findbugs.annotations.Nullable or javax.annotation.Nullable.
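
For instance, a tool could match annotations by their simple name only, using reflection. Here is a minimal sketch; the helper class and method names are made up for illustration (and the matched annotations need runtime retention to be visible):

import java.lang.annotation.Annotation;
import java.lang.reflect.Field;

// Hypothetical helper: accept any annotation with the expected simple name, whatever its package
public class LocalNameMatcher {

  // True if the field carries an annotation whose simple name matches, e.g. "NotNull"
  static boolean hasAnnotationNamed(Field field, String simpleName) {
    for (Annotation annotation : field.getAnnotations()) {
      // com.acme.NotNull and net.companyname.NotNull both match "NotNull"
      if (annotation.annotationType().getSimpleName().equals(simpleName)) {
        return true;
      }
    }
    return false;
  }
}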

Composed annotations

Annotations can have annotations as values. This allows for complex, tree-like configurations, such as mappings from one format to another (from/to XML, JSON, relational databases).

Here is a rather simple example from the Hibernate annotations documentation:

@AssociationOverride( 
   name="propulsion", 
   joinColumns = @JoinColumn(name="fld_propulsion_fk") 
)
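
Declaring such a composed annotation is straightforward, since an annotation element can itself be typed by another annotation. A minimal sketch with made-up names (not the actual Hibernate/JPA definitions):

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

@Retention(RetentionPolicy.RUNTIME)
@interface JoinSpec {
  String columnName();
}

// An annotation that takes another annotation as one of its values
@Retention(RetentionPolicy.RUNTIME)
@interface MappingOverride {
  String name();
  JoinSpec join(); // nested annotation value, enabling tree-like configuration
}

// Usage: the nested annotation is written inline
class Example {
  @MappingOverride(name = "propulsion", join = @JoinSpec(columnName = "fld_propulsion_fk"))
  Object propulsion;
}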

Multiplicity Wrapper

Java does not allow the same annotation to be applied several times to a given target.

To work around that limitation, you can create a special annotation that expects a collection of values of the desired annotation type. For example, if you would like to apply the annotation @Advantage several times, you create the Multiplicity Wrapper annotation: @Advantages(advantages = {@Advantage}).

Typically the multiplicity wrapper is named after the plural form of its enclosed elements.
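
A sketch of the @Advantage / @Advantages pair mentioned above (both annotations are hypothetical):

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Advantage {
  String value();
}

// The multiplicity wrapper: named after the plural form, it holds an array of the wrapped annotation
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Advantages {
  Advantage[] advantages();
}

// Usage: several @Advantage values on the same class, via the wrapper
@Advantages(advantages = { @Advantage("cheap"), @Advantage("fast") })
class Offer {
}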

Example in the Hibernate annotations documentation:

@AttributeOverrides( {
   @AttributeOverride(name="iso2", column = @Column(name="bornIso2") ),
   @AttributeOverride(name="name", column = @Column(name="bornCountryName") )
} )


Meta-inheritance

It is not possible in Java for annotations to derive from each other. To work around that, the idea is simply to annotate your new annotation with the “super” annotation, which becomes a meta-annotation.

Whenever you use your own annotation carrying a meta-annotation, the tools will actually treat it as if it were the meta-annotation itself.

This kind of meta-inheritance helps centralize the coupling to the external annotation in one place, while making the semantics of your own annotation more precise and meaningful.

Example in Spring annotations, with the @Component annotation (this also works with the @Qualifier annotation):

Create your own custom stereotype annotation that is itself annotated with @Component:

@Component
public @interface MyComponent {
   String value() default "";
}

@MyComponent
public class MyClass...

Another example in Guice, with the Binding Annotation:

@BindingAnnotation
@Target({ FIELD, PARAMETER, METHOD })
@Retention(RUNTIME)
public @interface PayPal {}

// Then use it
public class RealBillingService implements BillingService {
  @Inject
  public RealBillingService(@PayPal CreditCardProcessor processor,
      TransactionLog transactionLog) {
    ...
  }
}

Refactoring-proof values

Prefer values that are robust to refactorings rather than String literals. MyClass.class is better than “com.acme.MyClass”, and enums are also encouraged.

Example in the Hibernate annotations documentation:

@ManyToOne( cascade = {CascadeType.PERSIST, CascadeType.MERGE}, targetEntity=CompanyImpl.class )

And another example in the Guice documentation:

@ImplementedBy(PayPalCreditCardProcessor.class)
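
On the declaring side, this simply means typing the annotation elements with Class or enum types rather than String. A minimal sketch, with hypothetical names:

// Hypothetical annotation whose elements are typed, hence robust to rename refactorings
@interface BoundTo {
  Class<?> target();                   // better than "com.acme.CompanyImpl" in a String
  CascadeStyle[] cascade() default {}; // enum constants instead of free-text flags
}

enum CascadeStyle { PERSIST, MERGE }

class CompanyImpl { }

// Usage: the compiler and the IDE both track these values across refactorings
@BoundTo(target = CompanyImpl.class, cascade = { CascadeStyle.PERSIST, CascadeStyle.MERGE })
class Partnership { }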

Configuration Precedence rule

Convention over Configuration and Sensible Defaults are two existing patterns that make a lot of sense with respect to using annotations as part of a configuration strategy. Having no need to annotate is way better than having to annotate for little value.

Annotations are by nature embedded in the code, hence they are not well suited for every case of configuration, in particular when it comes to deployment-specific configuration. The solution is of course to mix annotations with other mechanisms, and to use each mechanism where it is most appropriate.

The following approach, based on a precedence rule where each mechanism overrides the previous one, appears to work well:

Default value < Annotation < XML < programmatic configuration

For example, the default values could be suited for unit testing, while the annotations define all the stable configuration, leaving the other mechanisms to configure deployments at the various stages, such as QA or production environments.
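
Here is a sketch of how a tool could resolve a single setting along that precedence chain; all names are made up for illustration:

import java.util.Map;
import java.util.Optional;

// Hypothetical resolver: each mechanism overrides the previous one
public class SettingResolver {

  String resolve(String key,
                 String defaultValue,
                 Optional<String> fromAnnotation,
                 Map<String, String> fromXml,
                 Map<String, String> fromProgrammaticConfig) {
    String value = defaultValue;                 // 1. sensible default
    if (fromAnnotation.isPresent()) {
      value = fromAnnotation.get();              // 2. annotation overrides the default
    }
    if (fromXml.containsKey(key)) {
      value = fromXml.get(key);                  // 3. XML overrides the annotation
    }
    if (fromProgrammaticConfig.containsKey(key)) {
      value = fromProgrammaticConfig.get(key);   // 4. programmatic configuration wins
    }
    return value;
  }
}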

This principle is common (Spring and Java EE 6, among others), for example in JPA:

The concept of configuration by exception is central to the JPA specification.

Conclusion

This post is mostly a notepad of various patterns on how to use annotations, for instance when creating tools that process annotations, such as the Annotation Processing Tool in Java 5 and the Pluggable Annotation Processors in Java 6.

Don’t hesitate to contribute better pattern names, additional patterns and other examples of use.

EDIT: A related previous post, with a focus on how annotations can lead to coupling, hence dependencies.

Pictures Creative Commons from Flickr, by ninaksimon and Iwan Gabovitch.

Read More

Patterns express intents

Patterns represent a couple (intent, solution); sometimes they refer to a solution, but more often they essentially represent an intent, independently of its solution.

Sometimes the solution part of patterns includes a trick or a workaround to overcome the limits of a language, but patterns cannot be reduced to that trick. Indeed, a very important role of patterns (not only design patterns but patterns in general) is that they represent stereotypes of intents.

A matter of intent

Therefore, it does not really matter if the Strategy pattern can be expressed using a Java interface, a C++ functor or a first-class function: it remains a Strategy because this is just what we want: “Strategy lets the algorithm vary independently from clients that use it.”

Is the intent of this book holder clear?

Another similar pattern is the Command pattern, whose intent is: “Encapsulate a request as an object, thereby letting you parameterize clients with different requests, queue or log requests, and support undoable operations.” Here the intent talks about ‘objects’ because it was written for an object-oriented context, but it can easily be made generic if you think ‘handle on a function’ (or closure, etc.) instead of object. Again, even if first-class functions such as delegates in C# can achieve this goal, they do not replace the need to declare the precise intent: “you want to parameterize clients with different requests, queue or log requests, and support undoable operations.” So in some sense, just using a functor without declaring that the intent is to do a Strategy or a Command is like using untyped variables: you are supposed to know what you are doing, but it is implicit.*

Yet another example with the Visitor pattern and its intent: “Represent an operation to be performed on the elements of an object structure. Visitor lets you define a new operation without changing the classes of the elements on which it operates.” This is typically achieved through double-dispatch in languages lacking multimethods, but regardless of how it is implemented the intent remains, and this is what matters most.
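
For the record, here is what that double dispatch typically looks like in Java; a bare-bones sketch with made-up element names:

// Double dispatch: each element calls back the visitor with its concrete type
interface Visitor {
  void visit(Circle circle);
  void visit(Square square);
}

interface Shape {
  void accept(Visitor visitor);
}

class Circle implements Shape {
  public void accept(Visitor visitor) {
    visitor.visit(this); // dispatch on the concrete element type
  }
}

class Square implements Shape {
  public void accept(Visitor visitor) {
    visitor.visit(this);
  }
}

// A new operation is added without touching Circle or Square
class AreaPrinter implements Visitor {
  public void visit(Circle circle) { System.out.println("circle area"); }
  public void visit(Square square) { System.out.println("square area"); }
}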

Generic vs. specialized intents

For example, here is the intent of the generic Proxy pattern as defined in a paper by James Noble:

The Proxy pattern is used to “Provide a surrogate or placeholder for another object (the Subject) to control access to it”. A Proxy object provides the same interface as the original Subject object, but intercepts any messages directed to the Subject. A Proxy object can therefore be used in place of the Subject by a client which is designed to access the Subject, without the client being aware the Subject has been replaced by a Proxy.

This intent can then be specialized for various purposes, leading to several specialized patterns:

What's your intent if you buy that? (yes this is brand new furniture for sale)
  • Remote Proxy: provides a local representative for an object that is only available on a remote machine
  • Protection Proxy: checks access rights before delegating to the original object
  • Virtual Proxy: creates an expensive object on demand, only when it is actually needed
  • Cache Proxy: a representative that remembers the results of calling an object’s methods, to avoid directing subsequent calls to that object again
  • Counter Proxy (smart pointer management), etc.

We say that these patterns are specializations of the Proxy pattern. The main Proxy pattern introduces the common solution—providing a placeholder for an object. Every specialized proxy pattern is a special kind of Proxy: A “Protection Proxy” is a special kind of “Proxy”. The specialization here only deals with the Intent part of the patterns.
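
As a reminder of the common solution part, a Protection Proxy in Java might look like this minimal sketch (the names are illustrative):

// The Subject interface, shared by the real object and its proxy
interface AccountService {
  void withdraw(double amount);
}

class RealAccountService implements AccountService {
  public void withdraw(double amount) {
    // ... actual withdrawal logic
  }
}

// Protection Proxy: same interface, but checks access rights before delegating
class ProtectedAccountService implements AccountService {
  private final AccountService subject;
  private final boolean callerIsAuthorized;

  ProtectedAccountService(AccountService subject, boolean callerIsAuthorized) {
    this.subject = subject;
    this.callerIsAuthorized = callerIsAuthorized;
  }

  public void withdraw(double amount) {
    if (!callerIsAuthorized) {
      throw new SecurityException("Access denied");
    }
    subject.withdraw(amount); // the client is unaware it is talking to a proxy
  }
}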

Conclusion

Now that functional languages are getting more attention, it has become fashionable to question the usefulness of patterns: “Scala does that without the need for patterns”. I agree Scala is great, but I disagree with this argument. Patterns are first of all signs to denote intents, even if they can do more.

References

Patterns as Signs, James Noble and Robert Biddle, Victoria University of Wellington, New Zealand

Classifying Relationships Between Object-Oriented Design Patterns, James Noble, Microsoft Research Institute, Macquarie University

* By the way, how do you achieve “undoable operations” with first-class functions in an elegant way? This would require always passing two functions together: do() and undo()
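
One way to keep those two functions "always together" is simply to pair them in a tiny object, which brings us back to a Command. A minimal Java sketch, with invented names:

// A command as a pair of first-class functions that travel together
class UndoableCommand {
  private final Runnable doAction;
  private final Runnable undoAction;

  UndoableCommand(Runnable doAction, Runnable undoAction) {
    this.doAction = doAction;
    this.undoAction = undoAction;
  }

  void execute() { doAction.run(); }
  void undo() { undoAction.run(); }
}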

Read More

The Self-descriptiveness pattern

The Self-Descriptiveness pattern can save your life many times in the course of any software project. The power behind it is to put the computer to work, rather than yourself.

What is Self-Descriptiveness?

Self-Descriptiveness is a property of any system that is able to describe itself with no external help. Just ask it “describe yourself” and it will. In other words, it is a system whose documentation is embedded in itself.

This pattern is essential, yet it is not often stated explicitly. I imagine it is too obvious for those who appreciate it, whereas others do not see its benefits and focus instead on how much memory or bandwidth they could save.

You can find this pattern in many areas:

Database

Tables in many databases are self-descriptive, thanks to their definitions and to system tables. This makes it possible to query the database for its own schema, without any other documentation or explanation. It also enables tools to browse a database just by connecting to it. Databases try to be user-friendly (SQL), so here Self-Descriptiveness also means affordance for users.
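
In Java, for instance, the standard JDBC metadata API is enough to ask a database to describe its own tables. A minimal sketch; the JDBC URL and credentials are placeholders:

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class SchemaBrowser {
  public static void main(String[] args) throws Exception {
    // Placeholder connection details; any JDBC driver will do
    Connection connection = DriverManager.getConnection("jdbc:h2:mem:demo", "sa", "");
    DatabaseMetaData metaData = connection.getMetaData();

    // Ask the database to describe its own tables
    ResultSet tables = metaData.getTables(null, null, "%", new String[] { "TABLE" });
    while (tables.next()) {
      System.out.println(tables.getString("TABLE_NAME"));
    }
    connection.close();
  }
}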

XML

XML is one of the most famous examples of a self-descriptive format. The use of named tags around values to describe them is supposed to enable humans, or even tools, to read and extract their meaning with no prior knowledge of the exact format. The ordering could vary, elements may be missing, and it would remain readable. In this case Self-Descriptiveness means robustness against variations, perhaps in order to be more future-proof.

This key holder is self-descriptive

Spreadsheets, associative arrays

Spreadsheets with column headers, and Maps and associative arrays in general, are other low-tech forms of self-descriptiveness that we use all the time. You don’t have to remember what each column represents (or each slot in a Map): the headers (or keys) tell you directly. Here Self-Descriptiveness means convenience for the developers.

Reflection

Reflection, built into languages such as Java, enables the software to query its own structure. It then becomes possible to introspect each class to find out which fields and methods it has, which class it extends and which interfaces it implements. The metamodel of Java makes class definitions self-descriptive. This feature is essential to load code at runtime that was not known at compile time. Here Self-Descriptiveness means extensibility.
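
A few lines of Java are enough for the code to tell its own structure; a minimal sketch (the helper class name is made up):

import java.lang.reflect.Field;
import java.lang.reflect.Method;

public class ClassDescriber {
  public static void describe(Class<?> type) {
    System.out.println("Class: " + type.getName());
    System.out.println("Extends: " + type.getSuperclass());
    for (Class<?> implemented : type.getInterfaces()) {
      System.out.println("Implements: " + implemented.getName());
    }
    for (Field field : type.getDeclaredFields()) {
      System.out.println("Field: " + field.getName());
    }
    for (Method method : type.getDeclaredMethods()) {
      System.out.println("Method: " + method.getName());
    }
  }
}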

Analysis patterns

This leads to the Knowledge Level pattern (also known as Metamodel), which uses objects to describe the behavior of other objects. The Operation Level becomes self-descriptive thanks to its Knowledge Level. This is typically how databases and virtual machines implement their Self-Descriptiveness. The Knowledge Level pattern is key to achieve Self-Descriptiveness for systems.

The Quantity pattern is a simple yet very powerful form of Self-Descriptiveness for your ordinary project, especially in the Domain Layer. The idea is to keep the unit attached to the value. In the case of the Money pattern, the unit is the currency. If your application deals with percents, by all means introduce from the beginning a Percentage class that follows the Quantity pattern. This simple class has the potential to eliminate every percent-related bug! In the finance domain, interest rates, usually expressed in percents but actually in percent per year, deserve their own Quantity class as well, distinct from the usual percentage. Again, this simple decision will ensure that no one ever confuses one for the other. You could even go further and type each kind of interest rate by its destination, either a coupon rate or a yield, and therefore have the computer verify that they are never confused.
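
As an illustration, here is a deliberately minimal sketch of such a Percentage quantity (a hypothetical class, not taken from any library):

// Quantity pattern: the value carries its unit (here, percent) with it
public final class Percentage {
  private final double valueInPercent; // 7.35 means 7.35%

  private Percentage(double valueInPercent) {
    this.valueInPercent = valueInPercent;
  }

  public static Percentage ofPercent(double percent) { return new Percentage(percent); }
  public static Percentage ofFraction(double fraction) { return new Percentage(fraction * 100); } // 0.05 -> 5%
  public static Percentage ofBasisPoints(double bp) { return new Percentage(bp / 100); } // 35bp -> 0.35%

  public Percentage plus(Percentage other) {
    return new Percentage(valueInPercent + other.valueInPercent);
  }

  @Override
  public String toString() {
    return valueInPercent + "%";
  }
}

An InterestRate class, in percent per year, would then be a separate type, so that the compiler itself prevents mixing the two up.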

Be explicit!

Self-Descriptiveness could be generalized into one principle: “Be explicit!”. This means that everything implicit must be identified and made explicit. Your application only deals with one currency? Make the currency explicit anyway. Your time periods are all a whole number of years? Make the time unit (years) explicit anyway. All your data come from Reuters? Add a “Source” field to track the origin of the data anyway. Even if YAGNI, the consistency of your domain matters. And perhaps you are gonna need it after all!

This chair is self-descriptive, and explicit!

Self-Descriptiveness is obviously very precious for debugging and maintenance. First you reduce the possibility of bugs: you can sum 0.05 + 2% + 35bp (basis points are one-hundredths of a percent) and still get a valid result (7.35%) if you use Quantity objects for them. Then debugging becomes a breeze: everything tells its story, especially if you have written judicious toString() methods.

Optimization on the other hand appears to dislike Self-Descriptiveness. XML is very verbose, a Quantity object consumes more memory than just the primitive value inside, etc. In most cases, it is not worth compromising Self-Descriptiveness for the sake of optimization. And even when you need real performance, there are solutions to retain the benefits of Self-Descriptiveness without its cost. XML compression works very well, associative maps can sometimes be implemented as random access in an array, etc. And extreme needs can be satisfied with extremely sophisticated solutions*.

To be fully honest, Self-Descriptiveness can hardly be fully achieved in practice, as explained by Mark Baker in his blog, but your project deserves as much Self-Descriptiveness as you can possibly afford!

*For instance the FIX protocol, used for financial data feeds, is a very old key-value format. It is not really self-descriptive, as the keys are numbers, not labels. However, just like self-descriptive formats, the keys impede efficiency: they consume bandwidth for almost nothing. So instead of changing the format in favour of efficiency, the FAST protocol was invented. This optimization, done by the computer rather than by developers, extracts the values in real time using a message template, compresses them and sends them in a fixed order, in other words in a fully implicit way. On reception the message is automatically rebuilt against the template. More details about FIX and FAST can be found here and here. This is a fairly sophisticated solution.

Pictures taken at the Milano Furniture Fair 2009.

Read More

Software design as a matter of economy

Economy is the management of limited resources. In software development the limited resources (*) are:

  • memory of the developer: names of types, method signatures, tricks to know
  • time available to learn or discover new things

Therefore, every design decision should attempt to reach an optimal balance between the memory and the time consumed in the developer’s head.

Memory effort for the developers is reduced with:

  • less abstract types to know (make interfaces more generic, use more polymorphism, favor generic approaches)
  • less methods in interfaces (minimal interfaces)
  • logical system of conventions: classes/methods naming (idioms)
  • frequent use of the same set of concepts (patterns)
  • sharing of the same ubiquitous language between business conversations and the technical model (sharing words as resources)

Time to learn is reduced with:

  • more expressive names, dedicated to the case at hand (less generic)
  • humane interfaces (easier to find how to use them)
  • value objects, that bring explicit names even for simple primitive-like things
  • favor specialized classes and interfaces (that reflect the concrete elements of the domain) rather than generic interfaces
  • use practices, idioms and patterns already well known within the team

As usual, the optimum has to balance opposing forces, and will lie somewhere between the two extreme approaches: smaller and fully generic but totally unexpressive, or fully specialized, hence with some duplication of ideas but more expressive.

By making more concepts of an application sub-types of a few interfaces, we obviously reduce the memory effort needed to deal with the code base: once created, instances of various types can be used through the same interface. Design patterns that propose that their participants adhere to some existing interface (think of the Composite, Decorator, Proxy and Object Adapter) also encourage this trend.

On the other end, there are design patterns that definitely make a design more costly for both limited resources, typically by introducing extra interfaces with no gain in expressivity. To discuss this case we need to introduce an extra quality criterion on top of our limited resources: the need for extensibility or flexibility. In this discussion, design decisions need to consider three antagonistic forces: the memory effort in the developer’s head, the time needed by the developer to learn the code base, and the need for flexibility.

Every developer who feels concerned about maintainability or teamwork should be aware of this trade-off in order to address it explicitly rather than just by gut feeling. For instance, the more inexperienced the coworkers, the less generic a design should be if they have to support it without help, but opportunities to rationalize the design will be wasted, hence some loss of agility. On the other hand, one can decide to go more generic anyway while doing something to improve the expressivity of the code, e.g. by introducing empty, useless but expressive sub-interfaces just for the sake of expressivity (Taxonomy), or by making an effort on documentation and training.

This approach of explicitly dealing with the memory and time consumed in the developer’s head may suggest new metrics for design quality. I do not claim to propose good metrics here, but here are some ideas:

  • number of different interfaces (or number of fully qualified method signatures) relative to the size of the code base (maybe with a smaller weighting for constructors), as a way to estimate how much memory is needed to stay on top of the code base
  • proportion of domain-specific names (e.g. “Tax”, “Shipping”) within the code base, as a way to estimate how specialized the code is
  • proportion of abstract words from a corpus (such as “data” or “processor”) within the code base, as a way to estimate how generic the code is
  • amount of abbreviations, and the Flesch reading-ease test applied to type and method names, as a way to estimate how fast the code base can be discovered
  • number of explicitly mentioned idioms and patterns (either through Javadoc documentation or obvious naming conventions) relative to the size of the code base

(*) We assume here that runtime efficiency is not a constraint since this discussion focuses on software development methodology and human to machine aspects.

Read More

Sharing a common architecture or vision within a team

What do you think about when you hear the word “architecture” applied to software? Fowler defines it in “Who Needs an Architect?”: “In most successful software projects, the expert developers working on that project have a shared understanding of the system design. This shared understanding is called ‘architecture.’”

However for most people the word “architecture” comes full of middleware connotations, hence I consider this word now inappropriate. Let us consider here the concept of a shared “vision” of the design.

First it has to be a shared vision of the domain, for instance the interest rate swaps trading business. Words must have only one meaning, known by everyone, from IT, marketing or support. A shared glossary is key for that.

Then the IT team must share a consistent set of practices: consistent between developers, and consistent between the practices themselves. This also means that the set of practices de facto excludes other practices that conflict with it, and this has to be clearly acknowledged.

Even though any set of practices can be considered arbitrary, everybody in a team must adhere to it, preferably wholeheartedly. Of course the more accepted and documented the practices, the easier it is for everybody to adhere to them.

For example, a team can have a set of practices that includes Agile programming (be prepared for change, no big design up front), automated testing (unit tests, stress tests), domain-driven analysis as devised by Eric Evans, the use of proven solutions when they exist (hence design and analysis patterns from the GoF, POSA, Fowler, Doug Lea…), a bit of functional-programming style (favour immutability and side-effect-free functions, pipes and filters…), and the Eclipse default coding conventions.

Some practices that also belong to the shared set might not be directly documented, though they derive from articles, papers, blogs and some personal opinions. Examples of such custom practices that I like a lot include “Align the code with the business” and “Think options, not solutions”.

“Align the code with the business” applies at the smallest level, i.e. at the class and method level. Every bit of the application must make sense with respect to the domain considered. A class must reflect some relevant concept of the domain, concrete or more abstract, not of the requirement. The rationale is that while requirements change all the time, the domain concepts are very stable; they are not true or false, they just exist!

“Think options, not solutions” refers to the very basics of Agile and XP: “be prepared for change”. It also acknowledges the fact that most of the code in a well-done software project provides standalone, encapsulated and reusable building blocks that are then assembled together with a smaller amount of glue code, which can even be reduced by using a dependency-injection framework such as Spring. Typically, a concept from the domain, such as an interpolation algorithm in finance, translates into an Interpolation interface along with several alternate implementations. Most of the time, only one implementation out of the ten possible is really used at a time. However the option to use the other nine alternatives is there. This also means that we can assemble one release of the software using a subset of its building blocks, and another release using another subset, at the same time. One codebase, several possible applications: this is the goal.
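
In code, such an option typically takes the shape of one interface with several interchangeable implementations. A sketch with illustrative names:

// One domain concept, several alternate implementations: each one is an "option"
public interface Interpolation {
  double valueAt(double x, double[] xs, double[] ys);
}

class LinearInterpolation implements Interpolation {
  public double valueAt(double x, double[] xs, double[] ys) {
    // find the surrounding points and interpolate linearly between them
    for (int i = 1; i < xs.length; i++) {
      if (x <= xs[i]) {
        double t = (x - xs[i - 1]) / (xs[i] - xs[i - 1]);
        return ys[i - 1] + t * (ys[i] - ys[i - 1]);
      }
    }
    return ys[ys.length - 1];
  }
}

class ConstantInterpolation implements Interpolation {
  public double valueAt(double x, double[] xs, double[] ys) {
    // "flat" alternative: return the value of the closest previous point
    for (int i = xs.length - 1; i >= 0; i--) {
      if (x >= xs[i]) {
        return ys[i];
      }
    }
    return ys[0];
  }
}

Only one implementation may be wired into a given release (through Spring, for instance), but the other implementations remain available as options.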

In addition to the explicit practices, the existing code base implicitly shows practices that can either be reused just for the sake of consistency, or replaced as widely as possible when they conflict. Prior art, when of good-enough quality, can form a skeleton for a shared vision of how to create software. Again this is arbitrary, but consistent, and consistent design is valuable in itself, often more so than the design decisions themselves.

Of course there is a point where more shared practices become counter-productive, for instance when there are too many of them (more effort for newcomers to learn them, a feeling of losing freedom, lack of flexibility). However, when they are standard enough, or becoming standard, developers should consider acquiring these practices a long-term, beneficial investment.

Initially published on Jroller on Monday January 21, 2008

Read More

Choosing good names for your Java classes

With Object-Oriented programming we often have many classes, therefore it is really important to name them well.

It’s amazing how much time I can spend searching for good names for classes, and I do think it is worth doing well; after all, naming is all about making the code easy to read and to understand, which is a major concern in professional programming.

However it is not easy to find good names, and by the way, what does “good” naming mean? I claim that a name is good when most people understand it the same way. This already gives a valuable way to check a name: ask your colleagues what they think of it, or what they would suggest instead. In pair programming this obviously happens all the time, and it sure helps.

But how do you find good names in the first place? First, domain analysis gives good names, because they are names everybody agrees on [1].

Then, whenever you use design patterns (or analysis, UI or J2EE patterns), these patterns suggest rather standard prefixes or suffixes for your class and method names. For example, if you implement the State pattern, one would expect classes with names like “MyDomainThingState”. Of course this is not a rule, but unless you have a good reason to do otherwise, you’d better follow the most standard way.

Also, any well-known convention or API can be used as a naming guideline, such as the Java API, Commons Collections and some major open-source projects. Any Iterator should be called “SomeIterator”, and a class whose purpose is to filter objects is a “SomePredicate”. Naming interfaces that define a capability with an “-able” suffix is very common usage, as is the use of the plural to name a class that operates on a type hierarchy, such as Collections (note the plural) which operates over several Collection instances.

Of course Hungarian Notation should be avoided (do not prefix interfaces with “I”), as nobody should ever have to care whether a type is an interface or not in order to use it; only the factory has to know about this implementation detail.

Some words do not mean anything precise, like “data”, “object”, “wrapper”… As a result they do not bring any benefit; they just get in the way. Try to get rid of them or replace them with words from the domain (with the Rename refactoring this is now very easy) and you will be surprised by the level of expressiveness you gain.

Often you think of a name, but for many reasons you cannot use it: maybe you already used it to mean something else, or it suggests something wrong to some people, or it does not suggest anything, or you really do not like it… Then I very often use an online dictionary and a thesaurus to find synonyms and related words.

A random search in a web search engine is also a good way to go deeper into the semantic context of a name, in order to find similar and related words that may be used for your naming. This is particularly relevant in specific domains, like finance, where dictionaries are not up to date or relevant. By seeing words already used in their context, you can use them more accurately. For example, in finance we can talk about a group of products as a Package, a Bundle, or a Strategy. The latter is not good because it could mislead towards the Strategy pattern. A quick web search can help to choose between Package and Bundle, according to the very case you are dealing with.

Package names also help in this topic. Do not forget that a class does not have to tell everything in its name: its package name is part of the full name and gives a context to the short class name. As a consequence, it is not a problem to have the same short class name several times in a project, as long as they are not used together too often. Typically, if you have an abstract factory for two families of things, each thing in each family can have the same name, each in its dedicated package.

[1] See Ubiquitous Language, by Eric Evans – www.domaindrivendesign.org

UPDATE: An interesting reader’s comment was destroyed when cleaning spam: it suggested using Eclipse Ctrl+Shift+T / Ctrl+Shift+R to see lists of existing class names. Apologies for that.

UPDATE: A class name can reflect what the class really is, or what it is used for. In many cases, naming a class according to what it is makes it more reusable, in which case the instance name (field or variable name) tells its purpose in its particular use.

Initially published on Jroller on Thursday April 27, 2006

Read More