Make Money vs Reduce Risks dichotomy

In sports, football for example, players have only one goal in mind: score, again and again, as often as possible. Close to them, but not too close, referees have only one goal in mind: quickly detect every violation of the rules of the game, and penalize it.

The players know the rules, yet we still need an antagonistic role, the referees, to keep the game fair. It is never perfect, but it is not plain chaos either.

Many mature businesses have chosen a similar structure. There is one role to make money, as much money as possible, and another role to keep risk under an acceptable level.


In finance, this schema is visible at several levels. Traders and salespeople in the front office focus on making money, while officers in the risk department closely monitor their activity to check that they don’t go too far. We hear about it very loudly in the news when traders go too far. We don’t hear much when the risk people go too far, but reducing risk usually hurts profitability in the short term.

This schema occurs again between banks, which want to make money, and the regulators, who are supposed to protect the country and the customers. Whether the regulators do a good job or not is not the point of this article; my point is that there is a common business pattern there.

When there is a common business pattern, and when the business is heavily supported by software systems, does this mean there is a corresponding pattern in the software itself? I believe there is, a bit like a generalized Conway’s Law. The corresponding software pattern is: when the business has an obvious antagonism like “Making Money vs Reducing Risk”, then it probably calls for two distinct Bounded Contexts in the corresponding software.

This dichotomy is not a rule; it is just a heuristic to suggest there may be a need for two distinct Bounded Contexts.

The question of who the key decision maker is probably shapes everything. I learnt that a few years ago in a course with @ziobrando. In particular, when two management hierarchies are involved, even if their visions coincide right now, it’s unlikely that both visions will evolve the same way over time. This is a reason to split the solution into two Bounded Contexts that will evolve independently. So if you have a Direction of Trading and a Direction of Risk, you’re in this situation.

Modeling in the two contexts

Making money typically involves good commercial relationships and a competitive pricing expertise, plus enough speed to react to opportunities.

Software systems for that typically manage the business one deal at a time. They often need to be real-time, or fast enough not to lose impatient customers. Sometimes we may even accept trading calculation accuracy for speed. For example we may use floating-point calculations instead of BigDecimals, or an approximation instead of the exact formula.

Software systems to support making money need to help people doing the sales to be fast, for example with rich defaulting of the input values.

By contrast, software for officers who want to reduce or control risk often computes risk metrics out of a lot of deals. It may be fraud analysis or stress tests simulating market crises. It is often just computing the overall risk taken by summing up the numbers from each deal. Some do that in real time too, but usually it can accommodate much slower paces like on-demand, daily, weekly or even monthly.
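As a rough illustration of the contrast, here is a minimal sketch in Java. Every name in it (Deal, QuickPricer, ExposureReport) and the fee formula are hypothetical and only there for the example: the front-office side quotes one deal at a time with fast double arithmetic, while the risk side sums exact BigDecimal notionals over many deals.

```java
import java.math.BigDecimal;
import java.util.List;

// Hypothetical deal, reduced to what both contexts need for the example
final class Deal {
    final String counterparty;
    final BigDecimal notional;

    Deal(String counterparty, BigDecimal notional) {
        this.counterparty = counterparty;
        this.notional = notional;
    }
}

// "Make money" context: one deal at a time, speed over exactness
final class QuickPricer {
    // Illustrative formula only: fee = notional * rate, in floating point
    static double quoteFee(Deal deal, double rate) {
        return deal.notional.doubleValue() * rate;
    }
}

// "Reduce risk" context: aggregates many deals with exact arithmetic
final class ExposureReport {
    static BigDecimal totalExposure(List<Deal> deals, String counterparty) {
        BigDecimal total = BigDecimal.ZERO;
        for (Deal deal : deals) {
            if (deal.counterparty.equals(counterparty)) {
                total = total.add(deal.notional);
            }
        }
        return total;
    }
}
```

The two classes share nothing but the deal data, which is the point: each side can evolve its own model and its own trade-offs.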


Sometimes the competition is so tight that risk control becomes the key differentiator to make money between competitors. In this situation risk control has become another miniature sub-domain within the domain of selling and pricing. Still, it has its own risk-oriented perspective on the business, and it is like a delegation of responsibility from the risk officers to the front office people and their trading bots. Even in this situation there will also be a full-featured domain of risk control outside, with the corresponding software in its own Bounded Context.

A developer example

DevOps is the classical example in software development: developers want to release often to deliver more value. However ops people know that each release comes with risks for the production so they traditionally prefer to release less frequently. “No release, no risk” would be ideal for them.


In this scheme, developers and ops teams use different tools, and don’t monitor the same indicators. When they get closer as in DevOps, the ops usually delegate some risk control to the development team and their automated testing tools, but they keep their own expertise and their specific tooling.

Many thanks to @ziobrando and @mathiasverraes for the early feedback and some complements incorporated into the text.


TDD Vs. math formalism: friend or foe?

It is not uncommon to oppose the empirical process of TDD, together with its heavy use of unit tests, to the more mathematically based techniques, with the “formal methods” and formal verification at the other end of the spectrum. However I experienced again recently that the process of TDD can indeed help discover and draw upon math formalisms well-suited to the problem considered. We then benefit from the math formalism for an easier implementation and correctness.

It is quite frequent that maths structures, or more generally “established formalisms” as Eric Evans would say, are hidden everywhere in the business concepts we need to model in software.

Dates and how we take liberties with them for trading of financial instruments offer a good example of a business concept with an underlying math structure: traders of futures often use a notation like ‘U8’ to describe an expiry date like September 2018; ‘U’ means September, and the ‘8’ digit refers to 2018, but also to 2028, and 2038 etc. Notice that this notation only works for 10 years, and each code is recycled every decade.

The IMM trading floor in the early 70's (photo CME Group)

In the case of IMM contract codes, we only care about quarterly dates on:

  • March (H)
  • June (M)
  • September (U)
  • December (Z)

This yields only 4 possibilities for the month, combined with the 10 possible year digits, hence 40 different codes in total, over the range of 10 years.

How does that translate into source code?

As software developers we are asked all the time to manage such IMM expiry codes:

  • Sort a given set of IMM contract codes
  • Find the next contract from the current “leading month” contract
  • Enumerate the next 11 codes from the current “leading month” contract, etc.

This is often done ad hoc with a gazillion functions, one for each use-case, leading to thousands of lines of code that are hard to maintain because they involve parsing the ‘U8’ format every time we want to calculate something.

With TDD, we can now tackle this topic with more rigor, starting with tests to define what we want to achieve.

The funny thing is that in the process of doing TDD, the cyclic logic of the IMM codes struck me and strongly reminded me of the cyclic group Z/nZ. I had met this strange maths creature at school many years ago, and I had a hard time with it by the way. But now on a real example it was definitely more interesting!

The source code (Java) for this post is on Github.

Draw on established formalisms

Thanks to Google it is easy to find something even with just a vague idea of how it’s named, and thanks to Wikipedia, it is easy to find out more about any established formalism like Cyclic Groups. In particular we find that:

Every finite cyclic group is isomorphic to the group { [0], [1], [2], …, [n − 1] } of integers modulo n under addition

The Wikipedia page also mentions a concept of the product of cyclic groups in relation with their order (here the number of elements). Looks like this is the math-ish way to say that 4 possibilities for quarterly months combined with 10 possible year digits give 40 different codes in total.

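So what? It sounds like we could identify the set of the 4 months with a cyclic group, the set of the 10 year digits with another, and the full set of 40 codes taken in chronological order with a cyclic group of order 10 * 4 = 40 (even though the addition operation will not be called like that). Strictly speaking this last group is not the plain product of the first two: the month carries over into the year digit, like digits in a mixed-radix number. So what?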

Because we’ve just seen that there is an isomorphism between any finite cyclic group and the cyclic group of integers of the same order, we can just switch to the integer cyclic group logic (plain integers and the modulo operator) to simplify the implementation big time.

Basically the idea is to convert from the IMM code “Z3” to the corresponding ‘ordinal’ integer in the range 0..39, then do every operation on this ‘ordinal’ integer instead of the actual code. Then we can format back to a code “Z3” whenever we really need it.
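To make the idea concrete, here is a minimal sketch of such a conversion. It is not the code from the Github repository: the class ImmCode, its method names and the convention ordinal = yearDigit * 4 + monthIndex are my assumptions for the example.

```java
// Hypothetical value object: an IMM code stored as an ordinal in 0..39
final class ImmCode {
    private static final String MONTHS = "HMUZ"; // March, June, September, December
    private static final int ORDER = 40;         // 4 months * 10 year digits

    private final int ordinal;

    private ImmCode(int ordinal) {
        // floorMod keeps the value in 0..39, even for negative input
        this.ordinal = Math.floorMod(ordinal, ORDER);
    }

    // Parse a code such as "Z3" (assumed convention: yearDigit * 4 + monthIndex)
    static ImmCode parse(String code) {
        if (code == null || code.length() != 2) {
            throw new IllegalArgumentException("Invalid IMM code: " + code);
        }
        int month = MONTHS.indexOf(code.charAt(0));
        int year = Character.digit(code.charAt(1), 10);
        if (month < 0 || year < 0) {
            throw new IllegalArgumentException("Invalid IMM code: " + code);
        }
        return new ImmCode(year * MONTHS.length() + month);
    }

    int ordinal() {
        return ordinal;
    }

    // Move forward around the cycle, or backward with a negative count
    ImmCode plus(int quarters) {
        return new ImmCode(ordinal + quarters);
    }

    ImmCode next() {
        return plus(1);
    }

    // Format the ordinal back to the two-character code
    @Override
    public String toString() {
        return "" + MONTHS.charAt(ordinal % MONTHS.length()) + ordinal / MONTHS.length();
    }
}
```

With this in place, finding the next contract or enumerating the next 11 codes is plain integer addition modulo 40, and the 'U8' format is parsed in one place only.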

Do I still need TDD when I have a complete formal solution?

I must insist that I did not come to this conclusion that easily. The process of TDD was indeed very helpful not to get lost in every possible direction along the way. Even when you have found a formal structure that could solve your problem in one go, even in a “formal proof-ish fashion”, then perhaps you need fewer tests to verify the correctness, but you sure still need tests to think about the specification part of your problem. This is your gentle reminder that TDD is not about unit tests.

Partial order in a cyclic group

Given a list of IMM codes we often need to sort them for display. The problem is that a cyclic group has no natural total order: the ordering depends on where you are in time.

Let’s take the example of the days of the week, which also form a cycle: MONDAY, TUESDAY, WEDNESDAY…SUNDAY, MONDAY etc.

If we only care about the future, is MONDAY before WEDNESDAY? Yes, except if we’re on TUESDAY. If we’re on TUESDAY, MONDAY means next MONDAY hence comes after WEDNESDAY, not before.

Unfortunately, this is why we cannot just implement Comparable to take care of the ordering. Because the order depends on a reference IMM code, we need to resort to a Comparator that takes the reference IMM code in its constructor.

Once we map that situation to the cyclic group of integers, it becomes easy to shift both operands of the comparison so that the reference sits at 0, before comparing them in a safe (total order-ish) way. Again, this trick is made possible by the freedom to experiment given by the TDD tests. As long as we’re still green, we can go ahead and try any funky approach.
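Here is a minimal sketch of such a reference-aware Comparator, working directly on the two-character codes. The class name and the convention ordinal = yearDigit * 4 + monthIndex are assumptions for the example, not the actual code of the repository.

```java
import java.util.Comparator;

// Hypothetical comparator: orders IMM codes as seen from a reference code
final class ImmCodeComparator implements Comparator<String> {
    private static final String MONTHS = "HMUZ";
    private static final int ORDER = 40;

    private final int reference;

    ImmCodeComparator(String referenceCode) {
        this.reference = ordinalOf(referenceCode);
    }

    // Assumed convention: ordinal = yearDigit * 4 + monthIndex
    private static int ordinalOf(String code) {
        if (code == null || code.length() != 2) {
            throw new IllegalArgumentException("Invalid IMM code: " + code);
        }
        int month = MONTHS.indexOf(code.charAt(0));
        int year = Character.digit(code.charAt(1), 10);
        if (month < 0 || year < 0) {
            throw new IllegalArgumentException("Invalid IMM code: " + code);
        }
        return year * MONTHS.length() + month;
    }

    // Forward distance from the reference around the cycle, in 0..39
    private int shifted(String code) {
        return Math.floorMod(ordinalOf(code) - reference, ORDER);
    }

    @Override
    public int compare(String a, String b) {
        return Integer.compare(shifted(a), shifted(b));
    }
}
```

Sorting the codes U9, H0 and Z9 with Z9 as the reference puts Z9 first, then H0, then U9, which is read as the U9 of the next decade.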

Try it as a kata

This example is also a good coding kata that we’ve tried at work not long ago. Given a simple presentation of the format of an IMM contract code, you can choose to code the sort, find the next and previous code, and perhaps even optimize for memory (cache the instances, e.g. lazily) and speed (cache the toString() value, e.g. in the constructor) if you still have some time.

In closing

Maths structures are hidden behind many common business concepts. I developed a habit of looking for them whenever I can, because they always help us think, they help question our understanding of the domain problem (“is my domain problem really similar in some way to this structure?”), and of course because they often offer wonderful ready-made implementation hints!

Photo: CME Group


A touch of functional style in plain Java with predicates – Part 2

In the first part of this article we introduced predicates, which bring some of the benefits of functional programming to object-oriented languages such as Java, through a simple interface with one single method that returns true or false. In this second and last part, we’ll cover some more advanced notions to get the best out of your predicates.

Testing

One obvious case where predicates shine is testing. Whenever you need to test a method that mixes walking a data structure and some conditional logic, by using predicates you can test each half in isolation, walking the data structure first, then the conditional logic.

In a first step, you simply pass either the always-true or always-false predicate to the method to get rid of the conditional logic and to focus just on the correct walking on the data structure:

// check with the always-true predicate
final Iterable<PurchaseOrder> all = orders.selectOrders(Predicates.<PurchaseOrder> alwaysTrue());
assertEquals(2, Iterables.size(all));

// check with the always-false predicate
assertTrue(Iterables.isEmpty(orders.selectOrders(Predicates.<PurchaseOrder> alwaysFalse())));

In a second step, you just test each possible predicate separately.

final CustomerPredicate isForCustomer1 = new CustomerPredicate(CUSTOMER_1);
assertTrue(isForCustomer1.apply(ORDER_1)); // ORDER_1 is for CUSTOMER_1
assertFalse(isForCustomer1.apply(ORDER_2)); // ORDER_2 is for CUSTOMER_2

This example is simple but you get the idea. To test more complex logic, if testing each half of the feature is not enough you may create mock predicates, for example a predicate that returns true once, then always false later on. Forcing the predicate like that may considerably simplify your test set-up, thanks to the strict separation of concerns.
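As an example of such a mock predicate, here is a minimal sketch of one that returns true once, then always false. To keep it self-contained I assume Java 8's java.util.function.Predicate (method test) in place of the Guava Predicate (method apply) used elsewhere in this article; the class name is hypothetical.

```java
import java.util.function.Predicate;

// Hypothetical test double: accepts the first element it sees, rejects the rest
final class TrueOncePredicate<T> implements Predicate<T> {
    private boolean used = false;

    @Override
    public boolean test(T ignored) {
        if (used) {
            return false;
        }
        used = true;
        return true;
    }
}
```

Passing it to the method under test forces exactly one element through, whatever the data, so the test set-up needs no carefully crafted orders.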

Predicates work so well for testing that if you tend to do some TDD, I mean if the way you can test influences the way you design, then as soon as you know predicates they will surely find their way into your design.

Explaining to the team

In the projects I’ve worked on, the team was not familiar with predicates at first. However this concept is easy and fun enough for everyone to get it quickly. In fact I’ve been surprised by how the idea of predicates spread naturally from the code I had written to the code of my colleagues, without much evangelism from me. I guess that the benefits of predicates speak for themselves. Having mature APIs from big names like Apache or Google also helps convince people that it is serious stuff. And now with the functional programming hype, it should be even easier to sell!

Simple optimizations

This engine is so big, no optimization is required (Chicago Auto Show).

The usual optimizations are to make predicates immutable and stateless as much as possible, so that they can be shared with no threading concerns. This enables using one single instance for the whole process (as a singleton, e.g. as static final constants). The most frequently used predicates that cannot be enumerated at compile time may be cached at runtime if required. As usual, do it only if your profiler report really calls for it.

When possible a predicate object can pre-compute some of the calculations involved in its evaluation in its constructor (naturally thread-safe) or lazily.

A predicate is expected to be side-effect-free, in other words “read-only”: its execution should not cause any observable change to the system state. Some predicates must have some internal state, like a counter-based predicate used for paging, but they still must not change any state in the system they apply on. With internal state, they also cannot be shared, however they may be reused within their thread if they support reset between each successive use.
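A minimal sketch of such a counter-based predicate for paging, with the reset mentioned above. The class name and constructor parameters are hypothetical, and I assume Java 8's java.util.function.Predicate rather than the Guava interface to keep the example self-contained.

```java
import java.util.function.Predicate;

// Hypothetical paging predicate: accepts only the elements of one page
final class PagePredicate<T> implements Predicate<T> {
    private final int start; // index of the first accepted element
    private final int end;   // index just past the last accepted element
    private int seen = 0;    // internal state: not thread-safe, do not share

    PagePredicate(int pageIndex, int pageSize) {
        this.start = pageIndex * pageSize;
        this.end = start + pageSize;
    }

    @Override
    public boolean test(T ignored) {
        // Only the predicate's own counter changes, never the tested element
        int index = seen++;
        return index >= start && index < end;
    }

    // Call between successive uses within the same thread
    void reset() {
        seen = 0;
    }
}
```

The counter makes the result depend on the order of evaluation, so this only makes sense when the elements are walked sequentially.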

Fine-grained interfaces: a larger audience for your predicates

In large applications you find yourself writing very similar predicates for totally different types that share a common property, like being related to a Customer. For example in the administration page, you may want to filter logs by customer; in the CRM page you want to filter complaints by customer.

For each such type X you’d need yet another CustomerXPredicate to filter it by customer. But since each X is related to a customer in some way, we can factor that out (Extract Interface in Eclipse) into an interface CustomerSpecific with one method:

public interface CustomerSpecific {
   Customer getCustomer();
}

This fine-grained interface reminds me of traits in some languages, except it has no reusable implementation. It could also be seen as a way to introduce a touch of dynamic typing within statically typed languages, as it enables calling indifferently any object with a getCustomer() method. Of course our class PurchaseOrder now implements this interface.

Once we have this interface CustomerSpecific, we can define predicates on it rather than on each particular type as we did before. This helps leverage just a few predicates throughout a large project. In this case, the predicate CustomerPredicate is co-located with the interface CustomerSpecific it operates on, and it has a generic type CustomerSpecific:

public final class CustomerPredicate implements Predicate<CustomerSpecific>, CustomerSpecific {
  private final Customer customer;
  // valued constructor omitted for clarity
  public Customer getCustomer() {
    return customer;
  }
  public boolean apply(CustomerSpecific specific) {
    return specific.getCustomer().equals(customer);
  }
}

Notice that the predicate can itself implement the interface CustomerSpecific, hence could even evaluate itself!

When using trait-like interfaces like that, you must take care of the generics and change a bit the method that expects a Predicate<PurchaseOrder> in the class PurchaseOrders, so that it also accepts any predicate on a supertype of PurchaseOrder:

public Iterable<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
    return Iterables.filter(orders, condition);
}

Specification in Domain-Driven Design

Eric Evans and Martin Fowler wrote together the pattern Specification, which is clearly a predicate. Actually the word “predicate” is the word used in logic programming, and the pattern Specification was written to explain how we can borrow some of the power of logic programming into our object-oriented languages.

In the book Domain-Driven Design, Eric Evans details this pattern and gives several examples of Specifications which all express parts of the domain. Just like this book describes a Policy pattern that is nothing but the Strategy pattern when applied to the domain, in some sense the Specification pattern may be considered a version of predicate dedicated to the domain aspects, with the additional intent to clearly mark and identify the business rules.

As a remark, the method name suggested in the Specification pattern is: isSatisfiedBy(T): boolean, which emphasises a focus on the domain constraints. As we’ve seen before with predicates, atoms of business logic encapsulated into Specification objects can be recombined using boolean logic (or, and, not, any, all), as in the Interpreter pattern.
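A minimal sketch of what such a recombinable Specification could look like in Java 8. Only the method name isSatisfiedBy comes from the pattern; the combinators as default methods are my assumption for the example, not the code from the book.

```java
// Hypothetical minimal Specification with boolean combinators
interface Specification<T> {
    boolean isSatisfiedBy(T candidate);

    // Satisfied when both this and the other specification are satisfied
    default Specification<T> and(Specification<T> other) {
        return candidate -> isSatisfiedBy(candidate) && other.isSatisfiedBy(candidate);
    }

    // Satisfied when at least one of the two specifications is satisfied
    default Specification<T> or(Specification<T> other) {
        return candidate -> isSatisfiedBy(candidate) || other.isSatisfiedBy(candidate);
    }

    // Satisfied exactly when this specification is not
    default Specification<T> not() {
        return candidate -> !isSatisfiedBy(candidate);
    }
}
```

Each business rule stays a small named object, and composite rules read almost like the sentence the domain expert would say.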

The book also describes some more advanced techniques such as optimization when querying a database or a repository, and subsumption.

Optimisations when querying

The following are optimization tricks, and I’m not sure you will ever need them. But it is true that predicates are quite dumb when it comes to filtering datasets: they must be evaluated on every single element in a set, which may cause performance problems for huge sets. If the elements are stored in a database, retrieving every element just to filter them one after another through the predicate does not sound like a good idea for large sets…

When you hit performance issues, you start the profiler and find the bottlenecks. Now if calling a predicate very often to filter elements out of a data structure is a bottleneck, then how do you fix that?

One way is to get rid of the full predicate thing, and to go back to hard-coded, more error-prone, repetitive and less-testable code. I always resist this approach as long as I can find better alternatives to optimize the predicates, and there are many.

First, have a deeper look at how the code is being used. In the spirit of Domain-Driven Design, looking at the domain for insights should be systematic whenever a question occurs.

Very often there are clear patterns of use in a system. Though statistical, they offer great opportunities for optimisation. For example in our PurchaseOrders class, retrieving every PENDING order may be used much more frequently than every other case, because that’s how it makes sense from a business perspective, in our imaginary example.

Friend Complicity

Weird complicity (Maeght foundation)

Based on the usage pattern you may code alternate implementations that are specifically optimised for it. In our example of pending orders being frequently queried, we would code an alternate implementation FastPurchaseOrder, that makes use of some pre-computed data structure to keep the pending orders ready for quick access.

Now, in order to benefit from this alternate implementation, you may be tempted to change its interface to add a dedicated method, e.g. selectPendingOrders(). Remember that before you only had a generic selectOrders(Predicate) method. Adding the extra method may be alright in some cases, but may raise several concerns: you must implement this extra method in every other implementation too, and the extra method may be too specific for a particular use-case hence may not fit well on the interface.

A trick for using the internal optimization through the exact same method that only expects predicates is just to make the implementation recognize the predicate it is related to. I call that “Friend Complicity“, in reference to the friend keyword in C++.

/** Optimization method: pre-computed list of pending orders */
private Iterable<PurchaseOrder> selectPendingOrders() {
  // ... optimized stuff...
}

public Iterable<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
  // internal complicity here: recognize friend class to enable optimization
  if (condition instanceof PendingOrderPredicate) {
     return selectPendingOrders();// faster way
  }
  // otherwise, back to the usual case
  return Iterables.filter(orders, condition);
}

It’s clear that it increases the coupling between two implementation classes that should otherwise ignore each other. Also it only helps with performance if given the “friend” predicate directly, with no decorator or composite around.

What’s really important with Friend Complicity is to make sure that the behaviour of the method is never compromised: the contract of the interface must be met at all times, with or without the optimisation. It’s just that the performance improvement may happen, or not. Also keep in mind that you may want to switch back to the unoptimized implementation one day.

SQL-compromised

If the orders are actually stored in a database, then SQL can be used to query them quickly. By the way, you’ve probably noticed that the very concept of predicate is exactly what you put in the WHERE clause of a SQL query.

Ron Arad designed a chair that encompasses another chair: this is subsumption

A first and simple way to still use predicates yet improve performance is for some predicates to implement an additional interface SqlAware, with a method asSQL(): String that returns the exact SQL query corresponding to the evaluation of the predicate itself. When the predicate is used against a database-backed repository, the repository would call this method instead of the usual evaluate(Predicate) or apply(Predicate) method, and would then query the database with the returned query.

I call that approach SQL-compromised as the predicate is now polluted with database-specific details it should ignore more often than not.
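A minimal sketch of the SqlAware idea. The interface name and its asSQL() method come from the description above; everything else (the Order type, the table and column names, the OrderQueries helper) is hypothetical, and I assume Java 8's java.util.function.Predicate to keep the example self-contained.

```java
import java.util.Optional;
import java.util.function.Predicate;

// Interface named after the one described in the text; its exact shape is assumed
interface SqlAware {
    String asSQL();
}

// Hypothetical order type, reduced to what the example needs
final class Order {
    final String status;

    Order(String status) {
        this.status = status;
    }
}

// A predicate that can be evaluated in memory or translated to SQL
final class PendingOrderPredicate implements Predicate<Order>, SqlAware {
    @Override
    public boolean test(Order order) {
        return "PENDING".equals(order.status);
    }

    @Override
    public String asSQL() {
        // Hypothetical table and column names
        return "SELECT * FROM orders WHERE status = 'PENDING'";
    }
}

final class OrderQueries {
    // The SQL to run when the predicate is SqlAware, empty when the
    // repository has to fall back to in-memory filtering
    static Optional<String> sqlFor(Predicate<? super Order> condition) {
        if (condition instanceof SqlAware) {
            return Optional.of(((SqlAware) condition).asSQL());
        }
        return Optional.empty();
    }
}
```

The in-memory test and the SQL text must of course express the same condition, which is one more reason to keep such predicates few and well tested.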

Alternatives to using SQL directly include the use of stored procedures or named queries: the predicate has to provide the name of the query and all its parameters. Double-dispatch between the repository and the predicate passed to it is also an alternative: the repository calls the predicate on its additional method selectElements(this) that itself calls back the right pre-selection method findByState(state): Collection on the repository; the predicate then applies its own filtering on the returned set and returns the final filtered set.

Subsumption

Subsumption is a logic concept to express a relation of one concept that encompasses another, such as “red, green, and yellow are subsumed under the term color” (Merriam-Webster). Subsumption between predicates can be a very powerful concept to implement in your code.

Let’s take the example of an application that broadcasts stock quotes. When registering we must declare which quotes we are interested in observing. We can do that by simply passing a predicate on stocks that only evaluates true for the stocks we’re interested in:

public final class StockPredicate implements Predicate<String> {
   private final Set<String> tickers;
   // Constructors omitted for clarity

   public boolean apply(String ticker) {
     return tickers.contains(ticker);
   }
 }

Now we assume that the application already broadcasts standard sets of popular tickers on messaging topics, and each topic has its own predicates; if it could detect that the predicate we want to use is “included”, or subsumed in one of the standard predicates, we could just subscribe to it and save computation. In our case this subsumption is rather easy, by just adding an additional method on our predicates:

 public boolean encompasses(StockPredicate predicate) {
   return tickers.containsAll(predicate.tickers);
 }

Subsumption is all about evaluating another predicate for "containment". This is easy when your predicates are based on sets, as in the example, or when they are based on intervals of numbers or dates. Otherwise you may have to resort to tricks similar to Friend Complicity, i.e. recognizing the other predicate to decide if it is subsumed or not, in a case-by-case fashion.

Overall, remember that subsumption is hard to implement in the general case, but even partial subsumption can be very valuable, so it is an important tool in your toolbox.
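For the interval case mentioned above, a minimal sketch could look like this. The class RangePredicate is hypothetical, and I assume Java 8's java.util.function.Predicate rather than the Guava interface to keep the example self-contained.

```java
import java.util.function.Predicate;

// Hypothetical predicate accepting the integers of a closed interval [min, max]
final class RangePredicate implements Predicate<Integer> {
    private final int min;
    private final int max;

    RangePredicate(int min, int max) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        this.min = min;
        this.max = max;
    }

    @Override
    public boolean test(Integer value) {
        return value >= min && value <= max;
    }

    // True when every value accepted by 'other' is also accepted by this predicate
    boolean encompasses(RangePredicate other) {
        return min <= other.min && other.max <= max;
    }
}
```

Two comparisons on the bounds are enough here, because an interval contains another exactly when it contains both of its ends.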

Conclusion

Predicates are fun, and can enhance both your code and the way you think about it!

Cheers,

The single source file for this part is available for download cyriux_predicates_part2.zip (fixed broken link)


Patterns for using custom annotations

If you happen to create your own annotations, for instance to use with Java 6 Pluggable Annotation Processors, here are some patterns that I collected over time. Nothing new, nothing fancy, just putting everything into one place, with some proposed names.


Local-name annotation

Have your tools accept any annotation as long as its simple name (without the fully-qualified prefix) is the expected one. For example com.acme.NotNull and net.companyname.NotNull would be considered the same. This lets you use your own annotations rather than the ones packaged with the tools, in order not to depend on them.

Example in the Guice documentation:

Guice recognizes any @Nullable annotation, like edu.umd.cs.findbugs.annotations.Nullable or javax.annotation.Nullable.
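On the tool side, a minimal sketch of this matching with reflection could be as follows. The helper LocalNames and the sample types are hypothetical, and this is not how Guice itself is implemented; note that reflection only sees annotations with RUNTIME retention.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.AnnotatedElement;

// Your own annotation, in your own package, with the expected simple name
@Retention(RetentionPolicy.RUNTIME)
@interface NotNull {
}

@NotNull
final class Sample {
}

// Hypothetical helper: matches annotations by simple name only
final class LocalNames {
    static boolean hasAnnotationNamed(AnnotatedElement element, String simpleName) {
        for (Annotation annotation : element.getAnnotations()) {
            // Ignore the package: only the last segment of the name counts
            if (annotation.annotationType().getSimpleName().equals(simpleName)) {
                return true;
            }
        }
        return false;
    }
}
```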

Composed annotations

Annotations can have annotations as values. This allows for some complex and tree-like configurations, such as mappings from one format to another (from/to XML, JSON, RDBMS).

Here is a rather simple example from the Hibernate annotations documentation:

@AssociationOverride( 
   name="propulsion", 
   joinColumns = @JoinColumn(name="fld_propulsion_fk") 
)

Multiplicity Wrapper

Java (before version 8 and its @Repeatable meta-annotation) does not allow using the same annotation several times on a given target.

To work around that limitation, you can create a special annotation that expects a collection of values of the desired annotation type. For example, you’d like to apply several times the annotation @Advantage, so you create the Multiplicity Wrapper annotation: @Advantages (advantages = {@Advantage}).

Typically the multiplicity wrapper is named after the plural form of its enclosed elements.

Example in Hibernate annotations documentation:

@AttributeOverrides( {
   @AttributeOverride(name="iso2", column = @Column(name="bornIso2") ),
   @AttributeOverride(name="name", column = @Column(name="bornCountryName") )
} )
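For the @Advantage example mentioned above, a minimal sketch of the declarations could be as follows. The attribute names and the annotated class Product are my assumptions for the example.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// The annotation we would like to apply several times
@Retention(RetentionPolicy.RUNTIME)
@interface Advantage {
    String value();
}

// Multiplicity Wrapper: named after the plural of the wrapped annotation
@Retention(RetentionPolicy.RUNTIME)
@interface Advantages {
    Advantage[] advantages();
}

// Hypothetical annotated class
@Advantages(advantages = { @Advantage("fast"), @Advantage("cheap") })
final class Product {
}
```

A tool then reads the wrapper with Product.class.getAnnotation(Advantages.class).advantages() and iterates over the enclosed annotations.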


Meta-inheritance

It is not possible in Java for annotations to derive from each other. To work around that, the idea is simply to annotate your new annotation with the “super” annotation, which becomes a meta-annotation.

Whenever you use your own annotation with a meta-annotation, the tools will actually consider it as if it was the meta-annotation.

This kind of meta-inheritance helps centralize the coupling to the external annotation in one place, while making the semantics of your own annotation more precise and meaningful.
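From the point of view of a tool, honoring this kind of meta-inheritance can be sketched with reflection as follows. All the names here (Stereotype, Repository, MetaAnnotations and the sample classes) are hypothetical, and the lookup only goes one level up; real frameworks may do it differently.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.AnnotatedElement;

// Hypothetical "super" annotation that the tool looks for
@Retention(RetentionPolicy.RUNTIME)
@interface Stereotype {
}

// Custom annotation that "derives" from Stereotype by being annotated with it
@Stereotype
@Retention(RetentionPolicy.RUNTIME)
@interface Repository {
}

@Repository
final class CustomerRepository {
}

final class PlainClass {
}

final class MetaAnnotations {
    // True if the element carries the expected annotation directly,
    // or carries an annotation that is itself annotated with it
    static boolean isAnnotatedWith(AnnotatedElement element, Class<? extends Annotation> expected) {
        if (element.isAnnotationPresent(expected)) {
            return true;
        }
        for (Annotation annotation : element.getAnnotations()) {
            if (annotation.annotationType().isAnnotationPresent(expected)) {
                return true;
            }
        }
        return false;
    }
}
```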

Example in Spring annotations, with the annotation @Component, but also works with annotation @Qualifier:

Create your own custom stereotype annotation that is itself annotated with @Component:

@Component
public @interface MyComponent {
  String value() default "";
}

@MyComponent
public class MyClass...

Another example in Guice, with the Binding Annotation:

@BindingAnnotation
@Target({ FIELD, PARAMETER, METHOD })
@Retention(RUNTIME)
public @interface PayPal {}

// Then use it
public class RealBillingService implements BillingService {
  @Inject
  public RealBillingService(@PayPal CreditCardProcessor processor,
      TransactionLog transactionLog) {
    ...
  }
}

Refactoring-proof values

Prefer values that are robust to refactorings rather than String literals. MyClass.class is better than “com.acme.MyClass”, and enums are also encouraged.

Example in Hibernate annotations documentation:

@ManyToOne( cascade = {CascadeType.PERSIST, CascadeType.MERGE}, targetEntity=CompanyImpl.class )

And another example in the Guice documentation:

@ImplementedBy(PayPalCreditCardProcessor.class)

Configuration Precedence rule

Convention over Configuration and Sensible Defaults are two existing patterns that make a lot of sense with respect to using annotations as part of a configuration strategy. Having no need to annotate is way better than having to annotate for little value.

Annotations are by nature embedded in the code, hence they are not well-suited for every case of configuration, in particular when it comes to deployment-specific configuration. The solution is of course to mix annotations with other mechanisms and use each of them where they are more appropriate.

The following approach, based on precedence rule, and where each mechanism overrides the previous one, appears to work well:

Default value < Annotation < XML < programmatic configuration

For example, the default values could be suited for unit testing, while the annotations define all the stable configuration, leaving the other options to configure deployments at the various stages, like production or QA environments.

This principle is common (Spring, Java EE 6 among others), for example in JPA:
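The precedence rule itself fits in a few lines. Here is a minimal sketch where each source is passed as a nullable String; the class ConfigResolver and this way of representing the sources are hypothetical.

```java
// Hypothetical resolver for: default < annotation < XML < programmatic
final class ConfigResolver {
    // A null argument means "this mechanism does not define the value"
    static String resolve(String defaultValue, String annotationValue,
            String xmlValue, String programmaticValue) {
        String value = defaultValue;
        // Walk the sources in increasing precedence: the last one defined wins
        for (String override : new String[] { annotationValue, xmlValue, programmaticValue }) {
            if (override != null) {
                value = override;
            }
        }
        return value;
    }
}
```

For instance, resolve("d", "a", "x", null) returns "x": the XML value overrides the annotation, and nothing was configured programmatically.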

The concept of configuration by exception is central to the JPA specification.

Conclusion

This post is mostly a notepad of various patterns on how to use annotations, for instance when creating tools that process annotations, such as the Annotation Processing Tool (apt) in Java 5 and the Pluggable Annotation Processors in Java 6.

Don’t hesitate to contribute better patterns names, additional patterns and other examples of use.

EDIT: A related previous post, with a focus on how annotations can lead to coupling hence dependencies.

Pictures Creative Commons from Flickr, by ninaksimon and Iwan Gabovitch.


Domain-Driven Design: where to find inspiration for Supple Design? [part1]

Domain-Driven Design encourages analysing the domain deeply in a process called Supple Design. In his book (the blue book) and in his talks Eric Evans gives some examples of this process, and in this blog I suggest some sources of inspiration and some recommendations drawn from my practice to help with this process.

When a common formalism fits the domain well, you can factor it out and adapt its rules to the domain.

A known formalism can be reused as a ready-made, well understood model.

Obvious sources of inspiration

Analysis patterns

It is quite obvious in the book that DDD builds on top of Martin Fowler's analysis patterns (Analysis Patterns: Reusable Object Models, Addison-Wesley Object Technology Series). The patterns Knowledge Level (aka Meta-Model) and Specification (a Strategy used as a predicate) are from Fowler, and Eric Evans mentions using and drawing insight from analysis patterns many times in the book.

Reading analysis patterns helps to appreciate good design; when you’ve read enough analysis patterns, you don’t even have to remember them to be able to improve your modelling skills. In my own experience, I have learnt to look for specific design qualities such as explicitness and traceability in my design as a result of getting used to analysis patterns such as Phenomenon or Observation.

Design patterns

Design patterns are another source of inspiration, but usually less relevant to domain modelling. Evans mentions the Strategy pattern, also named Policy (I prefer using an alternative name to make it clear that we are talking about the domain, not about a technical concern), and the Composite pattern. Evans suggests considering other patterns, not just the GoF patterns, to see whether they make sense at the domain level.

Programming paradigms

Eric Evans also mentions that sometimes the domain is naturally well suited to particular approaches (or paradigms) such as state machines, predicate logic and rules engines. The DDD community has since expanded to include event-driven design as a favourite paradigm, with the Event-Sourcing and CQRS approaches in particular.

On paradigms, my design style has also been strongly influenced by elements of functional programming, which I originally learnt from using Apache Commons Collections, together with an increasingly pronounced taste for immutability.

Maths

It is in fact the core job of mathematicians to factor out formal models of everything we experience in the world. As a result it is no surprise we can draw on their work to build deeper models.

Graph theory

The great benefit of any mathematical model is that it is highly formal, ready with plenty of useful theorems that depend on the set of axioms you can assume. In short, all the body of maths is just work already done for you, ready for you to reuse. To start with a well-known example, used extensively by Eric Evans, let’s consider a bit of graph theory.

If you recognize that your problem is similar (mathematicians would say isomorphic or something like that) to a graph, then you can jump into graph theory and reuse plenty of exciting results, such as how to compute a shortest path using Dijkstra's algorithm or A*. Going further, the more you know or read about your theory, the more you can reuse: in other words, the lazier you can be!

In his classic example of modelling cargo shipping using Legs or using Stops, Eric Evans could also refer to the concept of Line Graph (aka edge-to-vertex dual), which comes with interesting results such as how to convert a graph into its edge-to-vertex dual.
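To make the reuse concrete, here is a minimal sketch of Dijkstra's algorithm in Java. The class name `ShortestPath`, the adjacency-map representation and the port names are assumptions chosen for the example; a real project would more likely rely on an existing graph library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

final class ShortestPath {

    // Graph as an adjacency map: node -> (neighbour -> non-negative edge cost).
    // Returns the cheapest known cost from the source to every reachable node.
    static Map<String, Integer> dijkstra(Map<String, Map<String, Integer>> graph, String source) {
        Map<String, Integer> dist = new HashMap<>();
        // Queue entries are (node, cost) pairs, cheapest first.
        PriorityQueue<Map.Entry<String, Integer>> queue =
                new PriorityQueue<>(Map.Entry.<String, Integer>comparingByValue());
        dist.put(source, 0);
        queue.add(Map.entry(source, 0));

        while (!queue.isEmpty()) {
            Map.Entry<String, Integer> current = queue.poll();
            String node = current.getKey();
            if (current.getValue() > dist.get(node)) {
                continue; // stale entry: a cheaper path was already found
            }
            Map<String, Integer> neighbours = graph.getOrDefault(node, Map.of());
            for (Map.Entry<String, Integer> edge : neighbours.entrySet()) {
                int candidate = current.getValue() + edge.getValue();
                if (candidate < dist.getOrDefault(edge.getKey(), Integer.MAX_VALUE)) {
                    dist.put(edge.getKey(), candidate);
                    queue.add(Map.entry(edge.getKey(), candidate));
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        // Made-up shipping legs with made-up costs.
        Map<String, Map<String, Integer>> legs = Map.of(
                "Hamburg", Map.of("Rotterdam", 1, "New York", 9),
                "Rotterdam", Map.of("New York", 6));
        System.out.println(dijkstra(legs, "Hamburg"));
    }
}
```

Once the domain (ports and legs) is recognized as a graph, the algorithm itself needs no domain knowledge at all.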

Trees and nested sets

Other common maths concepts include trees and DAGs, which come with useful concepts such as the topological sort. Hierarchy containment is another useful concept that appears, for instance, in every e-commerce catalog. Again, if you recognize the mathematical concept hidden behind your domain, you can then search for prior knowledge and techniques already devised to manipulate the concept in an easy and correct way, such as how to store that kind of hierarchy in a SQL database.
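As an illustration of such ready-made results, here is a small sketch of a topological sort (Kahn's algorithm) in Java. The class name `TopologicalSort` and the map-based representation of the DAG are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TopologicalSort {

    // edges: node -> nodes that must come after it.
    // Returns an order in which every node appears before its successors.
    static List<String> sort(Map<String, List<String>> edges) {
        // Count the incoming edges of every node.
        Map<String, Integer> inDegree = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : edges.entrySet()) {
            inDegree.putIfAbsent(entry.getKey(), 0);
            for (String next : entry.getValue()) {
                inDegree.merge(next, 1, Integer::sum);
            }
        }

        // Nodes without incoming edges can be emitted right away.
        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((node, degree) -> {
            if (degree == 0) {
                ready.add(node);
            }
        });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.poll();
            order.add(node);
            for (String next : edges.getOrDefault(node, List.of())) {
                if (inDegree.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next); // all its predecessors have been emitted
                }
            }
        }
        if (order.size() != inDegree.size()) {
            throw new IllegalArgumentException("The graph has a cycle: no topological order exists");
        }
        return order;
    }
}
```

Applied to a catalog hierarchy, this gives an order in which every category comes before its sub-categories, which is handy for instance when inserting rows that reference their parent.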

Don’t miss the next part: part 2

  • Maths continued
  • General principles


Pattern grammar for the variant problem

For tools to be aware of patterns, the patterns must be formalized, at least partially. At this point I must quote Gregor Hohpe to clarify my thoughts, as I strongly agree with his skepticism:

Typically, when people ask me about “codifying” or “toolifying” patterns my first reaction is one of skepticism. Patterns are meant to be a human-to-human communication mechanism, not a human-to-machine mechanism. After all, I have pointed many people to the fact that a pattern is not just the piece of code in the example section. It’s the context-problem-forces-solution combination that makes patterns so useful.

Patterns link together a problem part and a solution part. This link is expressed within the limits of a stated context, outside of which the pattern no longer applies. Patterns also emphasize the forces involved, which you must consider to decide whether and how to apply the pattern.

Pattern literature usually describes examples of application of patterns. In your project, you will have to do more work to adapt the pattern solution to your exact need. A pattern solution may be stretched a lot, but the pattern remains as long as humans still recognize its presence. Every different way of applying a pattern is called a pattern variant.

Addressing the variant problem

Formal descriptions of anything human are too restrictive, and this is especially true for patterns, which are the product of human analysis and therefore resist simple formalization. However, if we focus on sub-parts of patterns, it becomes easier to formalize them, at least for their solution part.

For example, a design pattern that uses (in its solution part) some form of inheritance admits several variants. At first, the pattern solution seems to resist formalization. However, if we now focus on the inheritance part only, we can enumerate every possible alternative for it. For example we can use:

  • interface and classes that implement it
  • abstract class and classes that extend it
  • concrete class provided it is not final (assume we’re in Java), so that it can be extended

Tree structure in volume (Milano International Fair 2009)

Notice that each alternative is a solution to the same problem “How to realize some form of inheritance”. We can say that each alternative is indeed a pattern, and by the way Mark Grand already described them in Patterns in Java Volume 1. These patterns are easy to formalize, as they can be precisely described in terms of programming language elements.
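The three alternatives listed above can be written down directly in Java, which is what makes them easy to formalize. The type names below (`Shape`, `Circle`, `AbstractPolygon` and so on) are hypothetical and only serve the illustration.

```java
// Variant 1: an interface and a class that implements it.
interface Shape {
    double area();
}

final class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}

// Variant 2: an abstract class and a class that extends it.
abstract class AbstractPolygon {
    abstract double area();
}

final class Square extends AbstractPolygon {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    double area() {
        return side * side;
    }
}

// Variant 3: a concrete class that is not final, so that it can be extended.
class Rectangle {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    double area() {
        return width * height;
    }
}

final class UnitSquare extends Rectangle {
    UnitSquare() {
        super(1.0, 1.0);
    }
}
```

In all three cases client code can hold a reference of the more general type, which is the shared intent behind the variants.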

How can we split a pattern into parts? The idea is to identify the areas that are fragile with respect to the variant problem in the solution part of a pattern, and to consider them as lower-level problems embedded inside the bigger pattern.

In the example before, the problem was to achieve “some form of inheritance”, and we listed three patterns that address this problem.

Provided it can be split into sub-parts (hence into smaller problems), any pattern solution can be formalized by recursively formalizing its sub-parts. If a sub-part cannot be easily made formal, then it can be split again into sub-parts, and so on until each sub-part can.

Given a pattern solution that we want to formalize:

  • If it can be described formally directly, then we are done (terminal)
  • If it cannot be fully described formally, then extract the problematic sub-parts into sub-problems, then find every pattern that addresses each sub-problem, and formalize their solution part

We can then represent a pattern as a tree of smaller patterns, where the solution part of each pattern is connected to the problem part of another pattern.

From pattern language to pattern grammar

In the car, some parts can be replaced by other alternate parts that play the same contract (e.g. the wheels)

When the pattern is applied in the code, at each node in the tree there is actually a selection of which variant of the sub-pattern to use. As such, each selected pattern represents an atom of decision in the design process.

It then appears that we have a form of grammar for patterns, where there are terminal pattern solutions T (easier to formalize in terms of programming language elements), non-terminal parts of pattern solutions N (that cannot be easily formalized but that can formally ask for help to solve their sub-problems), and where the production rules P are nothing but the patterns themselves:

Patterns are production rules that link:
elements of N (the problems) to elements of (N Union T)
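A minimal sketch of this grammar idea in Java, under simplifying assumptions: problems and solutions are plain strings, a symbol without any rule is treated as terminal, and the rules are assumed not to be recursive. The class `PatternGrammar` and the rule names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class PatternGrammar {

    // Production rules: problem (non-terminal) -> alternative solutions,
    // each solution being a sequence of symbols.
    // A symbol with no rule is terminal, i.e. directly formalizable.
    private final Map<String, List<List<String>>> rules;

    PatternGrammar(Map<String, List<List<String>>> rules) {
        this.rules = rules;
    }

    // Enumerates every fully terminal variant that solves the given symbol.
    // Assumes the rules contain no cycle, otherwise this would not terminate.
    List<List<String>> variants(String symbol) {
        List<List<String>> alternatives = rules.get(symbol);
        if (alternatives == null) {
            return List.of(List.of(symbol)); // terminal: nothing left to decide
        }
        List<List<String>> result = new ArrayList<>();
        for (List<String> alternative : alternatives) {
            // Combine the variants of each part of this alternative.
            List<List<String>> partial = List.of(List.of());
            for (String part : alternative) {
                List<List<String>> next = new ArrayList<>();
                for (List<String> prefix : partial) {
                    for (List<String> suffix : variants(part)) {
                        List<String> combined = new ArrayList<>(prefix);
                        combined.addAll(suffix);
                        next.add(combined);
                    }
                }
                partial = next;
            }
            result.addAll(partial);
        }
        return result;
    }

    public static void main(String[] args) {
        PatternGrammar grammar = new PatternGrammar(Map.of(
                "some pattern", List.of(List.of("some form of inheritance", "delegation")),
                "some form of inheritance", List.of(
                        List.of("interface + implementing classes"),
                        List.of("abstract class + subclasses"),
                        List.of("non-final concrete class + subclasses"))));
        grammar.variants("some pattern").forEach(System.out::println);
    }
}
```

Each enumerated variant corresponds to one path of decisions through the tree of sub-patterns.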

Conclusion

I have briefly suggested a way to formalize pattern solutions in spite of their fragility with respect to their variants. This approach identifies patterns as production rules in a grammar over the set of patterns considered.

This perspective is well-suited for tools to work on patterns in real-world projects, where the patterns are indeed applied in many variant forms. The problem of this approach is that every known pattern that is variant-fragile must be reconsidered and have its solution split into a formal part and one or several sub problems to be addressed by specific, lower-level patterns, which themselves must be formalised in the same way.

It is essential that for every problem (“intent”) we can enumerate every pattern that addresses it. Intents can also be classified into a taxonomy, where some intents are specialized versions of more generic intents.

This approach does not claim to formalize the full potential of patterns; it only aims at enabling tools to understand patterns that are already there, so as to assist developers with various tasks.


Patterns in general, not just design patterns

Over time, patterns have appeared on many different topics, not all related to programming. Here is a list of patterns and pointers to other lists of patterns, to illustrate two things:

  1. Knowledge and experience in general can be packaged into patterns, often using the pattern form. Patterns are convenient for reuse, in any domain.
  2. There are already plenty of patterns, and they cover a really wide range of situations. Given the number of patterns available today, whatever your problem, you will likely find helpful patterns for it.

This should encourage you to search for existing patterns whenever you need additional insight, or just an already documented reference for what you’ve just done (documenting the design of software using patterns).

The list in this post will grow, but without the ambition of listing every possible pattern.

Software Design

Here are two essential lists of books on patterns:

Other books and/or websites focus on patterns for software development, each with a specialized perspective:

Please notice that while many patterns are to be applied directly to software design or source code, there are also many patterns about the process of building software.

This list could not be complete without referring to the big Pattern Almanac by Linda Rising, available on her website here in PDF or in a web version here.

The Hillside.net Patterns catalog and the many papers published after each PLoP also provide many patterns that may well cover your problem.

User interfaces patterns

Testing patterns

Misc

Probably any domain can benefit from using patterns to represent chunks of knowledge and experience. Here are some examples quite foreign to software development:

Conclusion

Though this listing is far from exhaustive (there are many independently published articles on patterns everywhere), it shows that the pattern community has been quite active in mining as many patterns as it could, and this knowledge has been carefully documented in the pattern form, ready for reuse in your next projects.

Of course you don’t need to know them all, nor even read them all in advance. However, it does help to be aware that for most problems there probably already exist a few patterns that can prove helpful, and they are only an Internet search away from your needs!


Patterns as stored arrangements: toward more efficient tools

In nature, out of every possible arrangement of several elements, only a few arrangements are stable. This is illustrated with atoms combined together, or smaller particles arranged together into atoms, where not every combination is sustainable.

Unstable arrangements tend to move toward stable ones over time. Whenever you observe the elements, you mostly see stable arrangements of them. Because there is only a relatively small number of stable arrangements, a brain can be trained to recognize them, and they can even be named and incorporated into the language.

Better with a brain

The capability to recognize common arrangements of elements is beneficial because it saves a lot of time and thinking. Rather than describing each arrangement in detail each time, it is much more economical to describe each stable arrangement once, and then to declare that such an arrangement occurs in this or that case. Saying “this is an equilateral triangle” is many times more efficient than explaining what it is: “there are 3 lines of the same length, each connected to the two others such that they form a closed path, etc.” It also enables thinking at a higher level.

In software development

Small parts arranged together into bigger (and higher-level functionality) parts

In software development, the usual elements are classes, interfaces, methods, fields, associations (implementation, delegation, instantiation) and various constraints between them. Given a few of these elements we can form many possible arrangements, but only a relatively small number of them are useful and of interest. This happens because skillful developers tend to quickly replace useless arrangements with more useful ones. For example, any arrangement of two distinct classes that depend on each other, forming a cycle, is usually not desirable. Pattern authors have been working for almost two decades to inventory as many useful arrangements as they could find, resulting in many books and publications in every domain.

Common and stable arrangements of two to three classes together form the basis for design patterns as in the GoF book, an idea I have experimented with in a previous post: systematic combination of subpatterns generates design patterns.

Common stable arrangements of methods, fields and how they relate to each other within one class are simply stereotypes of classes, which we tend to call patterns anyway, like the ValueObject pattern in the Domain-Driven Design book.

Note that in this discussion we are focusing on arrangements of programming elements in the solution space, not in the problem space, but patterns express intents too.

Harnessing stable arrangements of things: toward more efficient tools

I believe that making explicit the use of predefined common stable arrangements of programming elements, in the coding process, can boost the efficiency of many tools. I also believe that such common and stable arrangements of programming elements have already been identified and are already well-documented in the existing pattern literature.

Rather than configuring tools at the programming element level (class, field, method etc.), if the code is augmented with explicit declarations of the patterns used, the tools can then be configured at the pattern level. For each tool, the idea is to prepare in advance how to configure it for each supported pattern. This preparation must be automated, so that given an occurrence of a known pattern in the code base, the configuration of the tool can be automatically derived from the particular details of the pattern occurrence.

In other words:

tool configuration=
  auto-configuration(pattern, tool) + pattern occurrence

A simple case study

To start with an example, let us consider the pattern Abstract Factory, that defines an Abstract Factory interface and one or more Product interface(s). Then assume that in our code base we have an occurrence of this pattern, where the interface WidgetFactory participates as the Abstract Factory, and the interfaces Window and Button participate as Product. Concrete classes form two families, one Linux family (LinuxWidgetFactory, LinuxWindow, LinuxButton) and one Mac family (MacWidgetFactory, MacWindow, MacButton), where each concrete class participates as ConcreteFactory, ConcreteProduct and ConcreteProduct respectively.

Dependency restrictions (à la JDepend + JUnit)

The auto-configuration(AbstractFactory pattern, dependency checker tool) could be programmed like the following:

//Given a pattern occurrence from the actual code base: occ

//Factory interface knows about the Product interface, not the other way round
For each Product in occ, add constraint (Product, Must not depend upon, AbstractFactory)

//ConcreteProduct must not know about the AbstractFactory
For each ConcreteProduct participant in occ, add constraint (ConcreteProduct, Must not depend upon, AbstractFactory)

//ConcreteProduct must not know about the ConcreteFactory
For each ConcreteProduct participant in occ, add constraint (ConcreteProduct, Must not depend upon, ConcreteFactory)

//Interfaces must not depend upon their implementor
For each abstract participant in occ, add constraint (participant, Must not depend upon, implementor of participant)

I have expanded the auto-configuration script to highlight how we can do more sophisticated configuration as soon as it is meant to be reused many times, something we could never afford for a one-shot configuration. Also, in the above script, it is quite obvious that we can extract more powerful primitives to simplify the declaration of the script itself.

I have already presented this idea: toward smarter dependency constraints (patterns to the rescue).
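Here is one possible Java rendering of the dependency auto-configuration script above, as a sketch only. The class `AbstractFactoryConstraints`, its method and its parameters are hypothetical; the output is a plain list of forbidden dependencies, and feeding it to an actual dependency checker is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class AbstractFactoryConstraints {

    // Each returned entry (from, to) reads: "from must not depend upon to".
    // productOf maps each ConcreteProduct to the Product interface it implements.
    static List<Map.Entry<String, String>> forbiddenDependencies(
            String abstractFactory,
            List<String> concreteFactories,
            Map<String, String> productOf) {
        List<Map.Entry<String, String>> forbidden = new ArrayList<>();

        // The Product interfaces must not know about the AbstractFactory.
        for (String product : new TreeSet<>(productOf.values())) {
            forbidden.add(Map.entry(product, abstractFactory));
        }
        for (Map.Entry<String, String> participant : productOf.entrySet()) {
            String concreteProduct = participant.getKey();
            // A ConcreteProduct must not know about any factory.
            forbidden.add(Map.entry(concreteProduct, abstractFactory));
            for (String concreteFactory : concreteFactories) {
                forbidden.add(Map.entry(concreteProduct, concreteFactory));
            }
            // An interface must not depend upon its implementors.
            forbidden.add(Map.entry(participant.getValue(), concreteProduct));
        }
        for (String concreteFactory : concreteFactories) {
            forbidden.add(Map.entry(abstractFactory, concreteFactory));
        }
        return forbidden;
    }

    public static void main(String[] args) {
        // The pattern occurrence from the case study.
        forbiddenDependencies(
                "WidgetFactory",
                List.of("LinuxWidgetFactory", "MacWidgetFactory"),
                Map.of("LinuxWindow", "Window", "LinuxButton", "Button",
                        "MacWindow", "Window", "MacButton", "Button"))
                .forEach(c -> System.out.println(c.getKey() + " must not depend upon " + c.getValue()));
    }
}
```

The point is that the whole list is derived from the declaration of the pattern occurrence alone; nobody has to write the individual constraints by hand.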

Dependency injection (à la Spring)

The auto-configuration(AbstractFactory pattern, IoC container) could be programmed like the following:

//Given a pattern occurrence from the actual base: occ
//Given the selected family (Mac or Linux) we want to inject: F

//Bind the ConcreteFactory from F to the AbstractFactory interface
For the AbstractFactory participant in occ, and for the ConcreteFactory in occ that belongs to F, bind (ConcreteFactory, AbstractFactory)

//Bind each ConcreteProduct from F to the Product interface
For each ConcreteProduct in occ that belongs to F, bind (ConcreteProduct, corresponding Product)

Again we can see that we can automate the binding of each implementation to its corresponding interface from the single explicit selection of the family F we want to inject. This is a great way to factor out dependency injection to make it more manageable. In particular, the configuration is closer to the way we actually think.
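The binding script above can be sketched in Java as follows. This does not use the API of Spring or of any real IoC container: the class `FamilyBindings` and its maps are hypothetical, and the result is simply an interface-to-implementation map that a container configuration could be generated from.

```java
import java.util.Map;
import java.util.TreeMap;

final class FamilyBindings {

    // familyOf:    concrete class -> family it belongs to (e.g. "Mac" or "Linux")
    // interfaceOf: concrete class -> interface it implements in the pattern occurrence
    // Returns the bindings interface -> implementation for the selected family.
    static Map<String, String> bindingsFor(
            String family,
            Map<String, String> familyOf,
            Map<String, String> interfaceOf) {
        Map<String, String> bindings = new TreeMap<>();
        for (Map.Entry<String, String> participant : interfaceOf.entrySet()) {
            String implementation = participant.getKey();
            if (family.equals(familyOf.get(implementation))) {
                bindings.put(participant.getValue(), implementation);
            }
        }
        return bindings;
    }

    public static void main(String[] args) {
        Map<String, String> familyOf = Map.of(
                "LinuxWidgetFactory", "Linux", "LinuxWindow", "Linux", "LinuxButton", "Linux",
                "MacWidgetFactory", "Mac", "MacWindow", "Mac", "MacButton", "Mac");
        Map<String, String> interfaceOf = Map.of(
                "LinuxWidgetFactory", "WidgetFactory", "LinuxWindow", "Window", "LinuxButton", "Button",
                "MacWidgetFactory", "WidgetFactory", "MacWindow", "Window", "MacButton", "Button");
        System.out.println(bindingsFor("Mac", familyOf, interfaceOf));
    }
}
```

Switching the whole application from one family to the other then comes down to changing the single `family` argument.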

Other tools

The same idea can be applied for many other tools, in particular:

Conclusion

In this post I described how it makes sense to consider overall arrangements of programming elements as higher-level constructs, which I identify with patterns (the solution part of patterns in fact). I emphasized the fact that the useful arrangements are not that numerous, and that many of them are already documented in the pattern literature. Finally I presented how such arrangements or patterns, if declared explicitly in the code, can be leveraged to automate tool configuration in a powerful way.

As usual, any feedback is highly welcome.


How I became enthusiastic about patterns

In my very first job, I was doing R&D, working on a map-matching algorithm. The goal of this algorithm was to pinpoint a moving car on a vector map, based on the data from various sensors, including a GPS, an electronic compass and the car odometer. Such an algorithm was essential for the business of the company, and there was very little literature on the subject.

The R&D challenge

At school I had been taught some C programming, so I started writing the algorithm in plain C code. One special case after another, the code began to grow until it became quite complicated. I had a specially equipped car with a computer and all the gear in it to do real testing on the roads around the office, from highways to forest roads, city streets or even car parks, and this was fun! But situation after situation, I had to make the code more and more complicated. At some point, it became obvious to me that the mode of implementation (plain C code) had become the main obstacle to improving the algorithm. It was becoming increasingly difficult to grow the sophistication in a mess of structured code.

My early mentors

Yes savoir-faire can be found in books (but not only)

At the same time I was eager to progress, so I was getting closer to the few very experienced colleagues. Our company was a startup in 2000, and there were many more junior developers than senior ones. At first, I thought UML could help me (it did not, in fact), so I started asking questions about UML. When I became more comfortable with UML, a senior colleague told me I should now have a look at design patterns, starting with the Composite. So I took the GoF book on my desk and began to look at it as a reference to get design ideas during the day. I also borrowed the patterns book by Mark Grand and read it on the train.

And then it struck me: “Wow, patterns are a great way of transferring knowledge!” I remember reading the pattern “Cache” in the book. It was not in itself a very innovative design idea, but I understood that the pattern format was ideal to document just any idea. I hate long explanations in long books, and the pattern format, which tends to be short and structured, is perfect for quick scanning whenever you’re looking for ideas. Even when I didn’t find a pattern for my case, I found it stimulating to read other people’s ideas.

Enthusiasm and success as a result

I started to apply the State and the Strategy patterns in the map-matching algorithm and this made it much, much simpler. It actually made it so much simpler that we were now able (the team was growing at that time) to go an order of magnitude further in sophistication, while being perfectly in control of the code. The introduction of two simple design patterns had suddenly given a really big advantage to a piece of code essential to the company! This is how I became enthusiastic about patterns.

The reality

I am also enthusiastic about deserts.

What actually happened is that reading and starting to play with patterns just taught me object-oriented programming. Patterns acted like examples of good design, until the underlying principles became natural. Later I discovered the SOLID principles of Robert C. Martin, and recognized the principles behind many design patterns. In my following jobs I got into the habit of looking for patterns for whatever problem I was encountering, and to my surprise, I found out that most common problems were already being taken care of in the form of analysis patterns or other kinds of patterns! To give the most obvious example, Martin Fowler's “Things that change with time” is really a must-read, which you can apply easily to solve your problem.

Conclusion

This is how I became enthusiastic about patterns, not just design patterns but every kind of pattern, from analysis patterns to domain-driven design patterns, enterprise integration patterns, PLoP patterns and many patterns from various authors. I know my enthusiasm is a bit exaggerated, a bit like the memory of a first love that cannot be very objective. Fortunately I quickly learnt when not to use patterns, to keep things as simple as they can be, and to do unit testing. By the way, the benefits of unit tests also struck me when I started with them, but not as much as patterns: there can be only one first time, and my first first time was with patterns, not unit tests!


Manipulating things collectively

There is great power in being able to manipulate collective things as one single thing. It gives you simplicity, hence control. You can focus your attention on it and reason about it, even though under the hood it is made of many parts. The composite thing is kept simple, therefore you can also deal with several of them at a time. This would not be possible if you had to deal with every part they are made of, because it would be overwhelming.

There exist many strategies to deal with collective things as if they were one single thing: statistics, multiple selection, groups, classifications and super-signs.

Statistics

Statistics is probably the most obvious way to deal with collective things, when the things can be expressed as numbers. Historically it has been used with great results in physics, thermodynamics in particular.

It is all about extracting a few macro properties that we can reason on instead of the whole set of data:

  • number of elements
  • mean, deviation, moments, percentiles, etc.
  • regression, clusters
  • total property: total weight, total volume, total price
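As a small illustration, the standard `DoubleSummaryStatistics` class in Java reduces a whole collection of numbers to a handful of such macro properties. The wrapper class `PriceStatistics` and the sample prices are assumptions made for the example.

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;

final class PriceStatistics {

    // Summarizes many prices into count, sum, min, average and max.
    static DoubleSummaryStatistics summarize(List<Double> prices) {
        return prices.stream()
                .mapToDouble(Double::doubleValue)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics stats = summarize(List.of(10.0, 20.0, 30.0));
        System.out.println("number of elements: " + stats.getCount());
        System.out.println("total price: " + stats.getSum());
        System.out.println("mean price: " + stats.getAverage());
    }
}
```

From then on we can reason on the summary object alone, whatever the number of underlying elements.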

Multiple selection

Arman accumulation

Many software applications enable you to select multiple elements at a time in order to apply one operation to each element:

  • When sending an email, you can select multiple addresses to send to
  • In a word processor, you can select several words, several paragraphs, or even all the document to copy, paste or apply formatting to each element
  • In a spreadsheet, you can select multiple rows or columns to apply operations to, and you can also repeat formula for each row or column

The selected elements can be of the same kind or not. However, for multiple selection to be useful, they must share at least something in common: the capability of being copied or pasted, or the fact that they are specific to a particular user.

Functional programming and the three higher order functions map, fold and filter address very well how to apply operations collectively to many elements.
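In Java these three higher-order functions are available on streams as `filter`, `map` and `reduce` (the fold). The following is a short sketch; the class `CollectiveOperations` and its method are hypothetical names.

```java
import java.util.List;

final class CollectiveOperations {

    // Total number of characters over all non-empty words.
    static int totalLength(List<String> words) {
        return words.stream()
                .filter(word -> !word.isEmpty()) // filter: keep only some elements
                .map(String::length)             // map: apply one operation to each element
                .reduce(0, Integer::sum);        // fold: combine all elements into one value
    }

    public static void main(String[] args) {
        System.out.println(totalLength(List.of("map", "", "fold", "filter")));
    }
}
```

The calling code never iterates explicitly: it states once what to do with each element and how to combine the results.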

Groups

When multiple selections are often needed, you can create groups. We can consider a group to be a multiple selection made explicit. You create a group and you explicitly add elements to it. Common examples of groups:

  • Mailing lists are named groups of email addresses
  • Vectors in maths

As for multiple selection, the elements in a group must share something in common. For example, they must all have a price. Elements of various kinds can be grouped if they relate to something common: for example, the set of various data (name, address, phone number, preferred colour and date of birth) specific to a user is called a user profile.

A group is extensional. The elements in the group may or may not know they belong to a group.

Java packages are groups, and they are declared within the same file as the elements they refer to. Java classes also group fields and methods under one name.

The Composite pattern suggests grouping objects that share the same interface into a Composite that also shares the same interface. The intent is to manipulate the collective set of objects as if it were one single object, i.e. without knowing it is collective.
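A minimal sketch of the Composite pattern in Java, reusing the idea that grouped elements must all have a price. The names `Priced`, `Item` and `Bundle` are hypothetical.

```java
import java.util.List;

// The shared interface: anything that has a price.
interface Priced {
    int priceInCents();
}

// A single element.
final class Item implements Priced {
    private final int priceInCents;

    Item(int priceInCents) {
        this.priceInCents = priceInCents;
    }

    @Override
    public int priceInCents() {
        return priceInCents;
    }
}

// The Composite: a group of Priced things that is itself Priced.
final class Bundle implements Priced {
    private final List<Priced> parts;

    Bundle(List<Priced> parts) {
        this.parts = List.copyOf(parts);
    }

    @Override
    public int priceInCents() {
        int total = 0;
        for (Priced part : parts) {
            total += part.priceInCents(); // a part may itself be a Bundle
        }
        return total;
    }
}
```

Client code that only sees `Priced` cannot tell whether it holds one item or a whole nested bundle, which is exactly the intent.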

Classification

You get control over multiple things if you just classify them. Given several flowers, if you classify them into categories, then you can talk about several flowers collectively without having to enumerate each of them: the category is a way to refer to several flowers with just one name.

Classifications enable intensional grouping. This means that groups are defined not by the set of their elements, but by a condition (predicate) to be satisfied. The condition can test for the category of something (is this animal a bird?), or test for its attribute (is this car red?).
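The difference between an extensional and an intensional group can be sketched in Java with a `Set` on one side and a `Predicate` on the other. The classes `Flower` and `FlowerGroups` and the sample data are assumptions made for the illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class Flower {
    final String name;
    final String colour;

    Flower(String name, String colour) {
        this.name = name;
        this.colour = colour;
    }
}

final class FlowerGroups {

    // Extensional group: defined by enumerating its members.
    static final Set<String> BOUQUET = Set.of("rose", "tulip");
    static final Predicate<Flower> IN_BOUQUET = flower -> BOUQUET.contains(flower.name);

    // Intensional group: defined by a condition that its members satisfy.
    static final Predicate<Flower> IS_RED = flower -> flower.colour.equals("red");

    // Names of the flowers that belong to the group described by the condition.
    static List<String> namesMatching(List<Flower> flowers, Predicate<Flower> condition) {
        return flowers.stream()
                .filter(condition)
                .map(flower -> flower.name)
                .collect(Collectors.toList());
    }
}
```

A new red flower automatically belongs to the intensional group, whereas the bouquet only changes when someone explicitly edits its list of members.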

Of course abstraction is one particular way to classify.

Java modifiers (private, public, abstract, interface etc.) classify Java elements, and can be used to refer collectively to them, as in “let’s generate the Javadoc for every public element”.

Super signs

Super-sign

There are elements that exhibit a special property when considered together as a whole. For example, the ink dots on the paper can be seen as letters. Letters next to each other can be seen together as words, which again can be seen together as sentences, and then again up to the novel. Collective arrangements of multiple things that together exhibit a property are called super signs.

This phenomenon is related to emergence, and only exists for a given observer if that observer can recognize the super sign.

In science in general, we use models to account for the collective behavior of several elements, typically objects with measurable properties, and forces in action.

In a Java program, idioms and patterns can be considered super signs for those who know them.

Conclusion

Manipulating multiple things in a simple way really matters: it is a life saver.

In software development it is paramount, because it is a lever you use to manage tons of data with little effort. The art is to find the way of thinking about collective things that reduces your effort the most.

I already mentioned this topic in previous posts: group together things that go together, don’t make things artificially different, and my definition of abstraction, because abstraction is an essential way to refer to different things in what they share in common.
