How to fail an Open Space session

And the answer is: when you try to change the format until it’s not really an open space session any more.

Since 2011 I’ve been running and facilitating dozens of open space sessions in many conferences across various countries: at the Software Craftsmanship Meetup in Paris, in Socrates Germany, Socrates France, in Bucharest’s ITAKE, Agile France and so forth. It always worked fine, except twice. Precisely the two times I tried to mix the open space format with a more structured one.

A word on Open Space technology

Open Space technology is a beautiful format. Basically it’s all about people interested in a common topic joining to discuss it together, usually with a flip chart. It comes with rules that actually remind participants that they are the actors of what is going to happen, and these rules shape the expectations:

  • Whoever comes is the right people
  • Whenever it starts is the right time
  • Wherever it is, is the right place
  • Whatever happens is the only thing that could have, be prepared to be surprised!
  • When it’s over, it’s over (within this session)

And there’s the most important rule : the “Law of two feet”:

If at any time during our time together you find yourself in any situation where you are neither learning nor contributing, use your two feet, go someplace else.

Note that there is no moderator role, just a facilitator role to remind the rules and organize the very basics like time and space. Anyone can hijack a session, and the law of two feet is supposed to counter-balance that from happening.

The form does not promise much and there’s no guarantee you’ll be happy, but because it’s fully interactive you can do the steering yourself by suggesting, reacting, asking questions or answering directly. In practice this means you’re quite likely to be happy at the end. Except when you take risks and try to change the format.

A Tale of Two Disasters

At Socrates Germany 2014 I had explained monoids to many attendees. At the same Socrates the next year in Soltau I proposed a session to go further, which I called the ”Monoid Safari”. It was supposed to be a collective exploration to identify more examples of monoids in the business domains of the attendees.

I came prepared with a large slide deck “just in case”. I had it ready just in case I had to remind people of what a monoid is, or to show some of the examples of monoids I had already collected and described in the past. I wanted an open space session, but I had a full content for a talk as a backup. It should have been safe.

This one wasn’t really a disaster, but was not a success either. Most attendees had no idea what a monoid was, and they came to learn that. I quickly found out that my plan was not working, so I tried to explain monoids quickly to carry on to the safari. But still questions were coming: “do you have examples?”. “Oh yes I have some, here they are”. And I jump to the most appropriate slides to show some the examples, quickly. And still hoping to come back to the safari right after.

A few attendees got it fast and suggested interesting examples, but not much. At the end of the session, I was disappointed of the safari, while the overall feeling of the audience was that I could have explained better the concept of monoids. #fail

More recently, at DDD Europe in Brussels I had proposed an ambitious session called ”Bounded Contexts: the Illustrated Bestiary”. I consider myself knowledgeable in DDD and Bounded Contexts, which also means I have open questions on the edge of this topic. I thought this conference dedicated to DDD was the best place to find knowledgeable peers to discuss that and try to answer some of the puzzles. So I proposed this session, precisely because I do not have all the definitive answers.

There was an abstract, which was not meant for everyone:

From one entry in the Blue Book, Bounded Contexts have since evolved into a bestiary of mythical creatures like Bubble Context, Interchange Context or even “Uber-Context”, each with different rationales that somehow challenge their initial definition. We now have recurring relationships between them like in the Collaborative Construction pattern. We now know that Bounded Contexts are a solution thing, not a problem thing, but some confusion remains, and we can make the matter even more confusing with crazy questions like ”Can Bounded Contexts be nested?”, ”Are Aggregates mini Bounded Contexts?’ or ”Is it useful to say that legacy UI and DB layers are their own Bounded Contexts?”.

During this semi-structured Open-Space session, every attendee can contribute examples or feedbacks, ask questions and share their ideas and opinions on this topic. Contributions in any form (slides, pictures, code…) are also welcome prior to the session and will be be credited.

Given the topic is very abstract I had in mind to run for 1 hour, no more. And I was expecting just a few attendees. But the program generously planned two hours, and more than 40 people came in early, sitting in a very large circle!

20160128-IMG_7641
Again I came prepared with a slide deck covering each example and concept mentioned in the abstract, “just in case”. From the start it appeared that many attendees did not clearly know what a Bounded Context was, and they came to learn that more in-depth. So I started to explain quickly. As I felt tied by the abstract, I tried to cover all the questions, quickly. It was uncomfortable switching from lecture-style and then back to open space style, and again quickly, every 5mn. At this pace I was also speaking too fast, which combined to my French accent is a recipe for disaster. And it was a disaster!

The feedback from the attendees was quite harsh, and illustrates how the expectations were not clear, it was not a talk, and it was not an open space either:

  • “Even for an open session it could have been beter structured. At the end, I do not know if I learned something.”
  • “Although it was open space, it seemed like the speaker didn’t prepare anything. The crowd was waiting for some definitions, explanations to open the discussions, but these never came…”
  • “Speaker clearly did not master the topic. Also, the open space form was unknown to most of the participants, it was bland”
  • “The speaker wanted to organise a structured open space, but this approach had three problems: the speaker was not very understandable,  did not reign in the discussion, and he let loud-mouth participants walk all over him”
  • “Due to the fact, that there were no experts, no questions were answered. Instead I left with more questions than before. The content was with e.g. “bubble context” very specific and it was hard to discuss with the others without having knowledge about this.”

Some did appreciate, but still a disaster:

  • “Very interesting topic. However, the group was way too big to have a useful conversation. Left the session after 30 minutes.”

Some offered suggestions:

  • “be a bit more clear on what the format of the open space is”
  • “I dont think it was the problem of the speaker. But he expected a MUCH higher level of the audience. Wasnt really prepared to give a lecture about bounded contexts. A warning sign “only for experienced attendees” on the schedule would have help immensively.”
  • “Split the group into multiple small groups to discuss the topics”

It’s called a training?

Thinking about it, the format I was looking for exists and I know it well, it’s just called a training, not open space. And that’s funny since that’s already the way I do trainings! Although it may look like conversational, it remains actively directed by the trainer. In the ideal situation, the trainer does not have to talk that much, as the learning happen thanks to exercises, discussions and coding. But still, this is not open space.

However a training assumes that the trainer knows the topic completely. So how do we do when we want to create opportunities to discuss topics precisely to clarify open questions and to explore the boundaries of our knowledge? When there’s a pre-requisite level of knowledge to even listen and follow what’s happening? So far the best opportunities I’ve met were in conferences hallways, during random discussions with speakers and attendees, unfortunately always too short and usually without taking notes. Advanced topics can be discussed during open spaces of course, as it regularly happens in various Socrates un-conferences, but it’s usually between small groups, the topics are broadly defined, e.g. “Combinators in FP”, and the answers usually exist in the literature.

I’ll try again, but without pretending it’s open space when it’s not really. Or as a pure open space, hence without any commitment or hard promise, to relax the expectations. I’ll probably put myself in danger again from time to time though :)

 

 

Read More

Make Money vs Reduce Risks dichotomy

In sports, football for example, players have only one goal in mind: score, score, again and again, as often as possible. Close to them, but not too close, arbiters have only one goal in mind: detect quickly all violations of the rules of the game, and sanction them.

The players know the rules, still we need an antagonism role, the arbiters, to keep the game fair. It is never perfect, but this is not plain chaos either.

Many mature businesses have chosen a similar structure. There is a role to make money, as much money as possible, and another role to control risk under an acceptable level.

20141125-001955.jpg

In finance, this schema is visible at several levels. Traders and sales people in the front-office focus on making money, while officers in the risk department closely monitor their activity to control they don’t go too far. We hear very loud in the news when the traders go too far. We don’t hear much when the risk people go too far, but reducing risk usually hurts profitability in the short term.

This schema occurs again between banks, who want to make money, and the regulators, who is supposed to protect the country and the customers. That the regulators do a good job or not is not the point of this article, my point is that there is a common business pattern there.

When there is a common business pattern, and when the business is heavily supported by software systems, does this mean there is a corresponding pattern in the software itself? I believe there is, a bit like a generalized Conway’s Law. The corresponding software pattern is: when the business has an obvious antagonism like “Making Money vs Reducing Risk”, then it probably calls for two distinct Bounded Contexts in the corresponding software.

This dichotomy is not a rule, it is just an heuristics to suggest there may be a need for two distinct Bounded Contexts.

Who is the key decision maker is probably the question that shapes everything. I learnt that a few years ago in a course with @ziobrando. In particular, when two management hierarchies are involved, even if their visions coincides right now, it’s unlikely that both visions will evolve the same way over time. This is a reason to split the solution in two Bounded Contexts that will evolve independently. So if you have a Direction of Trading and the Direction of Risk, you’re in this situation.

Modeling in the two contexts

Making money typically involves good commercial relationships and a competitive pricing expertise, plus enough speed to react to opportunities.

Software systems for that typically manages the business one deal at a time. They often need to be real-time, or fast enough not to lose impatient customers. Sometimes we may even accept to trade calculation accuracy in exchange for speed. For example we may be using floating point calculations instead of Big Decimals, or an approximation instead of the exact formula.

Software systems to support making money need to help people doing the sales to be fast, for example with rich defaulting of the input values.

By contrast, software for officers who want to reduce or control risk often computes risk metrics out of a lot of deals. It may be fraud analysis or a stress tests simulating markets crisis. It is often just computing the overall risk taken by summing up the numbers from each deal. Some do that in realtime too, but usually it can accommodate much slower paces like on-demand, daily, weekly or even monthly.

20141125-002012.jpg

Sometimes the competition is so tight that risk control becomes the key differentiator to make money between competitors. In this situation risk control has become another miniature sub-domain within the domain of selling and pricing. Still, it has its own risk-oriented perspective on the business, and it is like a delegation of responsibility from the risk officers to the front office people and their trading bots. Even in this situation there will also be a full-featured domain of risk control outside, with the corresponding software in its own Bounded Context.

A developer example

DevOps is the classical example in software development: developers want to release often to deliver more value. However ops people know that each release comes with risks for the production so they traditionally prefer to release less frequently. “No release, no risk” would be ideal for them.

20141125-002020.jpg

In this scheme, developers and ops teams use different tools, and don’t monitor the same indicators. When they get closer as in DevOps, the ops usually delegate some risk control to the development team and their automated testing tools, but they keep their own expertise and their specific tooling.

Many thanks to @ziobrando and @mathiasverraes for the early feedback and some complements incorporated into the text.

Read More

DDD is back in Paris with a brand new Meetup group!

The first DDD Open Forum of the brand new Paris DDD meetup was last night, hosted by Arolla, and it was good to meet again after a long time with twenty-some Paris DDD aficionados!

@tjaskula, the organizer of this new group, opened the evening with a welcome introduction. He also gave many suggestions of areas for discussion and debate.

A quick survey revealed that one third of the participants were new to Domain-Driven Design, while another third was on the other hand rather comfortable with it. This correlated with a rather senior audience, with only one attendee with less than 5 years experience and many 10+ years developers, including 22 years and 30 years experience developers, and still coding! If you work in Paris, I guess you know them already…

It was an open space session, so we first proposed a lot of topics for discussion with post-its on the wall: how to sell or convince about DDD, introduction on concepts, synchronizing between contexts…

We all decided to start with a walk through of the fundamentals of DDD: Bounded Contexts, Ubiquitous Language, Code as Model… It was great to have this two-way knowledge transfer between seniors and juniors, in an interactive fashion and with lot of questions, including some rather challenging and skeptical ones! There was also some UML bashing of course.

We concluded by eating Galettes des Rois, together with cider and beer, and a lot of fun. Thanks everyone for your questions and contributions, and see you soon on next meetup!

The many proposals for discussion

Read More

Collaborative Construction by Alberto Brandolini

Alberto Brandolini (@ziobrando) gave a great talk at the last Domain-Driven Design eXchange in London. In this talk, among many other insights, he described a recurring pattern he had seen many times in several very different projects: “Collaborative Construction, Execution & Tracking. Sounds familiar? Maybe we didn’t notice something cool”

Analysis conflicts are hints

In various unrelated projects we see similar problems. Consider a project that deals with building a complex artifact like an advertising campaigns. Creating a campaign involves a lot of different communication channels between many people.

On this project, the boss said:

We should prevent the user from entering incorrect data.

Sure you don’t want to have corrupt data, which is reasonable: you don’t want to launch a campaign with wrong or corrupt data! However the users were telling a completely different story:

[the process with all the strict validations] cannot be applied in practice, there’s no way it can work!

Why this conflict? In fact they are talking about two different processes, and they could not notice that. Sure, it takes the acute eyes of a Domain-Driven Design practitioner to recognize that subtlety!

Alberto mentions what he calls the “Notorious Pub Hypothesis“: think about the pub where all the bad people gather at night, where you don’t go if you’re an honest citizen. The hypothesis comes from his mother asking:

Why doesn't the police shut down this place?

Why doesn’t the police shut down this place? Actually there is some value in having this kind of place, since the police knows where all the bad guys are, it makes it easier to find them when you have to.

In a similar fashion, maybe there’s also a need somewhere for invalid data. What happens before we have strictly validated data?  Just like the bad guys who exist even if we don’t like it, there is a whole universe outside of the application, in which the users are preparing the advertising campaign with more than one month of preparation of data, lots of emails and many other communication types, and all that is untraceable so far.

Why not acknowledge that and include this process, a collaborative process, directly into the application?

Similar data, totally different semantics

Coming from a data-driven mindset, it is not easy to realize that it’s not because the data structures are pretty much the same that you have to live with only one type of representation in your application. Same data, completely different behavior: this suggests different Bounded Contexts!

The interesting pattern recurring in many applications is a split between two phases: one phase where multiple stakeholders collaborate on the construction of a deliverable, and a second phase where the built deliverable is stored, can be reviewed, versioned, searched etc.

The natural focus of most projects seems to be on the second phase; Alberto introduced the name Collaborative Construction to refer to the first phase, often missed in the analysis. Now we have a name for this pattern!

The insight in this story is to acknowledge the two contexts, one of collaborative construction, the other on managing the outcome of the construction.

Looks like “source Vs. executable”

During collaborative construction, it’s important to accept inconsistencies, warnings or even errors, incomplete data, missing details, because the work is in progress, it’s a draft. Also this work in progress is by definition changing quickly thanks to the contributions of every participant.

Once the draft is ready, it is then validated and becomes the final deliverable. This deliverable must be complete, valid and consistent, and cannot be changed any more. It is there forever. Every change becomes a new revision from now on.

We therefore evolve from a draft semantics to a “printed” or “signed” semantics. The draft requires comments, conversations, proposals, decisions. On the other hand the resulting deliverable may require a version history and release notes.

The insight that we have  these two different bounded contexts now in turn helps dig deeper the analysis, to discover that we probably need different data and different behaviors for each context.

Some examples of this split in two contexts:

  • The shopping cart is a work in progress, that once finalized becomes an order
  • A request for quote or an auction process is a collaborative construction in search of the best trade condition, and it finally concludes (or not) into a trade
  • A legal document draft is being worked on by many lawers, before it is signed off to become the legally binding contract, after the negotiations have happened.
  • An example we all know very well, our source code in source control is a work in progress between several developers, and then the continuous integration compiles it into an executable and a set of reports, all immutable. It’s ok to have compilation errors and warnings while we’re typing code. It’s ok to have checkstyle violations until we commit. Once we release we want no warning and every test to pass. If we need to change something, we simply build another revision, each release cannot change (unless we patch but that’s another gory story)

UX demanding

Building software to deal with collaborative construction is quite demanding with respect to the User Experience (UX).

Can we find examples of Collaborative Construction in software? Sure, think about Google Wave (though it did not end well), Github (successful but not ready for normal users that are not developers), Facebook (though we’re not building anything useful with it).

Watch the video of the talk

Another note, among many other I took away from the talk, is that from time to time we developers should ask the question:

what if the domain expert is wrong?

It does happen that the domain expert is going to mislead the team and the project, because he’s giving a different answer every day, or because she’s focusing on only one half of the full domain problem. Or because he’s evil…

Alberto in front of Campbell's Soup Cans, of course talking about Domain-Driven Design (picture Skillsmatter)

And don’t hesitate to watch the 50mn video of the talk, to hear many other lessons learnt, and also because it’s fun to listen to Alberto talking about zombies while talking about Domain-Driven Design!

Follow me (@cyriux) on Twitter!

Read More

What’s your signal-to-noise ratio in your code?

You write code to deliver business value, hence your code deals with a business domain like e-trading in finance, or the navigation for an online shoe store. If you look at a random piece of your code, how much of what you see tells you about the domain concepts? How much of it is nothing but technical distraction, or “noise”?

Like the snow on tv

I remember TV used to be not very reliable long ago, and you’d see a lot of “snow” on top of the interesting movie. Like in the picture below, this snow is actually a noise that interferes with the interesting signal.

TV signal hidden behind snow-like noise
TV signal hidden behind snow-like noise

The amount of noise compared to the signal can be measured with the signal-to-noise ratio. Quoting the definition from Wikipedia:

Signal-to-noise_ratio (often abbreviated SNR or S/N) is a measure used in science and engineering that compares the level of a desired signal to the level of background noise. It is defined as the ratio of signal power to the noise power. A ratio higher than 1:1 indicates more signal than noise.

We can apply this concept of signal-to-noise ratio to the code, and we must try to maximize it, just like in electrical engineering.

Every identifier matters

Look at each identifier in your code: package names, classes and interfaces names, method names, field names, parameters names, even local variables names. Which of them are meaningful in the domain, and which of them are purely technicalities?

Some examples of class names and interface names from a recent project (a bit changed to protect the innocents) illustrate that. Identifiers like “CashFlow”or “CashFlowSequence” belong to the Ubiquitous Language of the domain, hence they are the signal in the code.

Examples of classnames as signals, or as noise
Examples of classnames as signals, or as noise

On the other hand, identifiers like “CashFlowBuilder” do not belong to the ubiquitous language and therefore are noise in the code. Just counting the number of “signal” identifiers over the number of “noise” identifiers can give you an estimate of your signal-to-noise ratio. To be honest I’ve never really counted to that level so far.

However for years I’ve been trying to maximize the signal-to-noise ratio in the code, and I can demonstrate that it is totally possible to write code with very high proportion of signal (domain words) and very little noise (technical necessities). As usual it is just a matter of personal discipline.

Logging to a logging framework, catching exceptions, a lookup from JNDI and even @Inject annotations are noise in my opinion. Sometimes you have to live with this noise, but everytime I can live without I definitely chose to.

For the domain model in particular

All these discussion mostly focuses on the domain model, where you’re supposed to manage everything related to your domain. This is where the idea of a signal-to-noise ratio makes most sense.

A metric?

It’s probably possible to create a metric for the signal-to-noise ratio, by parsing the code and comparing to the ubiquitous language “dictionary” declared in some form. However, and as usual, the primary interest of this idea is to keep it in mind while coding and refactoring, as a direction for action, just like test coverage.

I introduced the idea of signal-to-code ratio in my talk at DDDx 2012, you can watch the video here. Follow me (@cyriux) on Twitter!

Credits:

TV noise picture: Some rights reserved CC par massimob(ian)chi

Read More

A touch of functional style in plain Java with predicates – Part 2

In the first part of this article we introduced predicates, which bring some of the benefits of functional programming to object-oriented languages such as Java, through a simple interface with one single method that returns true or false. In this second and last part, we’ll cover some more advanced notions to get the best out of your predicates.

Testing

One obvious case where predicates shine is testing. Whenever you need to test a method that mixes walking a data structure and some conditional logic, by using predicates you can test each half in isolation, walking the data structure first, then the conditional logic.

In a first step, you simply pass either the always-true or always-false predicate to the method to get rid of the conditional logic and to focus just on the correct walking on the data structure:

// check with the always-true predicate
final Iterable<PurchaseOrder> all = orders.selectOrders(Predicates.<PurchaseOrder> alwaysTrue());
assertEquals(2, Iterables.size(all));

// check with the always-false predicate
assertTrue(Iterables.isEmpty(orders.selectOrders(Predicates.<PurchaseOrder> alwaysFalse())));

In a second step, you just test each possible predicate separately.

final CustomerPredicate isForCustomer1 = new CustomerPredicate(CUSTOMER_1);
assertTrue(isForCustomer1.apply(ORDER_1)); // ORDER_1 is for CUSTOMER_1
assertFalse(isForCustomer1.apply(ORDER_2)); // ORDER_2 is for CUSTOMER_2

This example is simple but you get the idea. To test more complex logic, if testing each half of the feature is not enough you may create mock predicates, for example a predicate that returns true once, then always false later on. Forcing the predicate like that may considerably simplify your test set-up, thanks to the strict separation of concerns.

Predicates work so good for testing that if you tend to do some TDD, I mean if the way you can test influences the way you design, then as soon as you know predicates they will surely find their way into your design.

Explaining to the team

In the projects I’ve worked on, the team was not familiar with predicates at first. However this concept is easy and fun enough for everyone to get it quickly. In fact I’ve been surprised by how the idea of predicates spread naturally from the code I had written to the code of my colleagues, without much evangelism from me. I guess that the benefits of predicates speak for themselves. Having mature API’s from big names like Apache or Google also helps convince that it is serious stuff. And now with the functional programming hype, it should be even easier to sell!

Simple optimizations

This engine is so big, no optimization is required (Chicago Auto Show).

The usual optimizations are to make predicates immutable and stateless as much as possible to enable their sharing with no consideration of threading.  This enables using one single instance for the whole process (as a singleton, e.g. as static final constants). Most frequently used predicates that cannot be enumerated at compilation time may be cached at runtime if required. As usual, do it only if your profiler report really calls for it.

When possible a predicate object can pre-compute some of the calculations involved in its evaluation in its constructor (naturally thread-safe) or lazily.

A predicate is expected to be side-effect-free, in other words “read-only”: its execution should not cause any observable change to the system state. Some predicates must have some internal state, like a counter-based predicate used for paging, but they still must not change any state in the system they apply on. With internal state, they also cannot be shared, however they may be reused within their thread if they support reset between each successive use.

Fine-grained interfaces: a larger audience for your predicates

In large applications you find yourself writing very similar predicates for types totally different but that share a common property like being related to a Customer. For example in the administration page, you may want to filter logs by customer; in the CRM page you want to filter complaints by customer.

For each such type X you’d need yet another CustomerXPredicate to filter it by customer. But since each X is related to a customer in some way, we can factor that out (Extract Interface in Eclipse) into an interface CustomerSpecific with one method:

public interface CustomerSpecific {
   Customer getCustomer();
}

This fine-grained interface reminds me of traits in some languages, except it has no reusable implementation. It could also be seen as a way to introduce a touch of dynamic typing within statically typed languages, as it enables calling indifferently any object with a getCustomer() method. Of course our class PurchaseOrder now implements this interface.

Once we have this interface CustomerSpecific, we can define predicates on it rather than on each particular type as we did before. This helps leverage just a few predicates throughout a large project. In this case, the predicate CustomerPredicate is co-located with the interface CustomerSpecific it operates on, and it has a generic type CustomerSpecific:

public final class CustomerPredicate implements Predicate<CustomerSpecific>, CustomerSpecific {
  private final Customer customer;
  // valued constructor omitted for clarity
  public Customer getCustomer() {
    return customer;
  }
  public boolean apply(CustomerSpecific specific) {
    return specific.getCustomer().equals(customer);
  }
}

Notice that the predicate can itself implement the interface CustomerSpecific, hence could even evaluate itself!

When using trait-like interfaces like that, you must take care of the generics and change a bit the method that expects a Predicate<PurchaseOrder> in the class PurchaseOrders, so that it also accepts any predicate on a supertype of PurchaseOrder:

public Iterable<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
    return Iterables.filter(orders, condition);
}

Specification in Domain-Driven Design

Eric Evans and Martin Fowler wrote together the pattern Specification, which is clearly a predicate. Actually the word “predicate” is the word used in logic programming, and the pattern Specification was written to explain how we can borrow some of the power of logic programming into our object-oriented languages.

In the book Domain-Driven Design, Eric Evans details this pattern and gives several examples of Specifications which all express parts of the domain. Just like this book describes a Policy pattern that is nothing but the Strategy pattern when applied to the domain, in some sense the Specification pattern may be considered a version of predicate dedicated to the domain aspects, with the additional intent to clearly mark and identify the business rules.

As a remark, the method name suggested in the Specification pattern is: isSatisfiedBy(T): boolean, which emphasises a focus on the domain constraints. As we’ve seen before with predicates, atoms of business logic encapsulated into Specification objects can be recombined using boolean logic (or, and, not, any, all), as in the Interpreter pattern.

The book also describes some more advanced techniques such as optimization when querying a database or a repository, and subsumption.

Optimisations when querying

The following are optimization tricks, and I’m not sure you will ever need them. But this is true that predicates are quite dumb when it comes to filtering datasets: they must be evaluated on just each element in a set, which may cause performance problems for huge sets. If storing elements in a database and given a predicate, retrieving every element just to filter them one after another through the predicate does not sound exactly a right idea for large sets…

When you hit performance issues, you start the profiler and find the bottlenecks. Now if calling a predicate very often to filter elements out of a data structure is a bottleneck, then how do you fix that?

One way is to get rid of the full predicate thing, and to go back to hard-coded, more error-prone, repetitive and less-testable code. I always resist this approach as long as I can find better alternatives to optimize the predicates, and there are many.

First, have a deeper look at how the code is being used. In the spirit of Domain-Driven Design, looking at the domain for insights should be systematic whenever a question occurs.

Very often there are clear patterns of use in a system. Though statistical, they offer great opportunities for optimisation. For example in our PurchaseOrders class, retrieving every PENDING order may be used much more frequently than every other case, because that’s how it makes sense from a business perspective, in our imaginary example.

Friend Complicity

Weird complicity (Maeght foundation)

Based on the usage pattern you may code alternate implementations that are specifically optimised for it. In our example of pending orders being frequently queried, we would code an alternate implementation FastPurchaseOrder, that makes use of some pre-computed data structure to keep the pending orders ready for quick access.

Now, in order to benefit from this alternate implementation, you may be tempted to change its interface to add a dedicated method, e.g. selectPendingOrders(). Remember that before you only had a generic selectOrders(Predicate) method. Adding the extra method may be alright in some cases, but may raise several concerns: you must implement this extra method in every other implementation too, and the extra method may be too specific for a particular use-case hence may not fit well on the interface.

A trick for using the internal optimization through the exact same method that only expects predicates is just to make the implementation recognize the predicate it is related to. I call that “Friend Complicity“, in reference to the friend keyword in C++.

/** Optimization method: pre-computed list of pending orders */
private Iterable<PurchaseOrder> selectPendingOrders() {
  // ... optimized stuff...
}

public Iterable<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
  // internal complicity here: recognize friend class to enable optimization
  if (condition instanceof PendingOrderPredicate) {
     return selectPendingOrders();// faster way
  }
  // otherwise, back to the usual case
  return Iterables.filter(orders, condition);
}

It’s clear that it increases the coupling between two implementation classes that should otherwise ignore each other. Also it only helps with performance if given the “friend” predicate directly, with no decorator or composite around.

What’s really important with Friend Complicity is to make sure that the behaviour of the method is never compromised, the contract of the interface must be met at all times, with or without the optimisation, it’s just that the performance improvement may happen, or not. Also keep in mind that you may want to switch back to the unoptimized implementation one day.

SQL-compromised

If the orders are actually stored in a database, then SQL can be used to query them quickly. By the way, you’ve probably noticed that the very concept of predicate is exactly what you put after the WHERE clause in a SQL query.

Ron Arad designed a chair that encompasses another chair: this is subsumption

A first and simple way to still use predicate yet improve performance is for some predicates to implement an additional interface SqlAware, with a method asSQL(): String that returns the exact SQL query corresponding for the evaluation of the predicate itself. When the predicate is used against a database-backed repository, the repository would call this method instead of the usual evaluate(Predicate) or apply(Predicate) method, and would then query the database with the returned query.

I call that approach SQL-compromised as the predicate is now polluted with database-specific details it should ignore more often than not.

Alternatives to using SQL directly include the use of stored procedures or named queries: the predicate has to provide the name of the query and all its parameters. Double-dispatch between the repository and the predicate passed to it is also an alternative: the repository calls the predicate on its additional method selectElements(this) that itself calls back the right pre-selection method findByState(state): Collection on the repository; the predicate then applies its own filtering on the returned set and returns the final filtered set.

Subsumption

Subsumption is a logic concept to express a relation of one concept that encompasses another, such as “red, green, and yellow are subsumed under the term color” (Merriam-Webster). Subsumption between predicates can be a very powerful concept to implement in your code.

Let’s take the example of an application that broadcasts stock quotes. When registering we must declare which quotes we are interested in observing. We can do that by simply passing a predicate on stocks that only evaluates true for the stocks we’re interested in:

public final class StockPredicate implements Predicate<String> {
   private final Set<String> tickers;
   // Constructors omitted for clarity

   public boolean apply(String ticker) {
     return tickers.contains(ticker);
   }
 }

Now we assume that the application already broadcasts standard sets of popular tickers on messaging topics, and each topic has its own predicates; if it could detect that the predicate we want to use is “included”, or subsumed in one of the standard predicates, we could just subscribe to it and save computation. In our case this subsumption is rather easy, by just adding an additional method on our predicates:

 public boolean encompasses(StockPredicate predicate) {
   return tickers.containsAll(predicate.tickers);
 }Subsumption is all about evaluating another predicate for "containment". This is easy when your predicates are based on sets, as in the example, or when they are based on intervals of numbers or dates. Otherwise You may have to resort to tricks similar to Friend Complicity, i.e. recognizing the other predicate to decide if it is subsumed or not, in a case-by-case fashion.

Overall, remember that subsumption is hard to implement in the general case, but even partial subsumption can be very valuable, so it is an important tool in your toolbox.

Conclusion

Predicates are fun, and can enhance both your code and the way you think about it!

Cheers,

The single source file for this part is available for download cyriux_predicates_part2.zip (fixed broken link)

Read More

Key insights that you probably missed on DDD

As suggested by its name, Domain-Driven Design is not only about Event Sourcing and CQRS. It all starts with the domain and a lot of key insights that are too easy to overlook at first. Even if you’ve read the “blue book” already, I suggest you read it again as the book is at the same time wide and deep.

You got talent

The new "spoken" language makes heavy use of the thumb
A new natural language that makes heavy use of your thumbs

Behind the basics of Domain-Driven Design, one important idea is to harness the huge talent we all have: the ability to speak, and this talent of natural language can help us reason about the considered domain.

Just like multi-touch and tangible interfaces aim at reusing our natural strength in using our fingers, Eric Evans suggests that we use our language ability as an actual tool to try out loud modelling concepts, and to test if they pass the simple test of being useful in sentences about the domain.

This is a simple idea, yet powerful. No need for any extra framework or tool, one of the most powerful tool we can imagine is already there, wired in our brain. The trick is to find a middle way between natural language in all its fuzziness, and an expressive model that we can discuss without ambiguity, and this is exactly what the Ubiquitous Language addresses.

One model to rule them all

Another key insight in Domain-Driven Design is to identify -equate- the implementation model with the analysis model, so that there is only one model across every aspect of the software process, from requirements and analysis to code.

This does not mean you must have only one domain model in your application, in fact you will probably get more than one model across the various areas* of the application. But this means that in each area there must be only one model shared by developers and domain experts. This clearly opposes to some early methodologies that advocated a distinct analysis modelling then a separate, more detailed implementation modelling. This also leads naturally to the Ubiquitous Language, a common language between domain experts and the technical team.

The key driver is that the knowledge gained through analysis can be directly used in the implementation, with no gap, mismatch or translation. This assumes of course that the underlying programming language is modelling-oriented, which object oriented languages obviously are.

What form for the model?

Text is supplemented by pictures
Text is supplemented by pictures

Shall the model be expressed in UML? Eric Evans is again quite pragmatic: nothing beats natural language to express the two essential aspects of a model: the meaning of its concepts, and their behaviour. Text, in English or any other spoken language, is therefore the best choice to express a model, while diagrams, standard or not, even pictures, can supplement to express a particular structure or perspective.

If you try to express the entirety of the model using UML, then you’re just using UML as a programming language. Using only a programming language such as Java to represent a model exhibits by the way the same shortcoming: it is hard to get the big picture and to grasp the large scale structure. Simple text documents along with some diagrams and pictures, if really used and therefore kept up-to-date, help explain what’s important about the model, otherwise expressed in code.

A final remark

The beauty in Domain-Driven Design is that it is not just a set of independent good ideas on why and how to build domain models; it is itself a complete system of inter-related ideas, each useful on their own but that also supplement each other. For example, the idea of using natural language as a modelling tool and the idea of sharing one same model for analysis and implementation both lead to the Ubiquitous Language.

* Areas would typically be different Bounded Contexts

Read More

Domain-Driven Design: where to find inspiration for Supple Design? [part2]

This is the second part of this topic, you can find the first part here.

Maths (cont’ed)

Abstract algebra

Another classical example of mathematical concept that I find insightful for modeling is the concept of Group in the field of Abstract Algebra. To be honest, I’d guess this is what Eric Evans had in mind when he talks about Closure of Operations (where it fits, define an operation whose return type is the same as the type of its arguments) or when he mentions arithmetic as a known formalism:

Draw on Established Formalisms, When You Can […] There are many such formalized conceptual frameworks, but my personal favorite is math. […] It is surprising how useful it can be to pull out some twist on basic arithmetic.

The usual integer numbers and their addition operation is an example of Groups that everybody knows (although integers are also examples of other kind of algebraic structures).

Limits as Special Cases

Going further, the concept of zero and of limits such as infinity can be inspirations for your domain. Consider for example the domain element Currency, with values EUR, USD, JPY etc. Traders can typically request to monitor prices for one currency, but they can also request to see all currencies or none.

Just like mathematicians do when they invent limit concept such as infinity or zero (a special value that does not really exist in the world but that is convenient), we can expand Currency with the Special Case values ALL and NONE, each with their own special behavior, i.e. behave like any currency or none at all (typically this means their methods return true or false respectively all the time). Trust me, it is very convenient.

You’ll recognize that the Null Object pattern generalizes the very concept of zero for arbitrary concepts.

Principles

Dimensions in the domain

Something I clearly learnt from maths classes at school is thinking in separate dimensions, and how looking for orthogonal dimensions helps tremendously.

Given a cloud of concepts spread over the whiteboard, if you can sort them out onto 1, 2 or 3 separate dimensions (the less the better), and if these dimensions are orthogonal (independent to each other), then your design gets greatly enhanced, provided of course it still makes sense in the domain.

For example we may have a dimension that groups the concepts of user, user group and company, that is probably orthogonal to another dimension that groups the concepts of financial instrument, product segment and market. Once the two dimensions are identified, we’ll seldom discuss concepts from both dimensions at the same time, because we know that they are independent.

Yet simple, the formalism of dimension is valuable as an inspiration for supple design. Also once you think of explicit dimensions you may consider extrapolating known concepts such as ordering or interval, if they make sense for your domain. In other words your domain gets more formal, which at the end of the day, once coded, will means less code and less bugs.

Symmetries in the domain

symmetry1
This chair has 4 identical legs: lots of symmetries

One of the biggest possible inspiration for Supple Design is the idea of symmetry, as Paul Rayner quotes from Eric Evans in the 5th part of his series: Domain-Driven Design (DDD) Immersion Course – Part #5

Use symmetry as principle of Supple Design

I love this idea! Indeed, great physicists like Murray Gell-Mann have discovered a lot guided by symmetry.

When you look at elements of the domain, you can choose to focus on how we deal with them, rather than looking at what they are. Considering each perspective in turn helps to find out what properties the domain elements may have in common (for more see don’t make things artificially different).

Sometimes concepts looks fundamentally different just because some of them are singular whereas other are plural. The Composite pattern offers a solution for that. You can unveil symmetry just by looking from the right viewpoint.

Deeper insight than the domain experts?

Domain-Driven Design advises that the domain shall not deviate from concepts and ideas that the users really know and really talk of. However it may happen that by analysing the domain you uncover structures than the people in the business are not even aware of. In fact they may well manipulate the structure in practice without being conscious of them.

I’ve come across such a case in interest rates derivatives trading, where swap traders think of spreads and butterfly as products by themselves, and intuitively know how to combine them for their purpose. However when you want a machine to manipulate these products, you’d better formalize a bit the concepts, that are otherwise intuitive. In our case, simple linear combinations proved to be a good formal framework for spreads and butterflies, even though hardly any traders speaks of the topic like that.

Influences everywhere

I recommend browsing various fields for behaviours or structures that look similar to the domain being modelled. This yields benefits even if you don’t find a corresponding structure, since when you compare your problem to another one you must ask good questions that will help get insights anyway.

With some practice you may even anticipate, and whatever concept you encounter gets automatically “indexed” in your brain just in case you would need it later. No need to remember the concept, just that it exists.

What’s important is not that you know everything, there are books and Wikipedia for that. What’s important is to be ready to dedicate some time looking for a formalism that is well-suited for your domain.

The more demanding you are when making the Design Supple, the more reward you’ll get over time, and by this I mean less ad hoc adjustments, error corrections and tricky handling of special cases.

Read More

Domain-Driven Design: where to find inspiration for Supple Design? [part1]

Domain-Driven Design encourages to analyse the domain deeply in a process called Supple Design. In his book (the blue book) and in his talks Eric Evans gives some examples of this process, and in this blog I suggest some sources of inspirations and some recommendations drawn from my practice in order to help about this process.

When a common formalism fits the domain well, you can factor it out and adapt its rules to the domain.

A known formalism can be reused as a ready-made, well understood model.

Obvious sources of inspiration

Analysis patterns

It is quite obvious in the book, DDD builds clearly on top of Martin Fowler analysis patterns. The patterns Knowledge Level (aka Meta-Model), and Specification (a Strategy used as a predicate) are from Fowler, and Eric Evans mentions using and drawing insight from analysis patterns many times in the book.Analysis Patterns: Reusable Object Models (Addison-Wesley Object Technology Series)

Reading analysis patterns helps to appreciate good design; when you’ve read enough analysis patterns, you don’t even have to remember them to be able to improve your modelling skills. In my own experience, I have learnt to look for specific design qualities such as explicitness and traceability in my design as a result of getting used to analysis patterns such as Phenomenon or Observation.

Design patterns

Design patterns are another source of inspiration, but usually less relevant to domain modelling. Evans mentions the Strategy pattern, also named Policy (I rather like using an alternative name to make it clear that we are talking about the domain, not about a technical concerns), and the pattern Composite. Evans suggests considering other patterns, not just the GoF patterns, and to see whether they make sense at the domain level.

Programming paradigms

Eric Evans also mentions that sometimes the domain is naturally well-suited for particular approaches (or paradigms) such as state machines, predicate logic and rules engines. Now the DDD community has already expanded to include event-driven as a favourite paradigm, with the  Event-Sourcing and CQRS approaches in particular.

On paradigms, my design style has also been strongly influenced by elements of functional programming, that I originally learnt from using Apache Commons Collections, together with a increasingly pronounced taste for immutability.

Maths

It is in fact the core job of mathematicians to factor out formal models of everything we experience in the world. As a result it is no surprise we can draw on their work to build deeper models.

Graph theory

The great benefit of any mathematical model is that it is highly formal, ready with plenty of useful theorems that depend on the set of axioms you can assume. In short, all the body of maths is just work already done for you, ready for you to reuse. To start with a well-known example, used extensively by Eric Evans, let’s consider a bit of graph theory.

If you recognize that your problem is similar (mathematicians would say isomorphic or something like that) to a graph, then you can jump in graph theory and reuse plenty of exciting results, such as how to compute a shortest-path using a Dijkstra or A* algorithm. Going further, the more you know or read about your theory, the more you can reuse: in other words the more lazy you can be!

In his classical example of modelling cargo shipping using Legs or using Stops, Eric Evans, could also refer to the concept of Line Graph, (aka edge-to-vertex dual) which comes with interesting results such as how to convert a graph into its edge-to-vertex dual.

Trees and nested sets

Other maths concepts common enough include trees and DAG, which come with useful concepts such as the topological sort. Hierarchy containment is another useful concept that appear for instance in every e-commerce catalog. Again, if you recognize the mathematical concept hidden behind your domain, then you can then search for prior knowledge and techniques already devised to manipulate the concept in an easy and correct way, such as how to store that kind of hierarchy into a SQL database.

Don’t miss the next part: part 2

  • Maths continued
  • General principles

Read More

Your cross-cutting concerns are someone else core domain

Consider a domain, for example an online bookshop project that we call BuyCheapBooks. The Ubiquitous Language for this domain would talk about Book, Category, Popularity, ShoppingCart etc.

Business Domains

From scratch, coding this domain can be quite fast, and we can play with the fully unit-tested domain layer quickly. However if we want to ship, we will have to spend several times more effort because of all the extra cross-cutting concerns we must deal with: persistence, user preferences, transactions, concurrency and logging (see non-functional requirements). They are not part of the domain, but developers often spend a large amount of their time on them, and by the way, middleware and Java EE almost exclusively focus on these concerns through JPA, JTA, JMX and many others.

On first approximation, our application is made of a domain and of several cross-cutting concerns. However, when it is time to implement the cross-cutting concerns, they each become the core domain -a technical one- of another dedicated project in its own right. These technical projects are managed by someone else, somewhere not in your team, and you would usually use these specific technical projects to address your cross-cutting concerns, rather than doing it yourself from scratch with code.

Technical Domains

For example, persistence is precisely the core domain of an ORM like Hibernate. The Ubiquitous Language for such project would talk about Data Mapper, Caching, Fetching Strategy (Lazy Load etc.), Inheritance Mapping (Single Table Inheritance, Class Table Inheritance, Concrete Table Inheritance) etc. These kinds of projects also deal with their own cross-cutting concerns such as logging and administration, among others.

Logging is the core domain of Log4j, and it must itself deal with cross-cutting concerns such as configuration.

domain_ccc1

In this perspective, the cross-cutting concerns of a project are the core domains of other satellite projects, which focus on technical domains.

Hence we see that the very idea of core domain Vs. cross-cutting concerns is essentially relative to the project considered.

Note, for the sake of it, that there may even be cycles between the core domains and the required cross-cutting concerns of several projects. For example there is a cycle between a (hypothetical) project Conf4J that focuses on configuration (its core domain) and that requires logging (as a cross-cutting concern), and another project Log4J that focuses on logging (its core domain) and that requires configuration (as a cross-cutting concern).

Conclusion

There is no clear and definite answer as to whether a concept is part of the domain or whether it is just a cross-cutting concern: it depends on the purpose of the project. There is almost always a project which domain addresses the cross-cutting concern of another.

For projects that target end-users, we usually tend to reuse the code that deals with cross-cutting concerns through middleware and APIs, in order to focus on the usually business-oriented domain, the one that our users care about. But when our end-users are developers, the domain may well be technical.

Read More