Surface-area over volume ratio – a metaphor for software design

There’s a metaphor I’ve had in mind for a long time when thinking about software design: because I’m proudly lazy and want the code to be smaller and easier to learn, I must do my best to reduce the “surface-area over volume ratio” of the software.

Surface-area over volume ratio?

I like the surface-area over volume ratio as a metaphor to express how to make software cheaper to discover and learn, and lighter to maintain as well.

For a given object, the surface-area over volume ratio is the amount of surface area per unit volume. For buildings and for animals, the smaller this ratio, the lower the heat loss during the winter, hence the better the thermal efficiency.

Have you ever noticed that huge warehouses stay cool even in the summer heat? This is simply because, in our real 3D world, the surface-area over volume ratio gets smaller as the absolute size of the building increases: surface area grows with the square of the size, while volume grows with the cube.

The theory also tells us that the sphere is the optimal shape with respect to this ratio. In fact, the more “compact” the shape, the lower the ratio; or, the other way round, we could define the compactness of an object directly by its surface-area over volume ratio.
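As a quick check of the two claims above, here is the standard geometry for a cube of side $a$ and a sphere of radius $r$ (the symbols $a$, $r$, $S$ and $V$ are introduced here for illustration only):

```latex
\begin{align*}
\text{cube:}\quad & S = 6a^2,\; V = a^3 &&\Rightarrow\quad \frac{S}{V} = \frac{6}{a} \\
\text{sphere:}\quad & S = 4\pi r^2,\; V = \tfrac{4}{3}\pi r^3 &&\Rightarrow\quad \frac{S}{V} = \frac{3}{r} \\
\text{same volume } V:\quad & S_{\text{sphere}} = (36\pi)^{1/3}\, V^{2/3} \approx 4.84\, V^{2/3} \;<\; S_{\text{cube}} = 6\, V^{2/3}
\end{align*}
```

Doubling the size halves the ratio, which is the warehouse effect, and for the same volume the sphere needs roughly 19% less surface than the cube.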

A dodecahedron, a volume that approximates a sphere with just 2D facets (Wikipedia picture)

What about software design?

Let’s consider that each method signature of each interface is part of the surface-area of the software, because this is what I have to learn primarily when I join the project. The larger the surface-area, the more time I’ll need to learn, provided I can even remember all of it.

Larger surface is not good for the developers.

On the other hand, the implementation is part of what I would call the volume of the software, i.e. this is where the code really does its work. The more volume, the more powerful and richer the software. And of course the point of object orientation is that, unlike the interfaces and their method signatures, you don’t have to learn all of the implementation in order to work on the project.

Larger volume is good for the users (or for the value brought by the software in general).

As a consequence we should try to minimize the surface-area over volume ratio, just as we try to reduce it when designing a green building.

Can we extrapolate that we should design software to be more “compact” and more “sphere”-like?

Facets-like interfaces

Reusing the same interface as much as possible is obviously a way to reduce the surface-area of the software. Adhering to interfaces from the JDK or Google Guava, interfaces that are already well known, helps even more: in our metaphor, an interface that we don’t have to learn comes for free, like a perfectly insulated wall in a building. We can almost omit it from our ratio.
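A minimal sketch of the “free interface” idea, assuming a hypothetical PaymentSchedule class (the name and its contents are made up for illustration): by implementing the JDK’s Iterable, it lets newcomers use the for-each loop they already know instead of learning a custom traversal API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical domain class: its only public "surface" is the well-known Iterable. */
public class PaymentSchedule implements Iterable<Double> {

    private final List<Double> amounts;

    public PaymentSchedule(Double... amounts) {
        // Defensive copy, wrapped so that callers cannot modify the schedule.
        this.amounts = Collections.unmodifiableList(new ArrayList<>(Arrays.asList(amounts)));
    }

    @Override
    public Iterator<Double> iterator() {
        return amounts.iterator();
    }

    public static void main(String[] args) {
        double total = 0;
        // Nothing new to learn here: this is the standard for-each loop.
        for (double amount : new PaymentSchedule(100.0, 250.0)) {
            total += amount;
        }
        System.out.println(total); // prints 350.0
    }
}
```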

To further reduce the ratio, we can look for every opportunity to reuse a minimal set of common interfaces, even across unrelated concepts. At the extreme of this approach we get duck typing in dynamic languages. In statically typed languages like Java or C# we must instead introduce additional small interfaces, usually with a single method.

For example, in a trading system, every class with an isInCurrency(Currency) method can implement a common interface CurrencySpecific. As a result, a lot of processing (filtering etc.) on objects that are related to currencies in some way can be done on all these classes without any knowledge about them, except their currency-specificity.

In this example, the currency-specificity we extracted into one interface is like a single facet over a larger volume made of several implementations. It makes our design more compact and easier to learn, while still offering a rich set of behaviors.
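The CurrencySpecific example could be sketched as follows. Only isInCurrency(Currency) and CurrencySpecific come from the text above; the Trade and CashAccount classes, the inCurrency helper and the use of java.util.Currency are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Currency;
import java.util.List;

public class CurrencySpecificDemo {

    /** The single "facet": one method shared by otherwise unrelated classes. */
    public interface CurrencySpecific {
        boolean isInCurrency(Currency currency);
    }

    /** Hypothetical domain class. */
    public static final class Trade implements CurrencySpecific {
        private final Currency currency;
        public Trade(Currency currency) { this.currency = currency; }
        @Override public boolean isInCurrency(Currency c) { return currency.equals(c); }
    }

    /** Another hypothetical class, unrelated to Trade except for the facet. */
    public static final class CashAccount implements CurrencySpecific {
        private final Currency currency;
        public CashAccount(Currency currency) { this.currency = currency; }
        @Override public boolean isInCurrency(Currency c) { return currency.equals(c); }
    }

    /** Generic filtering that knows nothing about its items except their currency-specificity. */
    public static <T extends CurrencySpecific> List<T> inCurrency(Iterable<T> items, Currency currency) {
        List<T> result = new ArrayList<>();
        for (T item : items) {
            if (item.isInCurrency(currency)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Currency eur = Currency.getInstance("EUR");
        Currency usd = Currency.getInstance("USD");
        List<CurrencySpecific> mixed =
                Arrays.asList(new Trade(eur), new CashAccount(usd), new CashAccount(eur));
        System.out.println(inCurrency(mixed, eur).size()); // prints 2
    }
}
```

The filter is written once and works on any present or future class that exposes the same facet.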

The limit of this approach of putting a lot of implementation code under the same interface is that sometimes it makes no sense in the domain. Since code is primarily meant to describe the domain without causing confusion, we must be careful not to go too far. We must also take great care when sharing interfaces between bounded contexts: there is a high risk of excessive coupling.

Faceted artwork (picture from http://reinierdejong.wordpress.com)

Yet another metric?

This metric could be measured by a tool. However, the primary value is not in checking the figures but in the thinking: taking care to make the design easy to learn (less surface-area) while delivering a lot of valuable behavior (more volume).

What’s your signal-to-noise ratio in your code?

You write code to deliver business value, hence your code deals with a business domain like e-trading in finance, or the navigation for an online shoe store. If you look at a random piece of your code, how much of what you see tells you about the domain concepts? How much of it is nothing but technical distraction, or “noise”?

Like the snow on TV

I remember that TV used to be not very reliable long ago, and you’d see a lot of “snow” on top of the interesting movie. As in the picture below, this snow is actually noise that interferes with the interesting signal.

TV signal hidden behind snow-like noise

The amount of noise compared to the signal can be measured with the signal-to-noise ratio. Quoting the definition from Wikipedia:

Signal-to-noise ratio (often abbreviated SNR or S/N) is a measure used in science and engineering that compares the level of a desired signal to the level of background noise. It is defined as the ratio of signal power to the noise power. A ratio higher than 1:1 indicates more signal than noise.

We can apply this concept of signal-to-noise ratio to the code, and we must try to maximize it, just like in electrical engineering.

Every identifier matters

Look at each identifier in your code: package names, class and interface names, method names, field names, parameter names, even local variable names. Which of them are meaningful in the domain, and which of them are pure technicalities?

Some examples of class names and interface names from a recent project (a bit changed to protect the innocent) illustrate that. Identifiers like “CashFlow” or “CashFlowSequence” belong to the Ubiquitous Language of the domain, hence they are the signal in the code.

Examples of classnames as signals, or as noise

On the other hand, identifiers like “CashFlowBuilder” do not belong to the ubiquitous language and therefore are noise in the code. Just counting the number of “signal” identifiers over the number of “noise” identifiers can give you an estimate of your signal-to-noise ratio. To be honest I’ve never really counted to that level so far.

However, for years I’ve been trying to maximize the signal-to-noise ratio in the code, and I can demonstrate that it is totally possible to write code with a very high proportion of signal (domain words) and very little noise (technical necessities). As usual it is just a matter of personal discipline.

Logging to a logging framework, catching exceptions, a lookup from JNDI and even @Inject annotations are noise in my opinion. Sometimes you have to live with this noise, but every time I can live without it, I definitely choose to.

For the domain model in particular

All this discussion mostly focuses on the domain model, where you’re supposed to manage everything related to your domain. This is where the idea of a signal-to-noise ratio makes most sense.

A metric?

It’s probably possible to create a metric for the signal-to-noise ratio, by parsing the code and comparing to the ubiquitous language “dictionary” declared in some form. However, and as usual, the primary interest of this idea is to keep it in mind while coding and refactoring, as a direction for action, just like test coverage.
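Such a metric might look like the rough sketch below. Everything in it is an assumption made for illustration: the SignalToNoise class, its method names, the rule that an identifier counts as signal only when every one of its camel-case words is in the dictionary, and the hand-written dictionary itself. A real tool would parse the source code instead of receiving a list of identifiers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class SignalToNoise {

    /** Splits a camel-case identifier into lower-case words: "CashFlowBuilder" gives [cash, flow, builder]. */
    public static List<String> words(String identifier) {
        List<String> result = new ArrayList<>();
        // Split at every position where a lower-case letter or digit is followed by an upper-case letter.
        for (String part : identifier.split("(?<=[a-z0-9])(?=[A-Z])")) {
            result.add(part.toLowerCase(Locale.ROOT));
        }
        return result;
    }

    /** Assumed rule: an identifier is signal only if all of its words belong to the ubiquitous language. */
    public static boolean isSignal(String identifier, Set<String> dictionary) {
        for (String word : words(identifier)) {
            if (!dictionary.contains(word)) {
                return false;
            }
        }
        return true;
    }

    /** Number of signal identifiers divided by number of noise identifiers (infinite when there is no noise). */
    public static double ratio(List<String> identifiers, Set<String> dictionary) {
        int signal = 0;
        for (String identifier : identifiers) {
            if (isSignal(identifier, dictionary)) {
                signal++;
            }
        }
        int noise = identifiers.size() - signal;
        return noise == 0 ? Double.POSITIVE_INFINITY : (double) signal / noise;
    }
}
```

With a dictionary of “cash”, “flow” and “sequence”, the identifiers CashFlow and CashFlowSequence count as signal and CashFlowBuilder as noise, giving a ratio of 2:1.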

I introduced the idea of signal-to-noise ratio in my talk at DDDx 2012; you can watch the video here. Follow me (@cyriux) on Twitter!

Credits:

TV noise picture: Some rights reserved CC by massimob(ian)chi
