# Degrees of freedom analysis

The concept of degrees of freedom looks so relevant to software development that I am wondering why it is not considered more often. Fortunately Michael L. Perry dedicates a full section of his blog to that concept. In this post I will quote a lot, please consider that as a sign of enthusiasm.

# A common concept in maths

The concept of DOF is central in solving systems of linear equations, and in the main post of his series, Michael L. Perry starts by focusing on mathematical system of linear equations:

A mathematical model of a problem is written with equations and unknowns. You can think of the number of unknowns as representing the number of dimensions in the problem.

Depending on the number m of equations compared to the number n of unknowns (variables), there are several cases:

1. m > n: there is no solution, the problem is over-constrained
2. m = n: there is only one solution
3. m < n: the system is under-determined system, and the dimension of the solution set is usually equal to n ? m, where n is the number of variables and m is the number of equations.

# A common concept in mechanics

The concept of DOF is prevalent in mechanics. In particular, a system with more internal constraints than the total possible number of DOFs has no solution. However in practice it can still work, provided some of the bodies are not absolutely rigid.

The table is often given as an example, because it only needs three legs to be stable on the ground, but usually has 4 legs. This only works because the table is not fully rigid, and can accommodate the small imperfection of the ground.

# Not yet common in software

In his post, Michael L. Perry explains in practice how to analyse software using DOF. First find the unknowns:

To identify the degrees of freedom in software, start by defining the unknowns. These are usually pretty simple to spot. These are the things that can change. In a checkbook program, for example, each transaction amount is an unknown, as are the the account balance and the color used to display it (black or red).

Then find out the constraints between the DOFs.

Next, define the equations. These are the relationships between the unknowns that the software has to enforce. In the checkbook, the balance is the sum of all transaction amounts. And the color is red if the balance is negative or black otherwise.

Finally:

Subtract to find your degrees of freedom. One amount per transaction (n), one balance, and one color gives n+2 unknowns. The balance sum and the color rule give us two equations. n+2-2 = n degrees of freedom, one per transaction.

# What for?

Quoting again Michael L. Perry (across various posts in the DOF category):

Understanding the degrees of freedom in the software helps to create a maintainable design.

Adding independent data to a system increases its degrees of freedom. Adding dependent data does not. Adding an immutable field does not.

You want no more degrees of freedom in the system than the problem calls for.

The concept of degree of freedom is remarkably useful to help distilling the domain down to the essential variable parts and the constraints between them. Any extra independent data can only create opportunities for bugs.

# Pattern grammar for the variant problem

For tools to be aware of patterns, the patterns must be formalized, at least partially. At this point I must quote Gregor Hohpe to clarify my thoughts, as I strongly agree with his skipticism:

Typically, when people ask me about “codifying” or “toolifying” patterns my first reaction is one of skepticism. Patterns are meant to be a human-to-human communication mechanism, not a human-to-machine mechanism. After all, I have pointed many people to the fact that a pattern is not just the piece of code in the example section. It’s the context-problem-forces-solution combination that makes patterns so useful.

Patterns link together a problem part to a solution part. This is expressed within the limits a stated context outside of which it is no more applicable. Patterns also emphasize the forces involved, that you must consider to decide how and whether or not to apply the pattern.

Patterns litterature usually describes examples of application of patterns. In your project, you will have to do more work to adapt the pattern solution to your exact need. A pattern solution may be stretched a lot, but the pattern remains as long as humans still recognize its presence. Every different way of applying a pattern is called a pattern variant.

Formal descriptions of anything human is too restrictive, and this is especially true for patterns which are the product of human analysis, in that they resist simple formalization. However if we focus on sub-parts of patterns, it becomes easier to formalize them, at least for their solution part.

For example a design pattern that uses (in its solution part) some form of inheritance admits several variants. At first, the pattern solution seems to resist against its formalization. However if we now focus on the inheritance part only, we can enumerate every possible alternative for it. For example we can use:

• interface and classes that implement it
• abstract class and classes that extend it
• concrete class provided it is not final (assume we’re in Java), so that it can be extended

Notice that each alternative is a solution to the same problem “How to realize some

form of inheritance”. We can say that each alternative is indeed a pattern, and by the way Mark Grand already described them in Patterns in Java Volume 1. These patterns are easy to formalize, as they can be precisely described in terms of programming language elements.

How can we split a pattern into parts? The idea is to identify the areas that are fragile with respect to the variant problem in the solution part of a pattern, and to consider them as lower-level problems embedded inside the bigger pattern.

In the example before, the problem was to achieve “some form of inheritance”, and we listed three patterns that address this problem.

Provided it can be split into sub-parts (hence into smaller problems), any pattern solution can be formalized by recursively formalizing its sub-parts. If a sub-part cannot be easily made formal, then it can be split again into sub-parts, and so on until each sub-part can.

Given a pattern solution that we want to formalize:

• If it can be described formally directly, then we are done (terminal)
• If it cannot be fully described formally, then extract the problematic sub-parts into sub-problems, then find every pattern that addresses each sub-problem, and formalize their solution part

We can then represent a pattern as a tree of smaller patterns, where the solution part of each patter is connected to the problem part of another pattern.

# From pattern language to pattern grammar

When the pattern is applied into the code, at each node in the tree there is actually a selection of which variant of the sub pattern to use. As such, each selected pattern represents an atom of decision in the design process.

It then appears that we have a form of grammar for patterns, where there are terminal patterns solutions T (easier to formalize in term of programming language elements), non-terminal parts of patterns solutions N (that cannot be easily formalized but that can formally ask for help to solve their sub-problems), and where the production rules P are nothing but the patterns themselves:

```Patterns are production rules that link:
elements of N (the problems) to elements of (N Union T)```

# Conclusion

I have suggested quickly a way to formalize patterns solutions in spite of their fragility with respect to their variants. This approach identifies patterns as production rules in some grammar over the set of patterns considered.

This perspective is well-suited for tools to work on patterns in real-world projects, where the patterns are indeed applied in many variant forms. The problem of this approach is that every known pattern that is variant-fragile must be reconsidered and have its solution split into a formal part and one or several sub problems to be addressed by specific, lower-level patterns, which themselves must be formalised in the same way.

It is essential that for every problem (“intent”) we can enumerate every pattern that addresses it. Intents can be also classified as a taxonomy, where some intents are specialized versions of more generic intents.

This approach does not claim to formalize the full potential of patterns, it only aims at enabling tools to understand patterns that are already there so that to assist the developers for various tasks.

# Patterns in general, not just design patterns

Over time, patterns have appeared on many different topics, not all related to programming. Here is a list of patterns and pointers to other lists of patterns, to illustrate two things:

1. Knowledge and experience in general can be packaged into patterns, often using the pattern form. Patterns are convenient for reuse, in any domain.
2. There are already plenty of patterns, and they cover a really wide range of situations. Given the number of patterns available today, whatever your problem, you will likely find helpful patterns for it.

This should encourage you to search for existing patterns whenever you need additional insight, or just an already documented reference for what you’ve just done (documenting the design of software using patterns).

The list in this post will grow, but without the ambition of listing every possible pattern.

# Software Design

Here are two essential lists of books on patterns:

Other books and/or websites focus on patterns for software development, each with a specialized perspective:

Please notice that while many patterns are to be applied directly to software design or source code, there are also many patterns about the process of building software.

This list could not be complete without referring to the big Pattern Almanach by Linda Rising, available on her website here in PDF or in a web version here.

The Hillside.net Patterns catalog and the many papers published after each PLoP also provide many patterns that may well cover your problem.

# Misc

Probably any domain can benefit from using patterns to represent chunks of knowledge and experience. Here are some examples quite foreign to software developement:

# Conclusion

Though this listing is far from extensive (there are many independently published articles on patterns everywhere), it shows that the pattern community has been quite active to mine as many patterns it could, and this knowledge has been carefully documented in the pattern form, ready for reuse in your next projects.

Of course you don’t need to know them all, not even read them all in advance. However it does help to be aware that for most problems, there probably already exists a few patterns that can prove helpful, and they are only an Internet search away from your needs!

# Patterns as stored arrangements: toward more efficient tools

In nature, out of every possible arrangement of several elements, only a few arrangements are stable. This is illustrated with atoms combined together, or smaller particles arranged together into atoms, where not every combination is sustainable.

Unstable arrangements tend to move toward stable ones over time. Whenever you observe the elements, you mostly see stable arrangements of them. Because there is only a relatively small number of stable arrangements, a brain can be trained to recognize them, and they can even be named and incorporated into the language.

# Better with a brain

The capability to recognize common arrangements of elements is beneficial because it saves a lot of time and thinking. Rather than describing in the details each arrangement each time, it is therefore very economical -cheaper- to describe each stable arrangement once, and then to declare that such arrangement happens in this or that case. Saying: “this is an equilateral triangle” is times more efficient than explaining what it is: “there are 3 lines of the same length, each connected to the two others such as they form a closed path, etc.” It also enables thinking at a higher level.

# In software development

In software development, the usual elements are classes, interfaces, methods, fields, associations (implementation, delegation, instantiation) and various constraints between them. Given a few of these elements we can form many possible arrangements, however only a relatively small number of them is useful and of interest. This happens because the useless arrangements tend to be quickly replaced by the skill-full developer into other that are more useful. For example, any arrangement of two distinct classes that depend to each other, forming a cycle, is usually not desirable. Patterns authors have been working for almost two decades to inventory as many useful arrangements as they could find, resulting into many books and publications in every domain.

Common and stable arrangements of two to three classes together form the basis for design patterns as in the GoF book, an idea I have experimented in a previous post: systematic combination of subpatterns generates design patterns.

Common stable arrangements of methods, fields and how they relate with each other within one class are simply stereotypes of classes, which we tend to call patterns anyway like the ValueObject pattern in the Domain Driven Design book.

Note that in this discussion we are focusing on arrangements of programming elements in the solution space, not in the problem space, but pattern express intents too.

# Harnessing stable arrangements of things: toward more efficient tools

I believe that making explicit the use of predefined common stable arrangements of programming elements, in the coding process, can boost the efficiency of many tools. I also believe that such common and stable arrangements of programming elements have already been identified and are already well-documented in the existing pattern literature.

Rather than configuring tools at the programming element level (class, field, method etc.), if the code is augmented with explicit declarations of the patterns used, the tools can then be configured at the pattern level. For each tool, the idea is to prepare in advance how to configure it for each supported pattern. This preparation must be automated, so that given an occurrence of a known pattern in the code base, the configuration of the tool can be automatically derived from the particular details of the pattern occurrence.

In other words:

```tool configuration=
auto-configuration(pattern, tool) + pattern occurrence```

## A simple case study

To start with an example, let us consider the pattern Abstract Factory, that defines an Abstract Factory interface and one or more Product interface(s). Then assume that in our code base we have an occurrence of this pattern, where the interface WidgetFactory participates as the Abstract Factory, and the interfaces Window and Button participate as Product. Concrete classes form two families, one Linux family (LinuxWidgetFactory, LinuxWindow, LinuxButton) and one Mac family (MacWidgetFactory, MacWindow, MacButton), where each concrete class participates as ConcreteFactory, ConcreteProduct and ConcreteProduct respectively.

## Dependencies restrictions (à la Jdepend + Junit)

The auto-configuration(AbstractFactory pattern, dependency checker tool) could be programmed like the following:

```//Given a pattern occurrence from the actual base: occ

//Factory interface knows about the Product interface, not the other way round
For each Product in occ, add constraint (Product, Must not depend upon, AbstractFactory)

//ConcreteProduct must not know about the AbstractFactory
For each ConcreteProduct participant in occ, add constraint (ConcreteProduct, Must not depend upon, AbstractFactory)

//ConcreteProduct must not know about the ConcreteFactory
For each ConcreteProduct participant in occ, add constraint (ConcreteProduct, Must not depend upon, ConcreteFactory)

//Interfaces must not depend upon their implementor
For each abstract participant in occ, add constraint (participant, Must not depend upon, implementor of participant)```

I have expanded the auto-configuration script to highlight how we can do more sophisticated configuration as soon as it is supposed to be reused many times, something we would never afford for one-shot configuration. Also in the above script, it is quite obvious that we can extract more powerful primitives to simplify the declaration of the script itself.

I have already presented this idea: toward smarter dependency constraints (patterns to the rescue).

## Dependency injection (à la Spring)

The auto-configuration(AbstractFactory pattern, IoC container) could be programmed like the following:

```//Given a pattern occurrence from the actual base: occ
//Given the selected family (Mac or Linux) we want to inject: F

//Bind the ConcreteFactory from F to the AbstractFactory interface
For the AbstractFactory participant in occ, and for the ConcreteFactory in occ that belongs to F, bind (ConcreteFactory, AbstractFactory)

//Bind each ConcreteProduct from F to the Product interface
For each ConcreteProduct in occ that belongs to F, bind (ConcreteProduct, corresponding Product)```

Again we can see that we can automate the binding of each implementation to its corresponding interface from the single explicit selection of the family F we want to inject. This is a great way to factor out dependency injection to make it more manageable. In particular, the configuration is closer to the way we actually think.

## Other tools

The same idea can be applied for many other tools, in particular:

# Conclusion

In this post I described how it makes sense to consider overall arrangements of programming elements as higher-level constructs, which I identify with patterns (the solution part of patterns in fact). I emphasize the fact that the useful arrangements are not that numerous, and that many of them are already documented in the pattern literature. Finally I present how such arrangements or patterns, if declared explicitly in the code, can be leveraged to automate tools configuration in a powerful way.

As usual, any feedback is highly welcome.

# Manipulating things collectively

There is great power in being able to manipulate collective things as one single thing. It gives you simplicity, hence control. You can focus your attention on it and reason about it, even though behind the hood it is made of many parts. The composite thing is kept simple, therefore you can also deal with several of them at a time. This would not be possible if you had to deal with every part they are made of, because it would be overwhelming.

There exists many strategies to deal with collective things as if they were one single thing: statistics, multiple selection, groups, classifications and super-signs.

## Statistics

Statistics is probably the most obvious way to deal with collective things, when the things can be expressed as numbers. Historically it has been used with great results in physics, thermodynamics in particular.

It is all about extracting a few macro properties that we can reason on instead of the whole set of data:

• number of elements
• mean, deviation, moments, percentiles, etc.
• regression, clusters
• total property: total weight, total volume, total price

## Multiple selection

Many software applications enable you to select multiple elements at a time in order to apply one operation to each element:

• When sending an email, you can select multiple addresses to send to
• In a word processor, you can select several words, several paragraphs, or even all the document to copy, paste or apply formatting to each element
• In a spreadsheet, you can select multiple rows or columns to apply operations to, and you can also repeat formula for each row or column

The selected elements can be of the same kind or not. However for multiple selection to be useful, they must share at least something in common: the capability of being copied or pasted, or the fact that they are specific for a particular user.

Functional programming and the three higher order functions map, fold and filter address very well how to apply operations collectively to many elements.

## Groups

When multiple selections are often needed, you can create groups. We can consider a group to be a multiple selection made explicit. You create a group and you explicitly add elements to it. Common examples of groups:

• Mailing lists are named groups of email addresses
• Vectors in maths

As for multiple selection, the elements in a group must share something in common. For example, they must all have a price. Elements of various kinds can be grouped if they relate to something common, for example  the set of various data (name, address, phone number, preferred colour and date of birth) specific for a user is called a user profile.

A group is extensional. The elements in the group may or may not know they belong to a group.

Java packages are groups, and they are declared within the same file as the elements they refer to. Java classes also group fields and methods under one name.

The Composite pattern suggests to group objects that share the same interface into a Composite that also shares the same interface. The intent is to manipulate the collective set of objects as if it was one single object, i.e. without knowing it is collective.

## Classification

You get control over multiple things if you just classify them. Given several flowers, if you classify them into categories, then you can talk about several flowers collectively without having to enumerate each of them: the category is a way to refer to several flowers with just one name.

Classifications enable intensional grouping. This means that groups are defined not by the set of their elements, but by a condition (predicate) to be satisfied. The condition can test for the category of something (is this animal a bird?), or test for its attribute (is this car red?).

Of course abstraction is one particular way to classify.

Java modifiers (private, public, abstract, interface etc.) classify Java elements, and can be used to refer collectively to them, as in “let’s generate the Javadoc for every public elements”.

## Super signs

There are elements that exhibit a special property when considered together as a whole. For example, the ink dots on the paper can be seen as letters. Letters next to each other can be seen together as words, which again can be seen together as sentences, and then again up to the novel. Collective arrangements of multiple things that together exhibit a property are called super signs.

This phenomenon is related to emergence, and only exists for a given observer if he can recognize the super sign.

In science in general, we use models to account for the collective behavior of several elements, typically objects with measurable properties, and forces in action.

In a Java program, idioms and patterns can be considered super signs for those who know them.

# Conclusion

Manipulating multiple things in a simple way really matters, it is a life saver.

In software development it is paramount because it is a lever you use to manage tons of data with no effort. The art is to find the way you think about collective things that reduces the most your effort.

I already mentioned this topic in previous posts: group together things that go together, don’t make things artificially different, and my definition of abstraction, because abstraction is an essential way to refer to different things in what they share in common.

# Essential patterns books for today object-oriented developers

The more patterns developers know, the most efficient they become within a team: it only takes one or two words (the pattern name) to communicate a design decision or proposal, instead of 10 mn of explanations. Communication also gets much more accurate and to-the-point (or less fuzzy).

Because patterns often form a pattern language, not only they offer a standard vocabulary but they also help structuring the mind by their relationships: patterns can be related as exclusive alternatives, or rather often-go-together, which is useful.

Patterns are not always supposed to be “tricks” to learn, or extra complication to introduce to a design; indeed even obvious and uninteresting pattern are worth reading in books, just for the social advantage of being able to reference them. Going further, sometime you can just go to the book to find out which is the pattern that you just did without knowing its name…

 The most essential book, a definitive must-have. Every experienced software developer must know these 23 design patterns.

 The essential complement to the GoF, to extend the benefits of patterns from design to analysis (closer to the problem to solve in the domain). Each pattern will not necessarily teach you a new solution, but will always give you have a standard name for each solution, which is already worth the book.

3.  Domain-Driven Design: Tackling Complexity in the Heart of Software by Eric Evans
 A book that can change the way you deal with software. Considering the domain as the main driver of the design is so powerful an approach! Works best together with the analysis patterns.

 Other analysis patterns to solve many technical problems, useful complement to Analysis Patterns.

 Very complete pattern language (both test and visual vocabulary) about messaging, will improve the communication within a team a lot, both in efficiency and in accuracy.

 Important book, more specialized toward optimized low-level problems (web servers…). often quoted. The other volumes are less relevant for today software development on modern languages or virtual machines.

 The book I discovered design patterns in, quite affordable, but some colleagues do not like it that much.

# Analysis: don’t make things artificially different

In any order management systems a quote is not quite different of an order, just a different status of the same entity that is the description of a work to be done (status = POTENTIAL) or done (status = COMPLETED). However I usually find people are tempted to consider they are different concept with no link between them except some similar fields. As if it was easier to consider them different… I don’t think so.

The easiest way to be lazy is simply to write one thing instead of writing two things. Making quote and order two subclasses of the same abstract class is easier, and still enable to use quote-dependent or order-dependent behaviour (i-e methods).

To go further on this idea, and to be more lazy, one can also add virtual capabilities to a concept just to consider it the same as another concept:

• using the Composite pattern, to make no difference between a whole and a part
• thinking about things as special-case of other things, e-g a financial Index can be considered as an Instrument even if it is not really tradeable, only for convenience: it can be used as an underlying for derivatives in the same fashion, there is no point to consider it different unless we really need to.

A common way to forget about what things really are and use them as only one common thing is to think of their common business capability, i-e the common use a specific interface: Sellable, Tradeable, Priceable, Product (as something that can be described, quoted and sold)…

Also, sometimes there is even no need to be accurate when some processing does not actually care what things really are, which happens surprinsingly often. In these cases you’d better just use general Java supertypes: Object, Comparable, Serializable… You’ll gain higher reuse and cleaner code.

One last note about the project management point of view: the differences that do matter to split a project into subparts (to allocate work for small teams for instance) are not always the most obvious. Consider a project in an asset management house that deals with several kinds of financial instruments, say Stocks, Bonds, and Derivatives; many project managers will be tempted to split the project into a Stock subproject, a Bond subproject and another Derivatives subproject.

This is a fatal mistake, since from the asset management point of view the difference between all these kinds of instrument is really small. In this case it is very likely to have complete duplication of the code in each project, just because things have been made artificially different. In short, do not split project scope at random, most obvious classifications are, by definition (because a classification is already a successful attempt to extract commonalities), NOT good candidates to split the workload !

Initially published on Jroller on Thursday April 07, 2005