Archive for the 'Structure' Category

Holarchy 101


Herbert Simon’s parable of the watchmakers was constructed to convey his belief that complex systems will evolve from simple systems much more rapidly if there are stable intermediate forms present in that evolutionary process than if they are not present.

Arthur Koestler built on this in his 1967 book “The Ghost in the Machine”, in the process coining the term holon to denote something that is simultaneously a whole and a part, depending on how you look at it. Here is Mark Edwards explaining this duality:

Every identifiable unit of organization, such as a single cell in an animal or a family unit in a society, comprises more basic units (mitochondria and nucleus, parents and siblings) while at the same time forming a part of a larger unit of organization (a muscle tissue and organ, community and society). A holon, as Koestler devised the term, is an identifiable part of a system that has a unique identity, yet is made up of sub-ordinate parts and in turn is part of a larger whole.

Importantly, Koestler further described holons as

… autonomous, self-reliant units that possess a degree of independence and handle contingencies without asking higher authorities for instructions. These holons are also simultaneously subject to control from one or more of these higher authorities. The first property ensures that holons are stable forms that are able to withstand disturbances, while the latter property signifies that they are intermediate forms, providing a context for the proper functionality for the larger whole. [Summary text from Wikipedia]

Though the terminology is different, I am sure the key tenets of Koestler’s principles will resonate with most people in software. Certainly, the importance of meaningful wholes (within the context of a wider system) is well recognized, and reflected in established principles such as Single Responsibility and Reuse Release Equivalency. Similarly, most would agree that the ability to withstand disturbances is hugely desirable, though we generally talk about this in terms of agility (or its converse fragility). The one aspect that might jar a little is the reference to higher authorities – I’ll revisit this later in the context of Emergent Design.

Koestler also introduced the term holarchy to denote a hierarchy of holons. As I suggested in my previous post on this subject area, I rather feel that, mostly, today’s software thinking tends to buy Koestler’s notions on holons but fall down on holarchy. Specifically, we tend to pay little or no attention to the world of complexity between the low-level coding constructs (classes, methods) and the unit of deployment (jar, dll).

Just as one example of this, see Bob Martin’s Principles of OO Development. He describes five principles that apply at the class level, and six that operate at the unit of deployment. Nothing in between. Similarly, and relatedly, there are lots and lots of (not necessarily very useful) metrics that measure aspects of classes and methods, but there is an almost complete vacuum at what Booch would have called the “class cluster” level. One of the very few exceptions to this is DMS and related stability metrics for Java packages (based on Martin’s Acyclic Dependencies Principle). However, and somewhat amusingly, it would seem that these metrics only came into being because of confusion over Martin’s use of the term “package” (apparently, he actually intended this to denote unit of deployment)…
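For readers who have not met the package metrics mentioned above, here is a minimal sketch of how the distance measure is usually derived from two simpler ones, instability and abstractness. The `PackageStats` class, its field names and the sample package are all made up for illustration, and tools differ on exactly what they count as an incoming or outgoing dependency.

```python
from dataclasses import dataclass


@dataclass
class PackageStats:
    """Hypothetical per-package counts; not taken from any real tool."""
    name: str
    afferent: int        # Ca: incoming dependencies on this package
    efferent: int        # Ce: outgoing dependencies from this package
    abstract_types: int  # abstract classes and interfaces
    total_types: int

    def instability(self) -> float:
        # I = Ce / (Ca + Ce): 0 is maximally stable, 1 maximally unstable
        total = self.afferent + self.efferent
        return self.efferent / total if total else 0.0

    def abstractness(self) -> float:
        # A = abstract types / all types
        return self.abstract_types / self.total_types if self.total_types else 0.0

    def distance(self) -> float:
        # D = |A + I - 1|: how far the package sits from the "main sequence"
        return abs(self.abstractness() + self.instability() - 1.0)


billing = PackageStats("com.example.billing", afferent=2, efferent=8,
                       abstract_types=1, total_types=10)
print(billing.name, billing.instability(), billing.abstractness(),
      round(billing.distance(), 3))
```

Note that nothing here knows about nesting: each package is measured as a flat unit, which is exactly the limitation discussed above.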

The situation changes instantly if we embrace hierarchy, holarchy. I do not see this as anything particularly radical, rather just a generalizing of existing principles. However, the ramifications could be quite far reaching. In the next two posts, I will explain for example how holarchy opens the door to automated visualization and holistic measurement.

The parable of the two watchmakers


The parable of the two watchmakers was introduced by Nobel Prize winner Herbert Simon to describe the complex relationship of subsystems and their larger wholes.

There once were two watchmakers, named Hora and Tempus, who made very fine watches. The phones in their workshops rang frequently and new customers were constantly calling them. However, Hora prospered while Tempus became poorer and poorer. In the end, Tempus lost his shop. What was the reason behind this?

The watches consisted of about 1000 parts each. The watches that Tempus made were designed such that, when he had to put down a partly assembled watch, it immediately fell into pieces and had to be reassembled from the basic elements. Hora had designed his watches so that he could put together sub-assemblies of about ten components each, and each sub-assembly could be put down without falling apart. Ten of these sub-assemblies could be put together to make a larger sub-assembly, and ten of the larger sub-assemblies constituted the whole watch.
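The difference between the two designs can be put into rough numbers. The sketch below assumes a simple model that is not spelled out in the excerpt: every part placement is interrupted with a fixed probability, and an interruption scatters whatever unfinished assembly is in hand. The 1-in-100 interruption rate and the function name are assumptions chosen for illustration.

```python
def expected_placements(parts: int, p_interrupt: float) -> float:
    """Expected part placements needed to finish one assembly of `parts`
    parts, when each placement is interrupted with probability
    `p_interrupt` and an interruption resets the unfinished assembly."""
    q = 1.0 - p_interrupt  # chance that a single placement goes uninterrupted
    # Expected attempts to get `parts` uninterrupted placements in a row:
    # q**-1 + q**-2 + ... + q**-parts = (q**-parts - 1) / (1 - q)
    return (q ** -parts - 1.0) / (1.0 - q)


P_INTERRUPT = 0.01  # assumed: the phone rings once per hundred placements

# Tempus: one flat assembly of 1000 parts.
tempus = expected_placements(1000, P_INTERRUPT)

# Hora: 100 sub-assemblies, 10 larger sub-assemblies and 1 watch,
# each a stable unit built from ten components.
hora_units = 100 + 10 + 1
hora = hora_units * expected_placements(10, P_INTERRUPT)

print(f"Tempus: about {tempus:,.0f} placements per watch")
print(f"Hora:   about {hora:,.0f} placements per watch")
print(f"Tempus needs roughly {tempus / hora:,.0f} times the effort")
```

Under these assumptions Hora pays a small overhead for the 111 stable units, while Tempus pays for every interruption with the whole watch.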

I am reasonably sure that most software people reading this little parable would be inclined to nod. For sure, modularity is and always has been a hugely desirable trait in our attempts at software development and design.

In fact, though, I would suggest that the overwhelming majority of software projects today follow the example of Tempus, who lost his shop, rather than Hora. Why?

Because, mostly, we only pay attention to aspects of modularity and component-ness at two levels of granularity: low-level code (classes, methods) at one end of the spectrum, and unit of deployment (jar, dll) at the other. Everything in between we tend to treat as a largely amorphous blob comprising hundreds or even thousands of interacting entities. Even in those cases where we do have meaningful abstractions/layers between the low-level code and the unit of deployment, these are generally invisible and unmeasured. In this context it is hardly surprising that they will tend to degrade over time.

Simon’s parable was one of the key drivers behind Koestler’s theory of holons and holarchy. I will follow up on this – and its (to my mind) huge relevance to software thinking today – in a future post.

Beautiful Structure


In response to O’Reilly’s just-published Beautiful Code, Jonathan Edwards explains why he couldn’t go along with the premise. One sentence in his excellent piece stood out for me:

“The human mind can not grasp the complexity of a moderately sized program, much less the monster systems we build today.”

This is true, but only to a point. Clearly it is humanly impossible to understand the whole “design” of a million-line code-base by studying just the lines of code. But hopefully that is not necessary.

If it’s written in an OO language, then I’m mentally constructing class diagrams as I read the code. Much better if these are visible to me – I can work on understanding the class-level structures, dipping down to the code-level as I need to.

I’m still not going to make sense of thousands of classes as a single conceptual group. But that’s not what I’m going to do. I’ll start organizing the classes into groups. Ideally this was done as the code-base evolved and I have some physical representation of these groups to help me with my task. In Java, the package structure largely serves this purpose. Understanding the package hierarchy and interdependencies (in conjunction with dipping down to the class and code-levels) is not going to be a cake walk, but if the hierarchies are well-structured, it is surely possible.

The degree to which my million-line code-base can be understood is therefore largely dependent on:

  1. How well the hierarchies are structured
  2. How good a job they do in explaining the code-base

1 is pretty easy to measure (a lack of cyclic dependencies; not too much complexity at any point of breakout in the hierarchy). But being measurable, I’d call this “quality” rather than “beauty”.

2 depends on 1, but goes a step further. It requires the inspired human touch. Herein lies the beauty.

The bottom line is that structure and architecture are an intrinsic part of the code, and any discussion of code “beauty” without them isn’t going to work for today’s monster systems.

DevX review of Structure101


"Getting your arms (and eyes) around large, complex code bases has never been easy, but Structure101 from Headway Software may just be the elegant solution to this age-old problem. Find out how this visual design tool analyzes your enterprise projects and lets you zone in on issues quickly and gracefully."  Full Story by Derek Lane.

Complexity Debt – don’t “fix it”, “keep a lid on it”


So you just discovered that your code-base has racked up a whole load of complexity debt. This may explain why progress seems so painfully slow lately. You briefly think of suggesting a major complexity-reducing refactoring effort. This will delay the next release significantly, but shorten the time to the following releases. Plus a cleaner, simpler code-base will make the world a nicer, happier place, right?

But you don’t suggest this. You’re human and self-preservation is an instinct. Precisely because of the recent slow progress, there is a lot of disappointment on the whole product delivery front at the minute. Suggesting another big delay doesn’t feel like the best career move just now.

Luckily there is another, more subtle way to get to that happier place without climbing out on long limbs over thin ice.

Don’t repay the debt in one big painful bang – just keep a lid on it. And watch it begin to dissipate as though by magic.

You use personality, charisma, leadership and/or donuts to convince your team that henceforth, they will not add any more complexity debt to the code base. Now watch what happens…

If I need to add to a method with a CC of 20 (where the threshold is say 15) and I add a couple of new paths, then I temporarily increase the complexity from 20 to 22. Uh oh, I said I wouldn’t do that. No problem – I’m working on the method already, so I have a good handle on what it does. I just extract a suitable lump into a new method with a nice helpful name and bingo, I have 2 methods each within tolerance instead of 1 over. The 2 methods are simpler and easier to understand and maintain than the 1 before, and the overall code-base debt just went down a bit. Well, I feel good about this.

But wait. That one new method pushes the containing class over the class-level complexity threshold. Again, I refactor the class while its workings are in my head already (perhaps I use move field or extract class). Again, if the class was previously over-threshold, then I probably just reduced the overall debt a bit more.

The same will happen when anyone tries to add to any overly-complex package. And as the XS framework sets thresholds at every level of design breakout, the developers are relieved of the temptation to “hide” complexity by pushing it up or down the hierarchy. The code-base becomes truly less complex, without anyone really trying.
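The rule the team agrees to can be sketched as a simple check between two snapshots of the code-base: nothing may end up further over its threshold than it was before. This is only an illustration of the idea, not how any tool implements it. The method threshold of 15 comes from the example above; the class threshold, the item names and the complexity figures are invented.

```python
def kaloi_violations(before, after, thresholds):
    """Return the items whose over-threshold complexity grew.

    `before` and `after` map (level, name) -> complexity;
    `thresholds` maps level -> the agreed limit for that level.
    """
    def excess(level, value):
        return max(0, value - thresholds[level])

    violations = []
    for (level, name), value in after.items():
        old = before.get((level, name), 0)  # new items start from zero
        if excess(level, value) > excess(level, old):
            violations.append((level, name, old, value))
    return sorted(violations)


THRESHOLDS = {"method": 15, "class": 60}  # class limit is an assumption

baseline = {("method", "Order.price"): 20, ("class", "Order"): 59}

# Step 1: add two paths to the method and extract nothing.
quick_edit = {("method", "Order.price"): 22, ("class", "Order"): 61}

# Step 2: extract a method; both methods are fine, the class is now over.
extracted = {("method", "Order.price"): 12,
             ("method", "Order.apply_discounts"): 10,
             ("class", "Order"): 61}

# Step 3: extract a class as well; nothing got worse at any level.
split_class = {("method", "Order.price"): 12,
               ("method", "Pricing.apply_discounts"): 10,
               ("class", "Order"): 40,
               ("class", "Pricing"): 21}

for label, snapshot in [("quick edit", quick_edit),
                        ("extract method", extracted),
                        ("extract class", split_class)]:
    print(label, kaloi_violations(baseline, snapshot, THRESHOLDS))
```

Because the check runs at every level with its own threshold, moving complexity from a method into its class, or from a class into its package, still shows up as a violation.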

This is cool enough to be named – how about “KALOI” for “keep a lid on it”.

KALOI is supported by Structure101 and there is more explicit support in the pipeline. More on this later.

Structure101 v2 goes GA today


Additions let you see complete slices of a code-base at any level, home in on structural complexity, view dependency graphs in matrix form, and map code items and groups (like tangles) through different hierarchies, slices and perspectives.

Spring 2’s architecture – A single dependency cycle slipped in


The Spring guys have let a single dependency cycle into their architecture. A very small flaw, but it’s a perfect example of why you need to check your code-base at different levels to keep it truly tangle-free.

I did a quick analysis of the Spring Framework some time back and sure enough found their claims of a cycle-free architecture to be correct – a pleasure to behold!

The recent announcement of Spring 2.0 rc4 prompted me to point Structure101 version 2 at it and check that they were keeping up the high standard they had set themselves. The Structure101 “notables” quickly took me to the org.springframework.aop package, which contains the following tangle:

Springtangle_1

Ok, this is not exactly a fatal flaw, far from it, but it surprised me because I know that Juergen keeps an eye on this stuff. Then I took a look at the leaf package slice (leaf packages being packages that contain classes), and guess what? Not a single tangle.  It is only when you look at the slice one level up that the tangle is apparent:

Springtangle2

Taking a look at the leaf packages contained by these 2 packages:

Springtangle3

(The package names are relative to org.springframework.aop). The dependency between the tagged packages (blue dots) is the one causing the problem. Overlaying the parent package boundaries on this graph, you can see why it is that, although the package diagram is acyclic, dependencies between the parent packages go in both directions, making them cyclically dependent.

Springtangle4

I presume they check only at the flat-package level, which is why this one slipped through the net.
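The effect is easy to reproduce in miniature. The sketch below rolls leaf-package dependencies up to their parent packages and checks both levels for cycles. The package names are invented for the example and are not the Spring packages involved; `roll_up` and `has_cycle` are illustrative helpers, not part of any tool.

```python
from collections import defaultdict


def parent(package: str) -> str:
    """Name of the enclosing package, e.g. 'app.ui.forms' -> 'app.ui'."""
    return package.rsplit(".", 1)[0]


def roll_up(deps):
    """Lift leaf-package dependencies to dependencies between their parents."""
    rolled = set()
    for src, dst in deps:
        p_src, p_dst = parent(src), parent(dst)
        if p_src != p_dst:  # dependencies inside one parent disappear
            rolled.add((p_src, p_dst))
    return rolled


def has_cycle(deps) -> bool:
    """Depth-first search for a cycle in a set of (user, used) pairs."""
    graph = defaultdict(set)
    for src, dst in deps:
        graph[src].add(dst)
    state = {}  # node -> "visiting" or "done"

    def visit(node) -> bool:
        if state.get(node) == "visiting":
            return True   # reached a node already on the current path
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        state[node] = "done"
        return found

    return any(visit(node) for node in list(graph))


# Hypothetical leaf packages: acyclic when viewed flat...
leaf_deps = {
    ("app.ui.forms", "app.model.entities"),
    ("app.model.events", "app.ui.widgets"),
}

print("leaf level cyclic:  ", has_cycle(leaf_deps))
# ...but app.ui and app.model now depend on each other.
print("parent level cyclic:", has_cycle(roll_up(leaf_deps)))
```

A check that only ever looks at the flat list of leaf packages would pass this code-base, which is presumably what happened here.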

Tracking complexity debt


Un-monitored, the complexity of a code-base increases with its size. JBoss and Struts are perfect examples. However, monitoring complexity helps you keep complexity debt under control, or even down to zero.

If you publish the last couple of years’ worth of releases of your project to a Structure101 repository, you’ll probably see something like this in the Structure101 Tracker web application:

Jbossxschart

JBoss over time

Structure101 matches the amount of XS to the lines of code that cause it. Unless someone pays attention to it, the same team will tend to code-in a consistent degree of complexity debt as they go – in this case they’re running at a not-unusual but alarming 80%.

Struts shows a similar chart running at about 57% XS:

Strutsxschart

Struts over time

I was in with the Prime Carrier guys yesterday. They’ve been tracking their excess complexity for several months now and it really shows.  Here’s the chart for one of their projects:

Pc1

Prime Carrier project 1 over time

As you can see, XS is hugging the zero line even as the size of the code-base grows. This is why we call it “excess” (XS) – it really is excessive and totally avoidable. Occasionally a cyclic dependency (tangle) or fat class may get into the build, but it’s flagged and usually fixed during the next iteration.

They recently released a new version of another product under severe deadline pressure and consciously decided to pick up a bit of short-term debt. As you can see, they’re already paying it back before it incurs too much “interest”:

Pc2

Prime Carrier project 2 over time

These are a great bunch of hard-nosed developers who are keeping the code-base clear of debt so they can focus on the principal – customer requirements.

Manage complexity like debt


Ben Hosking writes in Managing Complexity – The aim of Designing Code that:

The most important part of design is managing complexity

I like the simplicity of that. What happens if you don’t manage complexity? Well, it starts to cost. Talking at OOPSLA 2004, Ward Cunningham (Mr. Wiki) compared complexity with debt:

“Manage complexity like debt,” Cunningham told attendees. Using this analogy, he likened skipping designs to borrowing money; dealing with maintenance headaches like incurring interest payments; refactoring, which is improving the design of existing code, like repaying debt; and creating engineering policies like devising financial policies.

In an interview with Bill Venners (Artima), Andy Hunt (Pragmatic Programmer) extends the analogy concisely:

“But just like real debt, it doesn’t take much to get to the point where you can never pay it back, where you have so many problems you can never go back and address them.”

It’s a lovely metaphor. But it does break down in one place. Project managers don’t get a pile of bills through the door every month. Even if they wanted to, they can’t rip them open, sum them up, compare them against income and outgoings and discover just how fragged they are, or even, hell, that they can afford loads more debt!

Well it’s not quite that bad. We can at least measure and sum up the complexity of items at different levels of design breakout (methods, classes, packages, subsystems and projects).  We may not be able to put a hard complexity number on the tipping point (insolvency), but we can give you a number. With this you can compare projects, monitor trends that show where it’s getting more or less complex, and discover which items at what level are causing the trend.

For example here is the home page for the Structure101 Tracker web application showing the sizes and over-complexity of several projects:

Tracker
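To make “we can give you a number” concrete, here is a deliberately simplified sketch that totals over-threshold complexity per level of design breakout. It uses plain “value minus threshold” as the excess, which is a stand-in and not the XS calculation Structure101 performs. Apart from the method threshold of 15 used earlier, every threshold, name and figure is invented.

```python
def debt_report(items, thresholds):
    """Summarize over-threshold complexity per level.

    `items` is an iterable of (level, name, complexity) tuples and
    `thresholds` maps each level to its limit.
    """
    report = {}
    for level, _name, value in items:
        over = max(0, value - thresholds[level])
        row = report.setdefault(
            level, {"items": 0, "over_threshold": 0, "excess": 0})
        row["items"] += 1
        row["over_threshold"] += over > 0  # counts items above the limit
        row["excess"] += over
    return report


THRESHOLDS = {"method": 15, "class": 60, "package": 200}  # assumed limits
ITEMS = [  # hypothetical measurements
    ("method", "Order.price", 22),
    ("method", "Order.total", 6),
    ("class", "Order", 75),
    ("class", "Invoice", 30),
    ("package", "shop.orders", 180),
]

report = debt_report(ITEMS, THRESHOLDS)
for level, row in report.items():
    print(level, row)
```

Run against successive builds, totals like these are what let you see a trend and then drill down to the items causing it.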

Now, correlate XS with the depth of furrow on team leaders’ foreheads, and you’ve really got something to go on…

CAT-scan a code-base


Structure101 v2 goes beta today. With it you can walk through the code-base in slices from the class-level, to the package-level and up through the design levels, spotting tangles and seeing how far they have spread.

This is a screenshot of the Slice perspective with the slice selector highlighted:

Sliceselector

You can now see dependency graphs as matrices, which tend to be better for very large graphs (like slices). A value in a cell indicates a dependency from the column item to the row item. Here’s the equivalent of the tangle shown as a diagram above – as a matrix (highlighted) it now fits on the screen:

Smallmatrix

And here is a much bigger slice of all the classes in the code-base grouped by parent package (the orange areas).

Bigmatrix

Even zoomed way out, it is possible to pick out some patterns on the matrix. The rows and columns are ordered so that as far as possible items only use items below or to the right, so any dots (dependencies) above the diagonal indicate cyclic dependencies. Horizontal lines indicate heavily used items, vertical lines indicate items that use a lot of other items.
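The reading rule for the matrix can be sketched in a few lines. The code below follows the convention described above (a cell holds a dependency from the column item to the row item) and reports every cell above the diagonal. The item names, the ordering and the helper functions are made up for illustration; a real tool would also choose the ordering for you.

```python
def dependency_matrix(order, uses):
    """Build a square matrix where matrix[row][col] counts dependencies
    from the column item (the user) to the row item (the used)."""
    index = {item: i for i, item in enumerate(order)}
    size = len(order)
    matrix = [[0] * size for _ in range(size)]
    for user, used in uses:
        matrix[index[used]][index[user]] += 1
    return matrix


def above_diagonal(order, matrix):
    """List (user, used) pairs that fall above the diagonal, i.e. items
    that use something placed before them in the ordering."""
    size = len(order)
    return [(order[col], order[row])
            for row in range(size)
            for col in range(row + 1, size)
            if matrix[row][col]]


order = ["web", "service", "dao"]          # hypothetical items, users first
uses = [("web", "service"), ("service", "dao"), ("dao", "web")]

matrix = dependency_matrix(order, uses)
for name, row in zip(order, matrix):
    print(f"{name:8}", row)
print("above the diagonal:", above_diagonal(order, matrix))
```

Here `dao` using `web` is the one dependency that cannot be placed below the diagonal, and it is also the one that closes the cycle.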

Version 2 lets you “tag” (mark) code-level items (like methods and classes), and any higher-level item (like a package) that contains tagged items is shown as tagged. This lets you tag items in one slice and then see how it maps to other slices and hierarchies. For example, you could tag a big class-level tangle in the Slice perspective and then go to the Composition perspective to see how the tangle is distributed across the package design – it would look like this:

Taggedhierarchy
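The way tags bubble up through the hierarchy can be sketched as follows. This is an illustration of the idea rather than Structure101’s implementation, and the class names are invented: each tagged class marks every package that contains it, and counting the marks shows how a tangle is spread across the package design.

```python
from collections import Counter


def tangle_distribution(tagged_classes):
    """Count, for every enclosing package, how many tagged classes it
    contains directly or through its sub-packages."""
    counts = Counter()
    for cls in tagged_classes:
        packages = cls.split(".")[:-1]  # drop the class name itself
        for depth in range(1, len(packages) + 1):
            counts[".".join(packages[:depth])] += 1
    return counts


# Hypothetical classes tagged as members of one class-level tangle.
tagged = [
    "shop.orders.Order",
    "shop.orders.OrderLine",
    "shop.billing.Invoice",
]

distribution = tangle_distribution(tagged)
for package in sorted(distribution):
    print(package, distribution[package])
```

Any package with a non-zero count would be shown as tagged, and the counts hint at where the tangle is concentrated.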