Software erosion in pictures – Findbugs
My particular area of interest in software these days is the importance of levels of abstraction above the raw code. In Java, the most natural place for these to manifest themselves is through the package structure (though this is certainly not the only possibility).
Recently I used Structure101 to do some analysis on the evolution of the findbugs code-base, and was rather stunned at what I saw. Here is the root dependency graph for the first public release (0.7.2) back in March 04.
This diagram shows us the top-level packages in the code-base and the interactions between them (the numbers beside the arrows denote the number of code-level references).
With just a little knowledge about what findbugs does (it is a static analysis tool that scans bytecode for potential bugs), it is easy to rationalize about how this code-base is internally architected:
- graph is a re-usable (baggage-less) data structure to model the control flow within a method body
- visitclass wraps a bytecode parser with a visitor pattern to shield the parser implementation from the interpretation of the parser callbacks
- ba (bug analyzer?) is the bit where specific rules (policies, strategies, …) are implemented
- findbugs is the controller that drives the interactions between the other components
The image below again shows the top-level breakout, but this time several releases later in Oct 04 (0.8.6).
Although the code-base has grown significantly, it is still absolutely possible to rationalize about the architecture and where the new pockets of code (annotations, xml, config, etc) fit into the “big picture”. The only apparent blemish is that io is now disconnected (perhaps dead code).
The first significant imperfection creeps in in April 05 (0.8.7).
The relationship between config and findbugs has become blurred. Since both packages are dependent on each other, it is no longer clear that either of these in isolation represent meaningful or useful abstractions, and it may make more sense to think about the relevant “component” as being the union of the two.
Skip forward just a month…
… and the confusion has spread (0.8.8). The filter package has been added but it too has a 2-way dependency with the findbugs package, so it seems reasonable to say that the whole world of controller and config (incl. filtering) is in essence a blob where the individual packages do not really contribute anything in isolation.
There is also a rogue dependency here from ba to findbugs. This is clearly contrary to the original architectural intent. The weight is just 2, so this would have been very easy to reverse out had it been spotted.
If we fast-forward a year (1.0.0), however, we see that this rogue dependency has become entrenched (the weight has increased from 2 to 99).
Moreover, more and more packages are being pulled into the tangle such that it is hard or impossible to talk about these as meaningful entities in their own right. For example, what is the point of a util package if it contains code that depends on the findbugs package?
Nevertheless, we can still see evidence of meaningful architectural decisions. For example, the bcel and asm packages are presumably wrappers for the BCEL and ASM bytecode engineering libraries that, together with the classfile package, enable an element of plug&play in terms of which library actually gets used for the analysis.
However, moving on to Nov 07 (1.3.0)…
… we see that these too have been sucked into the tangle. From now on, it seems, all testing, deployment etc. will need to include both.
And here is the most recent snapshot from September 08 (1.3.5):
This diagram doesn’t help any more – nearly all the higher-level abstractions appear to have eroded away. Moreover, a peek under the hood reveals that there is a large code-level tangle involving 43% of all the classes and spanning 33 packages – this implies that the interdependency has become deeply entrenched in the code. Shame…
For a quick view of the full history, I did up a little animated gif showing the “progression” through all 27 releases. If you are interested in something meatier, see the “Structure101 in a Nutshell Part 1″ presentation on the Headway Take a Tour page.






