Tuesday, February 01, 2011

On UML as a Modeling Standard for Internal IT Purposes

UML has evolved into a very large and fairly comprehensive standard (http://www.uml.org/). Its size and complexity allow it to be applied to most IT problems, but they can also create problems of their own.

I do believe that, with proper application, it could solve some of the problems I face in my day-to-day job.

'What we've got here is (a) failure to communicate' 
- from 'Cool Hand Luke'. 

Communication problems are the crux of the matter as I see it, and they come in two basic forms: challenges between two or more people trying to communicate an idea or situation at a point in time, and challenges communicating over time. In the first case there is an opportunity for the parties to interact and discuss. In the second case there typically is not; a document is created and then read months or years later. The original author may no longer be with the company.

Aspects of the problem I see are:
  1. Integrity (correctness and completeness), 
  2. Comprehensibility (succinctness and semantics), 
  3. Usability (navigation and accessibility)

Before I go into these in more detail and describe how UML may fit into the solution, I need to define a scope of applicability - the context within which I am thinking about this. In my current job I deal mostly with IT investments, processes, business systems (solutions), and requirements. I don't often deal with software architectures (the deep structures of software inside those business solutions), or the deep details of deployment. I also usually only deal with the results of the business requirements process and not the detailed creation of them.

Very often the results of my work (or of a team of people with whom I work) are depicted in the form of diagrams within a PowerPoint. Although the diagram titles are used over and over again (business context, system context, and anything followed by the word architecture), there are no common rules about what goes into them, no common definitions for what lines and boxes mean, and very little consistency across teams or over time. My common tools for creating and managing this information are: Visio, Excel, Word, PowerPoint, and SharePoint. Although we do get a lot done with these tools, there are challenges with the integrity, comprehensibility and usability of the resulting documents. Let me illustrate with a Visio example.

A box likely means something exists (physically or conceptually) and a line means that two things are somehow related. The precise meaning of the diagram's parts is defined by the author and may not be included in the diagram itself. This description might sound like total chaos, but that would be misleading. People within my organization have learned what those things usually mean within the context they are presented and can deal with the ambiguities and information gaps that exist. In most cases, the fact that a diagram abstracts a complicated physical deployment and represents it as a simple line is useful.

For example, I might depict the communication pattern between two applications as a simple line and maybe put the label MQ on it. That tells many of my peers quite a bit. If they also know that those two systems are deployed across our own datacenter and a vendor datacenter, then they can guess that there are likely two queue managers involved, a few MQ channels, several servers, at least two firewalls, and more IP switches than we would care to document. However, if I were the network engineer, I would likely care about each and every one of those switches and circuits and not much at all about the communication activity between the two applications. That holds until there is a problem or we need to make a major change; at that point all parties are interested in ensuring all levels are understood.

So we need to be able to communicate the idea that systemA talks to systemB and we want to show that as a simple line. We would like to be able to look inside that line to see the MQ-specific characteristics and topology. Drilling further down we would see IP socket level details, IP addresses and ports - useful to network engineers and people writing firewall rules. Below that there is even more detail: circuits, etc. Of course these layers of abstraction have been documented in another standard reference - the OSI 7-layer model (http://en.wikipedia.org/wiki/OSI_model). Should our architectural diagrams follow the same layers? Perhaps. Perhaps we should be producing artifacts that align with TOGAF 9 (http://www.togaf.com/)? Or Zachman (http://en.wikipedia.org/wiki/Zachman_Framework)? This is a major outstanding question for me. Maybe there are other possibilities.
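
As a rough illustration of the "look inside the line" idea, here is a minimal Python sketch. The class name, the level names, and all sample values (queue manager names, channel name, addresses, ports) are my own hypothetical placeholders, not part of UML or any MQ tooling. It only shows one link carrying optional detail at several levels, with an explicit answer when a level has not been documented.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    """One line on a diagram, with optional detail at deeper levels."""
    source: str
    target: str
    label: str
    # Maps a hypothetical level name (e.g. "messaging", "network") to its details.
    detail: dict = field(default_factory=dict)

    def view(self, level=None):
        """Return the simple line, or the detail recorded for one level."""
        if level is None:
            return f"{self.source} --{self.label}--> {self.target}"
        # An undocumented level returns an explicit gap rather than a guess.
        return self.detail.get(level, "not documented")


link = Link(
    source="systemA",
    target="systemB",
    label="MQ",
    detail={
        "messaging": {"queue_managers": ["QM_A", "QM_B"], "channels": ["A.TO.B"]},
        "network": {"endpoints": [("10.0.0.1", 1414), ("10.0.1.1", 1414)]},
    },
)

print(link.view())             # the two-boxes-and-a-line view
print(link.view("messaging"))  # what an MQ administrator cares about
print(link.view("circuits"))   # nobody has recorded this level yet
```

The point of the sketch is that each audience asks the same object for a different level, instead of each audience drawing its own diagram.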

Some of these frameworks provide better guidance on that than others. Zachman, for instance, provides guidance on many ways we should partition our models. TOGAF takes a different approach and is a little more concrete in some areas, but is generally less specific about deliverables. Whichever framework is chosen, I am thinking the most important part of the application of UML to the problem is getting guidance that is specific enough that two people would produce models that are similar at the semantic level. What do I mean? If I were to point two people at a running system, including all the code, deployment descriptions and operational documents, and tell them to draw the UML diagrams that describe the system at OSI layers 4 through 7, would they produce diagrams that use the same symbols in the same way? Would they define the same stereotypes and make the same profile extensions? I don't think so. There does not seem to be a standard methodology which can be applied to UML that gives good guidance on this.

I recently was involved in the review of a large number of infrastructure diagrams that were produced by a team. It was quite clear that the individual artists had different ideas and rules in their heads. One person would focus on the network topology, another on the data flow (process oriented), and yet another on the IP level session. Production and disaster recovery (DR) paths were always on the same diagram, but sometimes the DR paths were dotted or pink, sometimes not. Sometimes you could identify active-active clusters, sometimes not. If we were to re-execute the same task using UML and a common UML tool, would it be any better? I don't see that it would be. The symbols would be more consistent, but beyond that I don't expect much improvement.

It may actually be worse. When you have a free-form Visio diagram you know that there are no defined semantics for the graphical objects. But once you put them into UML and then into a repository you might think you know things with more confidence than you actually do. Garbage in, garbage out. I have seen the progression from Visio diagram to Excel spreadsheet, then all the spreadsheets aggregated into a database and queries run against it. Once we have gotten to that point we are in danger of drawing conclusions from data that was shaky to start with. When it was a Visio diagram it was unstructured, and it was apparent that only so much could be done with it. Once it becomes more structured (without adding knowledge in the process) one might lose sight of the inherent weaknesses in the source.
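
One way to keep that weakness visible after aggregation is to carry the origin of each fact along with it. The sketch below is only an illustration in Python, not something I have in place; the record fields, the source labels, and the sample rows are all hypothetical.

```python
from collections import Counter

# Each row is a fact harvested from some document. "source" records where it
# came from. These rows and labels are made up for illustration.
rows = [
    {"from": "systemA", "to": "systemB", "via": "MQ", "source": "visio-freeform"},
    {"from": "systemB", "to": "systemC", "via": "FTP", "source": "visio-freeform"},
    {"from": "systemA", "to": "systemC", "via": "MQ", "source": "verified-config"},
]


def count_links(rows, via):
    """Count links using a transport, split by the kind of source they came from."""
    return Counter(r["source"] for r in rows if r["via"] == via)


mq = count_links(rows, "MQ")
# The answer is not just "2 MQ links"; it also says how each one was learned.
print(dict(mq))
```

A query that reports only the total hides exactly the information a reader needs to judge how far to trust it.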

So back to my MQ example. In its simplest form this is two boxes and one line. Should the two applications be 'components' in a UML model? Should the MQ connection be a 'Usage', a 'Component Realization', an 'Interface Realization', or an 'Association'? Or something else? As I have poked around I do find answers, but I am left to think that somebody else doing the same job might get different guidance. When we try to aggregate our work together at some future time we might discover a large amount of rework ahead of us.
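
To make the worry concrete, here is a small Python sketch of two modelers recording the same two-boxes-and-a-line picture with different relationship kinds. The triple layout and the choices attributed to each modeler are hypothetical; the relationship names are the ones listed above.

```python
# Each model is a set of (source, relationship kind, target) triples.
model_one = {("systemA", "Usage", "systemB")}
model_two = {("systemA", "Association", "systemB")}


def endpoints(model):
    """Strip the relationship kind, leaving only which things are connected."""
    return {(src, dst) for src, _kind, dst in model}


# A naive merge keeps both triples, as if there were two different relationships.
merged = model_one | model_two
print(len(merged))

# Comparing endpoints only shows that both modelers drew the same line.
same_line = endpoints(model_one) & endpoints(model_two)
print(same_line)
```

Deciding which of the two triples survives the merge is the rework: it takes a person who knows what the line was meant to say.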

Is there an undiscovered part of the internet that holds the answer to my quest? Leave a comment and let me know what you think on the topic.