In one of ZapThink's recent Licensed ZapThink Architect courses for a US Department of Defense (DoD) contractor, we were discussing service-oriented architecture (SOA) quality. I pointed out that as a SOA implementation matures, it becomes increasingly important to manage quality continuously over the full service lifecycle, because the agility requirement for SOA makes it impractical to focus quality assurance solely on pre-deployment testing. The students then pointed out that the DoD requires them to put any new software implementation through six months of rigorous acceptance testing before deployment. Clearly, these two views of quality are at odds, which raises the question: which has to give? Does the DoD, or any other organization implementing SOA, have to sacrifice either agility or quality in order to obtain the other?
Best effort SOA quality
The answer is that no, such organizations don't have to sacrifice quality to obtain agility, but rather, they must rethink what they mean by quality in the SOA context. As ZapThink has discussed before when we explained the meta-requirement of agility and the SOA agility model, the agility requirement for SOA vastly complicates the quality challenge, because functional and performance testing aren't sufficient to ensure conformance of the SOA implementation with the business's agility requirement. In essence, the business isn't simply asking the technology team to build something with capabilities A, B, and C, but is also asking them to build something flexible enough to meet future requirements as well -- even though those requirements aren't yet defined.
Given this agility requirement, relying on traditional pre-deployment acceptance testing is impractical, as the acceptance testing process itself impedes the very agility the business requires. Instead, quality becomes an ongoing process, involving continual vigilance and attention. Quality itself, however, need not suffer, as long as the business understands the implications of implementing an agile architecture like SOA.
An intriguing analogue to the shift in perspective that SOA Quality requires appears in the telco world's move to the mobile phone network. In the old circuit-switched, plain old telephone service (POTS) days, the carriers sought to deliver the eponymous carrier-grade quality of service, with the proverbial five nines (99.999%) availability. However, the new cell phone network was entirely unable to deliver carrier-grade availability -- and even to this day, as we move to third generation mobile networks and beyond, we still live with dropped calls, dead zones, and more. Does that mean that today's mobile phone networks are essentially of lower quality than POTS? The answer is no, as long as you define quality in the context of the new environment, what the telcos call "best effort." After all, the components of this network -- cell towers, mobile phones, etc. -- have all been tested thoroughly, and the network overall delivers the quality the telcos promise and that we're willing to pay for. As long as the infrastructure delivers its best effort to complete and maintain our phone calls, we're happy.
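To put the five nines figure in perspective, the short Python sketch below converts an availability percentage into the downtime it allows per year. The function name, the 365-day year, and the three nines comparison value are illustrative assumptions, not figures from the telcos or from this article.

```python
# Minutes in a non-leap year (an assumption; leap years add 1,440 minutes).
MINUTES_PER_YEAR = 365 * 24 * 60

def downtime_minutes_per_year(availability: float) -> float:
    """Return the yearly downtime, in minutes, implied by an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR

# Five nines, the carrier-grade target mentioned in the text.
five_nines = downtime_minutes_per_year(0.99999)

# Three nines: an arbitrary comparison point, not a measured figure
# for any real mobile network.
three_nines = downtime_minutes_per_year(0.999)

print(f"five nines:  {five_nines:.2f} minutes of downtime per year")
print(f"three nines: {three_nines:.1f} minutes of downtime per year")
```

Under these assumptions, five nines leaves a little over five minutes of downtime per year, which shows how demanding the carrier-grade target is compared with a best effort service.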
Just so with SOA Quality. If we exclude the agility requirement from the quality equation, we'll never be happy with the result. But if we build agility into our quality approach, then an implementation that delivers both is within reach. There is still a tradeoff between agility and quality, however, and that tradeoff depends upon more than two variables. As a result, there's more to this story.
Beyond the software development triangle
To understand the agility/quality tradeoff question, we have to reach back into the ancient history of software development (say, more than ten years ago) and dust off the venerable Software Development (SD) Triangle, which states that any SD project has three key variables: cost, time, and scope. It's possible to fix any two of these, but the third vertex of the triangle will vary as a result. For example, if you fix the cost and time for a project, you may or may not be able to deliver on all the requirements for the project, and so forth.
Restricting the relevant variables to these three, however, assumes a fixed level of quality. In other words, if an SD stakeholder attempts to fix all three corners of cost, time, and scope, then all that's left to give way is the quality of the resulting deployment, which is rarely if ever acceptable. We might say that the SD Triangle is really a square, with quality being the fourth vertex.
SOA projects, though, vary this relationship in a fundamental way: they add agility to the equation, turning this square into a star (or a pentagon, if you will), as shown in the figure below, where the lower three vertices form the traditional SD Triangle:
Figure: The SOA Quality Star
Now, it's tempting to posit that this SOA Quality Star exhibits a fundamental five-way symmetry, where we might fix any four vertices at the expense of the fifth, but if we take a closer look, the situation isn't quite so simple. In fact, there is a second triangle embedded in the star above that illustrates some important fundamental principles of SOA Quality. This triangle that connects agility, quality, and time we'll call the Best Effort SOA Triangle, because it illustrates the problem the DoD faced in the story above: the more agile a SOA implementation becomes, the more time is required to ensure quality, and as a result, it isn't long until quality activities become so time-consuming that the agility of the implementation is at risk.
Combining the SD Triangle and the Best Effort SOA Triangle into the SOA Quality Star, then, doesn't lead us to the conclusion that we can fix any four vertices by varying the fifth, because the maximum agility we can achieve is limited by the time it takes to obtain the required quality. As a result, the only two vertices we might leave unfixed are the two that are not on the Best Effort Triangle, namely scope and cost. In other words, if we're able to specify the required agility, quality, and time for a particular SOA project, within the context of the dependencies the Best Effort Triangle expresses, then either we can set the cost of that project by adjusting the scope, or set the scope of the project by allowing for additional cost.
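One way to make this argument concrete is to model the star's vertices and check which combinations of pinned-down vertices the reasoning above allows. The Python sketch below is a simplified reading of the text, not a ZapThink tool. The names `Vertex`, `free_vertices`, and `is_workable_plan` are hypothetical. "Fixed" here means either specified directly or, for the Best Effort vertices, determined by the dependencies among them.

```python
from enum import Enum

class Vertex(Enum):
    COST = "cost"
    TIME = "time"
    SCOPE = "scope"
    QUALITY = "quality"
    AGILITY = "agility"

# The two triangles described in the text.
SD_TRIANGLE = frozenset({Vertex.COST, Vertex.TIME, Vertex.SCOPE})
BEST_EFFORT_TRIANGLE = frozenset({Vertex.AGILITY, Vertex.QUALITY, Vertex.TIME})

def free_vertices(fixed):
    """Return the vertices of the star that a plan leaves free to vary."""
    return frozenset(Vertex) - frozenset(fixed)

def is_workable_plan(fixed) -> bool:
    """Apply the article's rule as read here (an assumption of this sketch):
    something must be left free to vary, and only vertices off the
    Best Effort Triangle (cost or scope) may be left free."""
    free = free_vertices(fixed)
    return len(free) > 0 and free.isdisjoint(BEST_EFFORT_TRIANGLE)

# Scope-bounded iteration: cost is allowed to vary.
scope_bounded = {Vertex.AGILITY, Vertex.QUALITY, Vertex.TIME, Vertex.SCOPE}
# Cost-bounded iteration: scope is allowed to vary.
cost_bounded = {Vertex.AGILITY, Vertex.QUALITY, Vertex.TIME, Vertex.COST}
# Classic SD plan plus quality, leaving agility to give way.
agility_sacrificed = {Vertex.COST, Vertex.TIME, Vertex.SCOPE, Vertex.QUALITY}

for name, plan in [("scope-bounded", scope_bounded),
                   ("cost-bounded", cost_bounded),
                   ("agility sacrificed", agility_sacrificed)]:
    print(f"{name}: workable = {is_workable_plan(plan)}")
```

The check is deliberately coarse: it captures only which vertices may vary, not how much time a given level of agility and quality actually demands.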
The conclusion of this analysis is clear: the only way to obtain the balance between agility and quality the business requires is to take an iterative approach to a SOA initiative, where each iteration is bounded either by scope or cost. As a result, SOA projects differ from traditional SD projects in that with a SOA project, time-boxed iterations (that is, those with the time vertex fixed) are impractical, because time-boxing doesn't take into account the effect of the agility requirement on quality. Instead, the time vertex depends upon the agility/quality balance that characterizes Best Effort SOA.
The ZapThink take
Perhaps the most important lesson here is the contrapositive of the above conclusion: "big bang" SOA projects that attempt to roll out broad, enterprisewide SOA initiatives in a non-iterative fashion inevitably sacrifice the agility/quality balance, essentially because the time it would take to adequately test such an implementation would severely curtail the agility of the result. Only when agility is not a requirement at all would such a big bang project be even theoretically feasible -- but it could be argued that every SOA effort does have at least some agility requirement, or else why would you bother with SOA in the first place?
You could even say that an ostensible SOA project with no agility requirement is in truth an integration project, as there would be no need for loosely coupled Services. Unfortunately, we see evidence of such projects all the time: organizations that think they want to implement SOA for one reason or another, but instead purchase an ESB or other integration middleware first and deliver what becomes a large-scale, big bang integration project instead. The end result of this snafu is business stakeholders scratching their heads over the resulting lack of agility and wondering why they didn't get the business value from the initiative they expected and thought they were paying for. Taking an iterative approach to SOA is the most important step in avoiding this unfortunate conclusion.