There's a parallel concept when it comes to performance testing: Never use old testing methodologies on new technologies. If you do...well, you end up with a mess.
I've spent my entire career helping organizations figure out how to deploy new technologies. One of the key deployment hurdles is performance: how do you test and tune for maximum throughput?
The early days of advanced networking provide a great example of the performance measurement quandary. Just when you thought you'd figured out how to evaluate network performance, something new came along that rendered your testing assumptions irrelevant. First there was the profusion of protocols (SNA, LAT, NetBIOS, IPX/SPX, AppleTalk, DECnet, XNS, and so on), and testing depended on the actual network medium: RS-232, Ethernet, Token Ring, and others. As networking technology matured, we were able to focus only on the TCP/IP suite and CSMA/CD networking technologies. And life was good. TCP/IP was tuned with things like slow start and acknowledgement windows. But then things went downhill quickly with UDP publish/subscribe before getting better with IP Multicast. And so on.
I'm seeing the same pattern emerge as I work with companies deploying Web services and service-oriented architectures -- only this time, companies are trying to benchmark application-level performance across the Web service management components without application context. They're trying to benchmark them as if they were routers. And that just won't do.
Out of habit, network administrators are pulling out packet testers and blasting traffic through the infrastructure. Their goal: to determine maximum message throughput and the maximum number of clients that can hit the infrastructure before it fails.
But it's a new world -- a more peaceful world, where developers pull services together to make new applications. And while capacity is important, developers are more concerned with the performance of their application, because performance is a better indicator of how their applications will behave when the infrastructure is under a load.
A simple way to think about this is to look at the implications of packets vs. messages. On a network, multiple small messages might be combined into a single packet for efficiency -- at the expense of latency. On the other hand, a single large message might be broken into multiple small packets, making network communications more efficient but placing a processing burden on the application layers to reassemble the message.
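As a rough illustration of the packets-versus-messages point, the sketch below counts how many packets a given message load turns into. The 1,460-byte payload figure is an assumption (a typical TCP segment payload on Ethernet), and the function names are hypothetical, chosen only for this example.

```python
import math

# Assumed payload per packet: a typical TCP segment on Ethernet
# (1500-byte MTU minus 20 bytes of IP header and 20 bytes of TCP header).
PAYLOAD_PER_PACKET = 1460


def packets_for_message(message_bytes: int, payload: int = PAYLOAD_PER_PACKET) -> int:
    """Packets needed to carry one message that is sent on its own."""
    if message_bytes <= 0:
        return 0
    return math.ceil(message_bytes / payload)


def packets_for_batch(message_sizes, payload: int = PAYLOAD_PER_PACKET) -> int:
    """Packets needed when messages are coalesced back-to-back into one stream."""
    total = sum(message_sizes)
    return math.ceil(total / payload) if total > 0 else 0


if __name__ == "__main__":
    small_messages = [200] * 7   # seven 200-byte messages
    large_message = 64_000       # one 64,000-byte message

    print(packets_for_batch(small_messages))    # 1  (1,400 bytes fit in one packet)
    print(packets_for_message(large_message))   # 44 (must be reassembled by the receiver)
```

A packet tester sees one packet in the first case and 44 in the second; the application sees seven messages and one message. The two counts move in opposite directions, which is why packet rates say little about message performance.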
A simple case in point was a recent performance evaluation where the solution worked better with a SOAP intermediary in place than without. While perhaps unusual, it's perfectly understandable because the back-end application hadn't been tuned for performance, and the buffering that the intermediary provided prevented dropped messages at the server and resends by the client.
Since we're not dealing with packets, we can't test as if we are. We must look at messaging performance. Some good questions to ask are:
- How do I determine maximum message throughput?
- What size messages do I expect?
- What are the response characteristics of the services? How does the timing on the response flow affect the performance of the Web service management infrastructure components?
- What are the performance characteristics with the messages and flow-timings that I expect my network to exhibit?
- What processing is the Web services management broker going to do to the messages?
- How will broker processing affect the response time of my applications?
- How will the application recover in the event of an infrastructure failure?
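To make the first few questions concrete, here is a minimal sketch of a message-level throughput measurement. It is not a real load tool: `send` is a hypothetical stand-in for whatever client call pushes one complete message through the infrastructure and waits for the response, and `fake_service` and the message mix are placeholders you would replace with your own expected sizes and response timings.

```python
import time
from typing import Callable, Iterable


def measure_throughput(send: Callable[[bytes], object],
                       messages: Iterable[bytes],
                       clock: Callable[[], float] = time.perf_counter) -> dict:
    """Send each message in turn and report message- and byte-level throughput."""
    count = 0
    total_bytes = 0
    start = clock()
    for message in messages:
        send(message)              # one complete request/response exchange
        count += 1
        total_bytes += len(message)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return {
        "messages": count,
        "bytes": total_bytes,
        "seconds": elapsed,
        "messages_per_sec": count / elapsed,
        "bytes_per_sec": total_bytes / elapsed,
    }


def fake_service(message: bytes) -> bytes:
    """Placeholder back end: sleeps briefly to imitate service response time."""
    time.sleep(0.001)
    return b"<ok/>"


if __name__ == "__main__":
    # Hypothetical mix: mostly small messages with a few large ones.
    mix = [b"x" * 512] * 20 + [b"x" * 65536] * 5
    print(measure_throughput(fake_service, mix))
```

The point of the design is that the unit of work is a whole message with a realistic response time behind it, not a packet echoed back at line rate.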
At a basic level there are agents and service brokers. Some vendors use different terminology, but for the sake of simplicity, think of an agent as something that runs on the application server at an end-node, that is, at a service provider or consumer. A service broker runs as an independent infrastructure component, and is sometimes called a service proxy or SOAP intermediary.
A key performance factor is how the agent runs and where it's loaded. Does it run as a servlet filter or HTTP module? Or does it run embedded in the SOAP stack? Why is this important with regard to performance?
- A servlet filter or HTTP module adds an extra processing hop compared with an agent that is embedded in the SOAP stack.
- A servlet filter or HTTP module cannot share information with the container as a whole. Therefore, if the container adds value (for example, by performing security processing), the servlet filter or HTTP module must reprocess the security credentials on the message in order to make credential policy decisions or to integrate with WS-Security capabilities.
- A servlet filter or HTTP module is generally written for a platform (Java or IIS), not for a specific product (BEA WebLogic Server, Microsoft .NET, or IBM WebSphere). Therefore, it can't be tuned for performance on the specific product the way an agent that runs in the SOAP stack can.
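The duplicated security work described in the second point can be pictured with a toy model. This is only a sketch under simplifying assumptions: `parse_credentials` stands in for whatever expensive security processing a real product performs, and both pipeline functions are hypothetical, not any vendor's implementation.

```python
def parse_credentials(message: dict, stats: dict) -> str:
    """Stand-in for expensive security processing, such as verifying a token."""
    stats["credential_parses"] += 1
    return message["credentials"]


def via_servlet_filter(message: dict) -> dict:
    """Filter in front of the container: it cannot see the container's parsed state."""
    stats = {"credential_parses": 0}
    parse_credentials(message, stats)   # filter parses to make its policy decision
    parse_credentials(message, stats)   # container parses again for its own use
    return stats


def via_embedded_agent(message: dict) -> dict:
    """Agent inside the SOAP stack: it reuses what the container already parsed."""
    stats = {"credential_parses": 0}
    context = {"credentials": parse_credentials(message, stats)}  # parsed once
    _ = context["credentials"]          # agent reads the shared result
    return stats


if __name__ == "__main__":
    msg = {"credentials": "user-token", "body": "<order/>"}
    print(via_servlet_filter(msg))   # {'credential_parses': 2}
    print(via_embedded_agent(msg))   # {'credential_parses': 1}
```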
In fact, this was demonstrated at a recent product bake-off, when a confused customer called me. He thought his tests were wrong, and wanted my help. He was seeing almost an order of magnitude better performance (as measured by latency inserted into the message flow) with Vendor A's service broker than with Vendor B's agent. And what really confused him was that the agent was running on the application server with the service, whereas the service broker was a hop away in the network.
I explained this in the context of the architectural points above, and it made sense. He wasn't really testing an agent -- Vendor B didn't have a product tuned as an agent (though they certainly called it one). They ran their proxy as a servlet filter on the application server with the services and called it an agent. So, they had all the negative performance characteristics of a servlet filter and the bloated (though necessary in the right place) functionality of a service broker. No wonder it was almost ten times slower!
An agent should be tuned to have minimal impact on message processing. Minimal impact is measured in microseconds of latency.
And when it comes to the service broker, where latency is measured in milliseconds, performance under real load is a function of transformation efficiency, an asynchronous architecture, and thread management for message flows.
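One way to express "latency inserted into the message flow" is to time the same requests with and without the agent or broker in the path and compare the two distributions. The sketch below assumes you already have both sets of timings in milliseconds; the function names, the nearest-rank percentile method, and the sample numbers are all choices made for this example.

```python
import math
import statistics


def percentile(samples, pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def inserted_latency(direct_ms, via_intermediary_ms) -> dict:
    """Latency added by the intermediary, at the median and at the 95th percentile."""
    return {
        "median_ms": statistics.median(via_intermediary_ms) - statistics.median(direct_ms),
        "p95_ms": percentile(via_intermediary_ms, 95) - percentile(direct_ms, 95),
    }


if __name__ == "__main__":
    # Hypothetical timings, in milliseconds, for the same five requests.
    direct = [10, 11, 12, 13, 14]
    via_broker = [12, 13, 14, 15, 30]
    print(inserted_latency(direct, via_broker))   # {'median_ms': 2, 'p95_ms': 16}
```

Looking at a high percentile as well as the median matters because an intermediary that is fine on average can still add long delays to the slowest requests.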
So how do you test Web Services Management Platform performance?
- Test the impact of clustering on performance. A way to scale performance, in theory, is to cluster servers. Test how performance scales in a cluster. Understand the limit of servers in a cluster, and the performance of the clustering software itself.
- Test latency. This is the most important measure of performance, because you can always cluster servers, but you can't always eliminate latency with better hardware. See for yourself, and watch CPU utilization when running latency tests.
- Test throughput with your real back-end application characteristics. Arbitrary testing with packet echo testers doesn't give a true sense of the message flow through a device. Why? Again, messages are different from packets. Packet switching is the domain of routers and it's all about throughput. Message flow management happens up at the application level and is therefore affected by the characteristics of the flow within the application. That said, look for a platform that has tuning parameters to compensate for things like message size, threading alternatives, and poor programming practices.
- Look for a company whose products can share security credential information with the application server container. Security message processing cannot be done partially, so if it has to be done twice, it's a real kick in the pants.
- Look for a product suite that has been optimized for specific platforms. A company that provides a generic "Java" agent is not going to match the tuning of a company that has platform-optimized agents for BEA WebLogic Server, IBM WebSphere, and Microsoft .NET.
- Look for a platform that has different capabilities in the agent and the broker. If they're the same piece of software, just configured differently, you'll have a bloated agent. And, as we saw above, running as a servlet filter with bloated functionality has a significant performance impact.
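For the clustering test mentioned earlier, a simple way to read the results is to compare measured throughput against perfect linear scaling. This is a minimal sketch; the function name and the throughput figures are hypothetical, and a real test would use your own measurements taken under your expected message mix.

```python
def scaling_efficiency(throughput_by_nodes: dict) -> dict:
    """Ratio of measured throughput to perfect linear scaling from one node.

    1.0 means the cluster scales linearly; lower values show how much
    capacity is lost to clustering overhead as servers are added.
    """
    if 1 not in throughput_by_nodes:
        raise ValueError("need a single-node measurement as the baseline")
    base = throughput_by_nodes[1]
    return {
        nodes: throughput / (nodes * base)
        for nodes, throughput in sorted(throughput_by_nodes.items())
    }


if __name__ == "__main__":
    # Hypothetical measurements: messages per second at each cluster size.
    measured = {1: 1000.0, 2: 1800.0, 4: 3000.0}
    for nodes, efficiency in scaling_efficiency(measured).items():
        print(f"{nodes} node(s): {efficiency:.0%} of linear scaling")
```

If efficiency falls off quickly as servers are added, the clustering software itself is the limit, and adding hardware will not buy back the lost capacity.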
David Bressler is principal architect for Actional Corporation (aka Vendor A). He has spent much of his adult life helping companies figure out advanced networking, middleware, and now Web services technologies. When he's not doing that, he enjoys the martial arts and technical diving (but not at the same time). His favorite wine is Zinfandel. Contact him at firstname.lastname@example.org.