How can I address Web services scalability issues?
When discussing Web service scalability, we need to consider two aspects of the problem:
1. The scalability of the distributed system consisting of a collection of services
2. The scalability of the individual services
The architecture of the distributed system will have a significant impact on its scalability. No matter how fast you make the individual components, the system will not scale if too many messages are flowing through the network. The system architecture should focus on limiting both the number of messages required to complete a request and the size of the messages themselves. Remember, the granularity of your Web services will affect the number of roundtrips (for more on this, see my answer to a question from last week, "How much of an application should I expose as a Web service?").
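To make the granularity point concrete, here is a rough back-of-the-envelope sketch. The function name and all the numbers in it (a 500-byte envelope per message, a 50 ms roundtrip, 1,000 bytes per millisecond of bandwidth, strictly sequential calls) are illustrative assumptions, not measurements from any real system.

```python
def estimate_cost(messages, payload_bytes_each,
                  envelope_bytes=500, rtt_ms=50.0, bytes_per_ms=1000.0):
    """Estimate network cost of completing one request.

    Assumes the messages are sent one after another, and that each
    message carries a fixed-size envelope in addition to its payload.
    All default values are hypothetical.
    """
    total_bytes = messages * (envelope_bytes + payload_bytes_each)
    latency_ms = messages * rtt_ms + total_bytes / bytes_per_ms
    return total_bytes, latency_ms


if __name__ == "__main__":
    # Fine-grained interface: ten small calls to assemble one result.
    print("fine-grained:  ", estimate_cost(messages=10, payload_bytes_each=200))
    # Coarse-grained interface: one call that returns the same data.
    print("coarse-grained:", estimate_cost(messages=1, payload_bytes_each=2000))
```

Under these assumptions the same 2,000 bytes of payload cost roughly ten times more elapsed time when split across ten calls, because the per-message roundtrip and envelope overhead dominate.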
Once a well-considered system architecture has been developed, attention turns to the individual Web services. Increasing their scalability generally begins by making them as efficient as possible. This might be enough in some cases, but eventually the transaction load will simply be more than a single "instance" of the service can handle. To remedy this, you'll need some type of load balancing solution to distribute the transaction load among multiple copies of the service. In many cases the infrastructure in which the service is implemented (such as transaction monitors and app servers) supports load balancing directly. However, nearly all infrastructures are much better at load balancing stateless services than services that maintain a dialog with the requester, because each request in a dialog must reach an instance that holds (or can retrieve) the state of that dialog. If your system requires such services, you should carefully consider the load balancing capabilities of your infrastructure and the resources required to maintain the dialogs.
Of course, in the Web services world you also have the option of load balancing across functionally equivalent services implemented in entirely different technologies and environments. In this case you have to consider what load balancing mechanisms are available for distributing requests to the various service implementations. For instance, you might route requests from your platinum customers to your highest-performance servers.
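The ideas above can be sketched in a few lines. This is only a minimal illustration, assuming a simple round-robin policy; the class name, the "platinum" tier label, and the server names are hypothetical, and a real deployment would normally rely on the load balancing built into the infrastructure rather than hand-written code.

```python
import itertools


class Balancer:
    """Toy request router (illustrative only).

    Stateless requests rotate round-robin through a pool of instances.
    Requests that belong to a dialog are pinned to the instance that
    served the first request of that dialog.
    """

    def __init__(self, standard, premium):
        self._standard = itertools.cycle(standard)
        self._premium = itertools.cycle(premium)
        # Dialog table: this is the extra resource a stateful service
        # costs you, and it grows with the number of open dialogs.
        self._sessions = {}

    def route(self, customer_tier, session_id=None):
        # An existing dialog must go back to the same instance.
        if session_id is not None and session_id in self._sessions:
            return self._sessions[session_id]
        # Platinum customers get the high-performance pool.
        pool = self._premium if customer_tier == "platinum" else self._standard
        target = next(pool)
        if session_id is not None:
            self._sessions[session_id] = target
        return target


if __name__ == "__main__":
    lb = Balancer(standard=["std-1", "std-2"], premium=["fast-1"])
    print(lb.route("standard"))             # stateless, round-robin
    print(lb.route("platinum"))             # routed by customer tier
    print(lb.route("standard", "dlg-42"))   # first request of a dialog
    print(lb.route("standard", "dlg-42"))   # pinned to the same instance
```

Note that the stateless path needs no bookkeeping at all, while the dialog path needs a table that every balancer node must be able to consult, which is one reason stateless services are easier to balance.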
A high-quality architecture is a good start, but we've found that most distributed systems never behave quite as expected. Therefore, it's very important to perform scalability testing on your system as it is being constructed and to measure the actual performance of the system once it's completed. Here's a recommendation: don't collect too much detailed information at the outset. Rather, collect basic performance information only at the boundaries of the major services. This will give you a sufficient overview of the system's behavior and an idea of which services or architectural elements are likely to cause problems. Then you can drill down into specific problem areas ("hot spots"), adjust the service implementations and architecture, and iterate toward your scalability targets.
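As one possible way to collect basic numbers at service boundaries, the sketch below wraps each service entry point in a timing decorator and then ranks services by total time spent. The names `measured`, `STATS`, and `hot_spots`, and the in-memory dictionary used for storage, are assumptions made for illustration; a production system would more likely use its platform's monitoring facilities.

```python
import time
from collections import defaultdict
from functools import wraps

# Per-service counters, keyed by a name chosen at the service boundary.
STATS = defaultdict(lambda: {"calls": 0, "total_s": 0.0, "max_s": 0.0})


def measured(service_name):
    """Record call count, total time, and worst-case time for a service."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                # Runs even if the call raises, so failures are timed too.
                elapsed = time.perf_counter() - start
                entry = STATS[service_name]
                entry["calls"] += 1
                entry["total_s"] += elapsed
                entry["max_s"] = max(entry["max_s"], elapsed)
        return wrapper
    return decorator


def hot_spots(stats, top=3):
    """Return the service names with the largest total time, worst first."""
    return sorted(stats, key=lambda name: stats[name]["total_s"],
                  reverse=True)[:top]


if __name__ == "__main__":
    @measured("quote-service")   # hypothetical service name
    def get_quote(symbol):
        time.sleep(0.01)         # stand-in for real work
        return {"symbol": symbol, "price": 0.0}

    for _ in range(3):
        get_quote("XYZ")
    print(hot_spots(STATS), dict(STATS))
```

Only the outermost entry points are instrumented here; finer-grained timing can be added later, and only inside the services that `hot_spots` flags.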