Using cloud computing infrastructure -- or Infrastructure as a Service (IaaS) -- allows for the elastic provisioning of memory and compute capacity for SOA services. There are two main benefits to achieving a high degree of service elasticity. First, it allows services to scale up quickly to meet spikes in demand. Second, it allows services to scale back down after a temporary spike has ended, which saves on computing expenses between the peaks.
As part of TheServerSide.com's Java University, Steve Millidge of British middleware consultancy C2B2 discussed service elasticity, how to think about it and how to achieve it. Millidge has over a decade of experience as a professional services consultant, has worked with Java since before version 1.0 and sits on multiple Java Specification Request (JSR) expert groups.
Before describing how enterprise architects can build a service architecture that maps well to a cloud infrastructure, Millidge defined what "elasticity" means in relation to services in the cloud. "Elasticity is rapid, it's automatic, and it offers the ability to scale up and scale down," he said. He explained that by "rapid," he meant minutes rather than days and by "automatic," he meant that it requires no operator intervention.
Millidge said that organizations often prepare to scale up, but they do not adequately prepare to scale back down. Preparation is important, he said, because it "involves removing excess compute capacity and there will be services running on those [virtual] instances. They will need to shut down gracefully, without losing any data and without causing any problems to the system."
Load monitoring is a key feature to build into an elastic service infrastructure, said Millidge. He said it is important to have metrics in place that monitor the number of service requests per second, the services' memory usage and the CPU load from the services. "That information must be fed back into a system, which can then trigger additional compute capacity," said Millidge.
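The three metrics Millidge names -- request rate, memory usage and CPU load -- can be sketched in Java with the standard `java.lang.management` beans. This is an illustrative sketch only; the `ServiceMetrics` class and its method names are assumptions, not part of any particular monitoring product.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative metrics holder; in practice these figures would be fed
// to an external monitoring system that triggers extra capacity.
public class ServiceMetrics {
    private final AtomicLong requestCount = new AtomicLong();
    private volatile long windowStart = System.currentTimeMillis();

    // Call once per incoming service request.
    public void recordRequest() {
        requestCount.incrementAndGet();
    }

    // Requests per second since the last snapshot; resets the window.
    public double requestsPerSecond() {
        long now = System.currentTimeMillis();
        double seconds = Math.max(1, now - windowStart) / 1000.0;
        windowStart = now;
        return requestCount.getAndSet(0) / seconds;
    }

    // Heap usage as a fraction of the maximum heap, or -1 if no max is set.
    public static double heapUsage() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        return max > 0 ? (double) heap.getUsed() / max : -1;
    }

    // System load average over the last minute; -1 where unavailable.
    public static double cpuLoad() {
        return ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
    }
}
```

A scheduled task could poll these three values and publish them to the monitoring system at a fixed interval.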
Knowing when and how to increase capacity is important. You must figure out the right metrics for each service's scalability, Millidge said, "so that you can place the appropriate factors into your application performance monitoring [system] to set the right triggers."
For example, if a service can support 100 concurrent users at its baseline capacity, a spike to 150 concurrent users may break the service's SLA. The application performance monitoring should therefore be configured so that when it detects 120 concurrent users, it adds capacity before the load reaches the level at which the SLA would break.
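A minimal sketch of such a trigger, using the numbers from the example above. The `ScaleTrigger` and `ScaleAction` names are hypothetical; a real implementation would call a cloud provider's API to start new instances rather than a local callback.

```java
// Fires a scale-up request at 120 concurrent users, before the SLA
// breaks at around 150. Illustrative only.
public class ScaleTrigger {
    interface ScaleAction { void scaleUp(); }

    static final int BASELINE_CAPACITY = 100;  // users one instance handles
    static final int SCALE_UP_THRESHOLD = 120; // fire before the SLA breaks

    private final ScaleAction action;
    private boolean scaling = false; // avoid firing repeatedly mid-scale

    public ScaleTrigger(ScaleAction action) { this.action = action; }

    // Called by the monitoring loop with the current concurrent-user count.
    public void check(int concurrentUsers) {
        if (concurrentUsers >= SCALE_UP_THRESHOLD && !scaling) {
            scaling = true;
            action.scaleUp();
        } else if (concurrentUsers < BASELINE_CAPACITY) {
            scaling = false; // load has dropped; re-arm the trigger
        }
    }
}
```

The re-arm branch matters: without it the trigger would fire a new scale-up request on every monitoring tick while the spike lasts.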
Here, a service repository is essential. A service repository, said Millidge, addresses the issue of service discoverability. "If we have a service on one compute host and we fire up two new compute hosts for capacity, any client of that service must be able to find those servers to use the service. That's normally carried out through the repository."
It is important both for new service instances that can take on load to register themselves and for old instances that are no longer necessary to cleanly deregister themselves.
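As one illustration, the register/deregister handshake might look like the in-memory sketch below. A production repository would be a shared, networked service; the `ServiceRepository` class and its method names are assumptions for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// In-memory stand-in for a shared service repository.
public class ServiceRepository {
    private final Map<String, List<String>> endpoints = new ConcurrentHashMap<>();

    // A new instance announces itself before it starts taking load.
    public void register(String serviceName, String endpoint) {
        endpoints.computeIfAbsent(serviceName, k -> new CopyOnWriteArrayList<>())
                 .add(endpoint);
    }

    // A retiring instance removes itself so no new requests are routed to it.
    public void deregister(String serviceName, String endpoint) {
        List<String> list = endpoints.get(serviceName);
        if (list != null) list.remove(endpoint);
    }

    // Clients look up live endpoints instead of hard-coding hosts.
    public List<String> lookup(String serviceName) {
        return endpoints.getOrDefault(serviceName, List.of());
    }
}
```

An instance might pair `register` at startup with a JVM shutdown hook that calls `deregister`, so it drops out of rotation before the process exits -- matching the graceful shutdown Millidge describes.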
According to Millidge, one of the key factors for service elasticity in a cloud model is that "networking is dynamic. Servers come and go. Load balancers must be able to discover and use services dynamically." He warned against the static network configuration developers would use in an on-premises data center with a known number of nodes. Moving to dynamic configuration, however, may be a challenge for some Java development teams.
Millidge offered a few tips for building dynamic cloud services:
- Deploy heterogeneous services. Different services can be deployed on different platforms. For instance, some services may favor a compute-intensive platform while others perform better on a memory-intensive platform. Multiple platform types may be available from a single cloud infrastructure provider.
- Don't code in dependencies. Examine the deployment architecture and middleware infrastructure to ensure that no static assumptions about the network are coded in and that no assumptions are made about where services will be deployed.
- Avoid single-point-of-failure services. Services need to be independent of each other so that if any one service should fail, other services will not be affected.
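The "don't code in dependencies" tip can be illustrated with a small contrast sketch. The `EndpointResolver` class, the hard-coded host name and the environment-variable convention are all hypothetical, chosen only to show the difference between compiled-in and runtime resolution.

```java
// Contrast: a compiled-in endpoint versus one resolved at runtime.
public class EndpointResolver {

    // Static assumption baked into the code -- breaks when the topology changes.
    public static String hardCoded() {
        return "http://inventory-host-01:8080/inventory";
    }

    // Dynamic resolution: read the location from the environment at call
    // time, falling back to a default. A service repository lookup would
    // be the more elastic choice, but the principle is the same: the
    // address is not a compile-time constant.
    public static String resolve(String envVar, String fallback) {
        String url = System.getenv(envVar);
        return (url != null) ? url : fallback;
    }
}
```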