Business Drivers and Project Goals

Businesses need to take full advantage of existing legacy systems in order to function. They have invested significant sums and long periods of development and upgrades to get the systems to where they are today. At the same time, businesses need to improve their technological capabilities by integrating and refactoring those existing capabilities in ways that were never envisioned when the systems were originally built. Changing environments create constant pressure to do business better and more effectively – often with a smaller staff.
One of the common approaches that businesses take today is to add or improve self-service capabilities. Self-service is a big money saver, since it can help reduce call center staffing. However, systems that are useful in a call center setting are often not adequate for self-service, since users require training to understand how to perform their tasks. Many legacy UIs are green-screen and predate the science of "user-friendly" user interfaces.
Our client is a large company that wanted to test the promise of service-oriented architecture as an approach to solving a business problem. They wanted to improve their online self-service capability. The initial targeted users are internal workers who are specialists in performing a specific job, and who rely on an eclectic set of legacy systems of various generations to perform their tasks.
In many cases, a user has to log on to one system, find an item of information, then log on to another system (using a different userid/password) and use that data item as the key to another lookup. The disadvantages of this approach are obvious: it is slow, it is difficult to learn, and it clearly offers nothing in the realm of self-service.
The goal of this project was to create a "service layer" that provided easy access to a refactored set of services. Those services would all be accessible as Web services. In addition, a user accessing the services across the portal would only be required to authenticate once, and any background authentication would be handled automatically. So the view from the portal is simple, seamless and unified, even though at the back end it is complex, heterogeneous and decidedly not unified.
SOA Architectural Style
IBM's SOA Architectural Style adds a "service layer" between existing systems at the back end and their ultimate clients at the front end. The key enabler of this approach is Web services technology.
The "service layer" clients see a well-defined set of Web services that have been designed and developed to meet business requirements. The services that are on offer are based on (but not necessarily the same as) those supplied by the existing systems. The service layer employs the capabilities of the existing systems in whatever way is most useful. Then it repackages, recombines, groups, serializes and hides details of the legacy services that support it.
The nastiness is still there - and it always will be - but it is neatly tucked away out of sight.
Using Web services in this way allows for the gradual and controlled building up of a set of business-aligned services that have machine-processable service descriptions. Those service descriptions are in the form of WSDL documents. A catalogue of the services that are available in the service layer can be managed in a registry.
When all those pieces are in place, one begins to see a service-oriented architecture.
PanDOORA for Web Services
The details of a service layer are at the discretion of the implementers. However, like any architectural choices, there are better ones as well as some that are not so good. The PanDOORA architectural template recommends an approach that provides advantages over some less-structured but simpler-to-understand designs.
The thinnest possible service layer has only three elements:
- A back-end adaptor that knows how to send/receive messages to/from the existing system.
- A front-end Web service that is accessible by the service consumers.
- A mediation layer that converts between the two messaging and invocation styles.
In the simplest implementation all three elements could actually be coded within the "doPost" method of a single servlet. Unfortunately, doing it that way creates an undesirable coupling between the elements and guarantees that a change to any part ripples through the code.
Any best practice approach provides decoupling between the "front" and "back" ends. That makes it easy for the implementer to replace or modify one part without having to recode another part. So if the underlying legacy system undergoes an upgrade that changes details of the messaging, the front end (and the consumers) don't need to know that or deal with it.
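As a minimal sketch of that decoupling (the interface and class names here are invented for illustration, not taken from the project), the front end can be programmed against a mediator interface, and the mediator against a back-end adaptor interface:

```java
/** Back-end adaptor: knows how to send and receive the legacy messages. */
interface BackEndAdaptor {
    String send(String legacyRequest);
}

/** Mediator: converts between front-end and back-end messaging styles. */
interface Mediator {
    String invoke(String frontEndRequest);
}

/** A mediator that depends only on the adaptor interface, so the
    back-end implementation can be swapped without touching this code. */
class SimpleMediator implements Mediator {
    private final BackEndAdaptor adaptor;

    SimpleMediator(BackEndAdaptor adaptor) {
        this.adaptor = adaptor;
    }

    public String invoke(String frontEndRequest) {
        // Translate the front-end request into the legacy message format...
        String legacyRequest = "LEGACY:" + frontEndRequest;
        // ...call the back end, then translate the legacy response back.
        String legacyResponse = adaptor.send(legacyRequest);
        return legacyResponse.replaceFirst("^LEGACY:", "");
    }
}
```

If the legacy messaging changes, only the BackEndAdaptor implementation is recoded; the mediator and the front end never see the difference.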
In addition to decoupling the front and back ends, the mediator element should be decoupled from both of them so that it can be changed or replaced as required. In fact, some of the mediator's responsibilities might be provided by a separate facility such as an ESB. For example, the actual version and location of the service provider could be determined within the ESB as a runtime decision; in that case, changing a routing table entry would select a different provider. In an example of message-based routing, details of the initial request might result in a different style of response – possibly a choice of currency or language. Maximum flexibility and the opportunity for unlimited value-add require that the layers be completely separated.
PanDOORA is an architectural template. It does not provide code, but rather gives guidance on the design. PanDOORA for Web services helps designers by giving a template for a service layer. This layer has six elements. In a specific implementation, each element would normally be a single Java class, although that is not a restriction, just observed practice.
- The Requestor Adaptor knows how to access the underlying system.
- The Requestor Agent decouples the Business Service from the Requestor Adaptor.
- The Business Service contains all business logic. It is generally a poor practice to put business logic outside the Business Service.
- The Provider Adaptor is the actual Web service. In practice, this may be a SAAJ-style Servlet, or it could be a Java Bean that is extended with an automatically-generated JAX-RPC SEI.
- The Provider Agent decouples the Provider Adaptor from the Business Service.
- The Client Proxy resides in the client and knows how to access the Provider Adaptor Web service. Depending on the situation, the Client Proxy may be supplied by the implementers of the Provider Adaptor, or it may be generated from supplied WSDL.
- Various "Contracts" are Java Interfaces representing Value Objects containing the parameters that are passed from one layer to the next, which results in a further level of decoupling. We discuss the options for implementing the Contracts below.
So the logic flow is this: the portal invokes a method on the Client Proxy, which results in a Web service invocation. The Web service invocation is handled in the service layer by the Provider Adaptor, which constructs a Contract and uses it to parameterize a method on the Provider Agent. The Provider Agent uses a Contract (possibly the same one, as we will see) to parameterize a method on the Business Service. The Business Service creates a Contract that it sends to the Requestor Agent, which in turn follows the same pattern with the Requestor Adaptor. The Requestor Adaptor accesses the actual legacy system and returns the response back up the chain and eventually across the network to the remote client.
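The chain above can be sketched compactly. The element names follow the template; the string-valued "Contract" and the composition mechanism are stand-ins invented for this example, not the project's actual classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Illustrative sketch of the PanDOORA call chain. The element names are
    the template's; the String "Contract" stands in for real Value Objects. */
class ServiceLayerSketch {
    static final List<String> trace = new ArrayList<>();

    /** Wraps the next layer so each element knows only its successor. */
    static UnaryOperator<String> layer(String name, UnaryOperator<String> next) {
        return contract -> {
            trace.add(name);                 // record the order of traversal
            return next.apply(contract);     // pass the Contract down the chain
        };
    }

    static String invoke(String request) {
        // The Requestor Adaptor is where the legacy system is actually accessed.
        UnaryOperator<String> requestorAdaptor = r -> "LEGACY-RESPONSE for " + r;
        UnaryOperator<String> chain =
            layer("ProviderAdaptor",
            layer("ProviderAgent",
            layer("BusinessService",
            layer("RequestorAgent",
            layer("RequestorAdaptor", requestorAdaptor)))));
        return chain.apply(request);
    }
}
```

The point of the sketch is the direction of knowledge: each layer sees only its immediate successor, so any element can be replaced without the others noticing.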
The PanDOORA template describes a very flexible approach that allows a complete and formalized decoupling of the three basic elements. Any of the three can be replaced or updated without knowledge of the other two unless there is a need to offer additional capabilities or handle new parameters.
Service Layer Implementation Considerations
A real implementation needs to consider the following items:
The thread in the servlet that handles the Web service invocation cannot be held while the back-end system is being accessed. Doing that would lock a valuable and limited resource that belongs to the application server. If the back-end system failed to respond for whatever reason – or even if its response time was well within the range of "normal" – the performance of the entire server or cluster could suffer dramatically. A separate thread should be used to access the back-end system.
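One way to do that is sketched below using java.util.concurrent. In a real application server a managed facility (such as a WorkManager) should replace the raw thread pool, and the pool size and timeout here are illustrative values, not recommendations:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Sketch: hand the slow back-end call to a worker thread and bound the wait. */
class BackEndCaller {
    private final ExecutorService workers = Executors.newFixedThreadPool(10);

    String callBackEnd(String request, long timeoutMillis) {
        Future<String> response = workers.submit(
            () -> "RESPONSE for " + request);   // simulates the legacy access
        try {
            // The servlet thread waits at most timeoutMillis instead of
            // being held for as long as a misbehaving back end takes.
            return response.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            response.cancel(true);              // give up; free the worker too
            return "TIMEOUT";
        } catch (Exception e) {
            return "ERROR";
        }
    }
}
```

The essential property is the bounded wait: a dead or slow back end costs one worker thread and a timeout, not a servlet thread held indefinitely.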
There are at least two choices for how the inter-layer Contracts can be constructed. One choice is that each layer has its own ContractInterface and ContractImpl. With that approach, the parameterization between the layers is explicit and well-defined. The advantages are that parameterization errors generally will not compile and that each contract is easy to understand. The disadvantage is that it requires code to create the various styles of contract and to copy parameters from the requesting contract to the providing contract at each layer – once on the way down and once on the way back up. That can be very tedious. It also leads to expensive maintenance in situations where the parameterization is complex and likely to change frequently.
The other option is to use a generic payload object (a HashMap, for instance) and pass the same object down and back up the stack. Objects at each layer receive the payload and read or write it as needed. The advantage of this approach is that it never requires changes to layers that do not need to modify the parameters – those that just pass them to the next layer. The disadvantage is that there can be no compile-time checking of the parameterization fit between requestor and provider, since the payload is generic. If you want something that is not there, or it is there but has the wrong type, you can only find that out through testing. Errors are never apparent until run time, and in some cases the code may seem to work until it goes into production.
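The trade-off can be seen in miniature below. The class names and map keys are invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;

/** The two Contract styles in miniature; class and key names are invented. */
class ContractStyles {

    // Style 1: an explicit, typed contract. A misspelled field reference
    // such as request.accuontId simply will not compile.
    static class AccountRequest {
        final String accountId;
        AccountRequest(String accountId) { this.accountId = accountId; }
    }

    // Style 2: a generic payload passed unchanged down and back up the stack.
    static Map<String, Object> genericRequest(String accountId) {
        Map<String, Object> payload = new HashMap<>();
        payload.put("accountId", accountId);
        return payload;
    }

    static Object lookup(Map<String, Object> payload, String key) {
        // A misspelled key ("acountId") silently yields null here – exactly
        // the class of error that is found only by testing, at run time.
        return payload.get(key);
    }
}
```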
Security is a very large topic for service layer implementers, and the scope is too broad for a short article like this one. So I'll just discuss the most obvious aspect of it – user authentication.
If the service layer is to have great value, it needs to provide a homogeneous authentication approach. A typical goal would be to have authorization decisions somehow externalized so that the service layer could handle that automatically in the context of the universal authentication scheme.
Unfortunately, what we normally see is a much more difficult situation.
Legacy systems were built over a long period of time by people who did not know each other, who did not share common goals, who did not understand enterprise architecture and to whom security was a last-minute add-on. As a result, the various authentication schemes that back-end systems implement are eclectic and thus hard to mold into a seamless whole, which is the goal of the service layer.
Examples of legacy authentication approaches that we found included standard mainframe RACF or ACF, single-system userid/password repository and even one that used a "trusted IP." Trusted IP means that if your IP address is in their table, you can just come on in.
Approaches to getting authorized access to the enterprise system can be divided into two types: transitive and user-specific. In the transitive approach, the portal itself is an authorized user of the enterprise system, which implicitly delegates its authorization chores to the portal carte blanche. A typical implementation is to create a user with full access rights and let the portal authenticate as that user. It is then up to the portal to prevent unauthorized access. Obviously, this is a weak approach and should only be considered when the cost of a compromise is low and the cost of user-specific authentication is high, or when a user-specific approach is simply not feasible.
A user-specific approach can be implemented in several different ways, depending on what the target system requires. Two that we tried are the Replay Technique and the Secondary Table Technique. Neither one of these is very satisfying, either.
In the Replay Technique, the portal steals the session token that is generated by the authentication server (in this instance Netegrity SiteMinder). It passes the token as a parameter in the request to the service layer, and the token eventually filters down to the Requestor Adaptor, which replays it in the request to the enterprise system. This works only if the token's lifetime is long enough and if the target is defended by the same system component that generated the token in the first place.
In the Secondary Table Technique a locally-accessible encrypted table contains the userid/password combinations required to authenticate at the target system. I am not joking. The userid, as obtained from the stolen session token, is used as a primary key into this table which yields the required credentials for access to the enterprise system. Come on in.
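The technique boils down to a keyed lookup. In the real system the table was encrypted and persisted; the in-memory map below is purely illustrative:

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of the Secondary Table Technique. In the real system the table was
    encrypted and persisted; this in-memory map is purely illustrative. */
class SecondaryTable {
    private final Map<String, String[]> credentials = new HashMap<>();

    /** The portal userid (from the session token) is the primary key. */
    void store(String portalUserId, String legacyUserId, String legacyPassword) {
        credentials.put(portalUserId, new String[] { legacyUserId, legacyPassword });
    }

    /** Yields the userid/password pair the Requestor Adaptor presents
        to the enterprise system. Come on in. */
    String[] credentialsFor(String portalUserId) {
        return credentials.get(portalUserId);
    }
}
```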
The knowledgeable reader despairs of a safe and effective approach, and indeed one does exist, as described in WS-Security. The single troublesome detail preventing its widespread acceptance is that it is universally unimplemented in the target systems. Over time, however, we do believe WS-Security will be implemented by many enterprise systems and the situation will gradually improve. It will never be perfect, though, and security hacks like those I describe above will still be required if service layers are to be constructed.
Next to "security," attachments (binary files added to Web service requests or responses) are the most problematic aspect of Web services. There are four approaches, and none of them work very well. You have to make a choice, though, if you want to send a binary file (e.g. an image) across a Web service.
Microsoft has an approach that it refers to as WS-Attachments, giving it the purported flavor of some type of standard – which it is, as long as you don't try to use it for communication between Java and Microsoft. It is a Microsoft-only "standard." WSE implements it, but Microsoft is deprecating it and plans to move to a new approach soon.
Java has an approach called SOAP with Attachments (SwA), and Sun supplies an API for it called SAAJ (SOAP with Attachments API for Java). But Microsoft does not support SwA, even though it is a WS-I approved approach.
XML provides "base64" which is a binary encoding technique for sending binary as text. It works for everybody, since it's standard XML, but it's a problem for large files, since the file size grows by a factor of 50% or so. That's doubly bad, since the XML parser has to wade through all that non-text text looking for the next token. So it's very expensive compared to just sticking it into the HTTP Request as a MIME type and handling it as a chunk of known size.
And finally, it is possible to just send the URL of the file as stored on an FTP site and allow out-of-band retrieval. But that is an authorization nightmare, since any authorized user of the site could download the file, even though the real authorization requirements may be stricter.
For our implementation we chose SAAJ and made the decision not to allow interoperation with Microsoft. That worked for us, but it certainly will not work for everybody.
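In SAAJ terms, attaching a binary part looks roughly like the sketch below. The content type and image bytes are invented; note that a SAAJ implementation must be on the classpath (it shipped with the JDK through Java 8 and is a separate dependency in later versions):

```java
import javax.xml.soap.AttachmentPart;
import javax.xml.soap.MessageFactory;
import javax.xml.soap.SOAPMessage;

/** Minimal SwA sketch: attach a binary part to a SOAP message with SAAJ. */
class SaajAttachmentDemo {
    static int attach(byte[] image) throws Exception {
        SOAPMessage message = MessageFactory.newInstance().createMessage();
        AttachmentPart attachment = message.createAttachmentPart();
        attachment.setRawContentBytes(image, 0, image.length, "image/jpeg");
        message.addAttachmentPart(attachment);
        return message.countAttachments();   // the part travels as a MIME part
    }
}
```

The binary data rides along as a MIME part of the SOAP message, so it is never base64-inflated and never touched by the XML parser.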
As we began our development, we learned that while the client wanted to investigate Rational Software Architect as a toolkit, some of the client developers were more familiar with Axis and were doing all their development that way.
That created a serious conflict that we did not expect until we tried to integrate. The problem is that the two stacks contain many classes with the same name, and the Axis ones don't play well with the IBM ones. So if an object loads SOAPAttachment (for instance), the class loader needs to know which one, since there are choices. To make sure they got what they wanted, the Axis developers set the class-loading property to load org.apache.axis.SOAPAttachment. That created a problem for our team, naturally, since we knew nothing about that and assumed that if we needed SOAPAttachment we would get com.ibm.ws.webservicesengine.SOAPAttachment. With that assumption no longer true, there were some difficult moments while we tried to figure out why integration broke our code so dreadfully.
In the end we managed to get around these problems and we showed that a service layer can indeed be made to work, although only with difficulty at the back end.
The one remaining question has to be about performance. Obviously, with so many layers, and with Web services in the starring role, the service layer will not be fast. Fortunately, as a previous client once observed, "Moore's Law is our friend."
So we can have flexibility now and better performance later. We cannot have great performance now and flexibility later.
That thinking is what got us to where we are today.