
Challenges and rewards of integrating commercial Web data services

Today's applications can benefit from a wealth of data services supporting SOAP and REST. But enterprise applications, along with their SLAs, must be carefully prepared to successfully employ such Web services. Enterprise mashups are growing as a means to tap into Web data.

Early in the development of Web services and SOA, application developers began to focus on how to integrate various applications and their elements – databases are perhaps the best example. Nowadays, there is a wealth of data services available including products from the likes of Thomson Reuters, Dun & Bradstreet and many others.

"It is an evolving and expanding world that includes giants and many niche services, not only for financial data but things like weather, resources, and technology," says Noel Yuhanna, an analyst at Forrester Research. These "data services" are typically wrapped in XML, combined, and then imported into various applications. To further facilitate growing traffic, a range of niche players have emerged, complemented by IT giants such as Microsoft and IBM.


According to Fawaad Khan, an integration architect in Accenture's global SOA practice, there are multiple challenges when a company wants to harness an outside commercial Web data service (or even a publicly available data source).

"The foremost issue in using commercial Web data services is the effort required to validate the quality and reliability of the data being received upfront," says Khan.

It can be very tempting to use third-party Web services to get, say, a list of items and their prices for an employee purchasing catalog, only to discover data quality issues midway through testing.

Another major risk involved in using commercial Web data services, according to Khan, is the provider's financial and organizational stability as a business entity. "Performing an appropriate amount of due diligence in the beginning on a commercial Web services data provider is a prudent strategy to avoid potential rework later," says Khan.

Internally, Khan says, you should design your enterprise application in such a way that it doesn't lose all its functionality if the commercial Web data service is unavailable for some reason. Consider using caching technologies, where feasible, to minimize external dependencies and potentially increase application performance and responsiveness and be sure your integration architecture provides isolation via well-defined interfaces. "You shouldn't have to change your enterprise application if the transport or the messaging protocol changes—for example from SOAP to RESTful—in your commercial Web services," he says.
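Khan's internal design advice can be sketched in code. The example below is a hypothetical illustration, not Accenture's implementation: a narrow interface isolates the application from the transport (so a SOAP-to-REST change only swaps the injected `fetch` callable), and a short-lived cache lets the application keep serving data when the external service is briefly unavailable. All names (`PriceService`, `get_prices`) are illustrative.

```python
import time

class PriceService:
    """Isolation layer: the application only calls get_prices() and
    never sees the transport (SOAP, REST, ...) used underneath."""

    def __init__(self, fetch, cache_ttl=300):
        self._fetch = fetch          # pluggable transport call (assumption)
        self._cache_ttl = cache_ttl  # seconds to serve stale data on failure
        self._cache = None
        self._cached_at = 0.0

    def get_prices(self):
        try:
            self._cache = self._fetch()
            self._cached_at = time.time()
        except Exception:
            # External service unavailable: fall back to cached data
            # if it is still within the TTL; otherwise re-raise.
            if self._cache is None or time.time() - self._cached_at > self._cache_ttl:
                raise
        return self._cache
```

Because the transport call is injected, replacing a SOAP client with a RESTful one means constructing `PriceService` with a different `fetch`, with no change to callers.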

Looking externally, the first step is usually to analyze the APIs for the Web data services of interest to understand not only their inputs and outputs but also what type of data validations, if any, are required to successfully invoke the Web services; and how to process exceptions in the event that the invocation is not successful.
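That first step, validating inputs per the provider's API documentation and handling failed invocations, might look like the following minimal sketch. The service call is stubbed and all names (`invoke_quote_service`, the `symbol` parameter) are assumptions for illustration.

```python
class DataServiceError(Exception):
    """Raised when the external invocation cannot be completed."""

def validate_request(params):
    # Validations the provider's API documentation says are required
    # (illustrative rules for a hypothetical quote service)
    if not params.get("symbol"):
        raise ValueError("symbol is required")
    if not str(params["symbol"]).isalpha():
        raise ValueError("symbol must be alphabetic")

def invoke_quote_service(params, call):
    """Validate inputs, invoke the service, and map transport
    failures to a single domain-level exception."""
    validate_request(params)
    try:
        return call(params)  # e.g. an HTTP POST under the hood
    except (ConnectionError, TimeoutError) as exc:
        raise DataServiceError(f"quote service failed: {exc}") from exc
```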

Second, says Khan, who has helped many clients design and implement enterprise solutions based on SOA and Web technologies, you have to consider the available messaging styles and formats (such as SOAP, REST, and Ajax) and the transport protocols, and design both your Web services and the technology platform for invoking the external Web data services.

Then, he notes, "You may also have to design and implement the integration layer between your internal Web services and the enterprise application of interest if it doesn't natively support Web services integration."

Khan says invoking commercial Web data services with open, standards-based or de facto standard technologies like HTTP, XML, JSON, Ajax, and so on, is preferable to using proprietary tools and specialized technology or application connectors. This process can be simplified, he says, by employing enterprise service bus (ESB) applications.
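One practical payoff of sticking to open formats: standard-library parsers replace proprietary connectors. The sketch below, with illustrative payload shapes, normalizes a JSON response and an XML response from a hypothetical quote service into the same Python dict using only the standard library.

```python
import json
import xml.etree.ElementTree as ET

def parse_quote(payload, content_type):
    """Normalize an open-format response (JSON or XML) into one
    internal representation; payload shapes are illustrative."""
    if content_type == "application/json":
        doc = json.loads(payload)
        return {"symbol": doc["symbol"], "price": float(doc["price"])}
    if content_type in ("application/xml", "text/xml"):
        root = ET.fromstring(payload)
        return {"symbol": root.findtext("symbol"),
                "price": float(root.findtext("price"))}
    raise ValueError(f"unsupported content type: {content_type}")
```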

Khan views an ESB as an implementation of an architectural pattern to support integration of data between Web service providers and consumers. An ESB can enable cross-enterprise Web data services by supporting various Quality of Service (QoS) requirements, including guaranteed delivery, security, audit/logging, mediation, and transformation, such as going from HTTP externally to JMS internally. In other words, an ESB serves to decouple the consumer from the provider of Web services allowing both to evolve as requirements change.
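The mediation role an ESB plays can be reduced to a toy pipeline, shown below purely as a sketch of the pattern (not a real ESB API): each QoS step, here audit logging and a transformation from an external HTTP-style message to an internal JMS-style one, is applied between consumer and provider, so neither side depends on the other's format.

```python
audit_log = []

def log_step(msg):
    """Audit/logging QoS: record a copy of the message in flight."""
    audit_log.append(dict(msg))
    return msg

def transform_step(msg):
    """Transformation QoS: e.g. external HTTP payload ->
    internal JMS-style message (destination names illustrative)."""
    return {"destination": "queue/quotes", "body": msg["payload"]}

def mediate(msg, steps):
    """Apply each mediation step in order, ESB-pipeline style."""
    for step in steps:
        msg = step(msg)
    return msg
```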

Thus, an ESB can provide a platform for more easily implementing various integration design patterns for Web services. For instance, any of the leading ESB products can perform security, data validation, dynamic routing, and transformation—both at the transport and message level—out of the box. These capabilities, if needed, should be acquired from either an open source community or a commercial vendor but not custom built, he advises.

Khan says thorough performance testing against the response-time service-level agreements (SLAs) provided by your Web services vendor is key. Testing for scalability based on the size of your user base is also important, especially when integrating with mission-critical enterprise systems. From an operations perspective, capturing audit information such as incoming data elements, their values, and response times from the commercial Web services supports more efficient troubleshooting and issue resolution.
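Both pieces of advice, checking invocations against the SLA and capturing audit data, can be combined in a small wrapper. This is a hypothetical sketch: the 200 ms SLA threshold and the audit field names are assumptions for illustration.

```python
import time

SLA_MS = 200  # assumed vendor SLA for response time, in milliseconds

def timed_call(call, params, audit):
    """Invoke the service, record elapsed time against the SLA,
    and capture the inputs for later troubleshooting."""
    start = time.perf_counter()
    response = call(params)
    elapsed_ms = (time.perf_counter() - start) * 1000
    audit.append({
        "params": params,                    # incoming data elements
        "elapsed_ms": round(elapsed_ms, 2),  # measured response time
        "within_sla": elapsed_ms <= SLA_MS,  # SLA compliance flag
    })
    return response
```

In production the audit list would typically be a log sink or database table; a list keeps the sketch self-contained.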

Finally, he adds, "Don't underestimate the complexity of the effort required for implementing security functions such as authentication and authorization, which are particularly challenging with external Web services," particularly if you need to provide granular access to some functions while restricting others.
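The two functions Khan names can be sketched separately to show why they are distinct problems: authentication (proving who the caller is, here via a stdlib HMAC request signature) and authorization (granting some operations while restricting others). The signing scheme and the grants table are illustrative assumptions, not any provider's actual protocol.

```python
import base64
import hashlib
import hmac

def sign_request(secret, method, path, body=""):
    """Authentication: HMAC-sign the request so the provider can
    verify the caller holds the shared secret (scheme illustrative)."""
    message = f"{method}\n{path}\n{body}".encode()
    digest = hmac.new(secret.encode(), message, hashlib.sha256).digest()
    return base64.b64encode(digest).decode()

def authorize(api_key, operation, grants):
    """Authorization: granular access, allowing some functions
    while restricting others for the same authenticated key."""
    return operation in grants.get(api_key, set())
```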

Enterprise Mashups make sense of Web data services

With another view of the connection challenges, Dana Gardner, principal analyst at Interarbor Solutions, says issues mostly boil down to the types of data and the choice of Web standards. However, those issues are further compounded by the emerging challenge of the volume, size, and complexity of the data sets.

"For instance, the flood of XML data coming in is often much greater than would have been handled by internal applications -- so it requires a different approach technically to mine, search, and manage that data," he said. Fundamentally, it can be an issue of scale and scalability.

Gardner says BI and analytic tools are coming into the market that can handle these ever larger data sets. Also emerging are new tools to simplify and automate more of the importation process. As an example, Gardner cites Kapow Technologies, which provides a means for easily acquiring data from the Web interface layer. "This is a technology that allows you to overcome issues such as format and bring data directly into a mining activity or other application," he says.

"Kapow is ETL [extract, transform, and load] for Web data," says Ron Yu, vice president of marketing at Kapow Technologies. "We have our own proprietary browser and JavaScript engine so we can access data in the same way you would see it. We eliminate the misunderstandings the knowledge worker would have with IT," he adds. Yu says Kapow also provides a Windows-based client development environment that supports live loading of HTML and XML data through a viewer, so data can be extracted with a point-and-click system. "You can apply business rules to a transformation capability that can further massage the data prior to actually loading it."

For its part, Composite Software, Inc. provides data virtualization. The company recently announced a joint solution with Kapow to accelerate the integration of Web data in large-scale data virtualization environments. The joint solution, called Composite Application Data Services for Web Content, is powered by the Kapow Web Data Server. Robert Eve, executive vice president at Composite, says that as a data virtualization middleware company, Composite offers an alternative or complement to an ETL-to-data-warehouse style of integration. "We do those steps in one shot with a view and a pull method built on our high performance query capability."

Among other players is Denodo, which provides a data integration and data virtualization software product called the Denodo Platform, as well as support, training, and consulting services. In the Forrester Wave: Information-As-A-Service, Q1 2010 study, Yuhanna and his co-author described Denodo as delivering "…simplified, low-cost, rapid deployment of data services with options to scale to enterprise-class performance, reliability and scalability."

Outside data meets inside apps

In the report, Yuhanna says large, established players such as Microsoft (BizTalk Server, Microsoft SQL Server), IBM (InfoSphere Information Server), Informatica (Informatica Platform for data services), and Red Hat (MetaMatrix Enterprise Data Services Platform, JBoss SOA Platform) offer the broadest products and still dominate the market in terms of revenue. However, he notes, the niche players like Kapow, Composite, and Denodo are growing rapidly by offering easier and more automated ways to connect outside data with internal applications.

Fundamentally, he adds, the whole information-as-a-service phenomenon is part of the larger inter- and intra-enterprise trend to mashup applications and data more freely and more frequently. As such, it will continue to grow in importance.

The trend line is clear: Web data services are growing in importance, and even if you aren't using them now, you probably will be. "I'm convinced we will see more services coming in from the Web specifically because companies that specialize in providing data or services have a distinct competitive advantage," says Mike Karp, a vice president at Ptak, Noel and Associates, a consulting and analyst firm. Indeed, Karp says, in many cases the Web data could potentially be routed to another external provider for analysis prior to delivery to your own internal applications.

"With that kind of arrangement many of the potential technical challenges of working with Web data services can be reduced to a service level agreement – in other words, they become someone else's problem," he says.

Although Web data services can potentially come from anywhere and be delivered to anywhere, Karp adds one final caution: geography still matters. If you need a service or data supplied at high speed, it helps if the provider is close, in order to avoid latency, he says.
