
Data services pain points have become an SOA target for JBoss

At JBoss World Orlando, Craig Muzilla, VP of middleware business at Red Hat Inc., discussed data services, MetaMatrix and SOA governance. He delved into the common data services issues and future goals regarding open source governance.

Where are you seeing data issues creating specific pain points for users and what is the nature of the problem or problems that you're seeing?
Oh, big topic. There are all kinds of issues. One issue is trying to come up with canonical data formats and standards in an organization. I will give you an example. What is the definition of a customer in an organization? There might be twenty systems that have portions of a customer record. In a bank you might have a main system that deals with credit card information. You may have another system that deals with your account balances, checking accounts, savings accounts and then you have the loans database. Every one of those databases may define a customer differently. So if you look at a database schema, one database may say "Cust ID" and another one may say "Customer name," so even how you identify the customers among all the data is different.
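
The customer example above can be sketched in a few lines of Python. This is only an illustration of mapping differently named source columns onto one canonical record, not how MetaMatrix does it; the system names, every column name other than "Cust ID" and "Customer name", and the `Customer` shape are assumptions made up for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Canonical customer record (hypothetical shape for illustration)."""
    customer_id: str
    name: str


# Hypothetical mappings: canonical field -> column name in each source system.
FIELD_MAPS = {
    "credit_cards": {"customer_id": "Cust ID", "name": "Cardholder"},
    "accounts": {"customer_id": "AcctHolderNo", "name": "Customer name"},
    "loans": {"customer_id": "borrower_id", "name": "borrower_name"},
}


def to_canonical(system: str, row: dict) -> Customer:
    """Translate one source row into the canonical Customer form."""
    mapping = FIELD_MAPS[system]
    return Customer(
        # Normalize types and whitespace so records from different systems compare equal.
        customer_id=str(row[mapping["customer_id"]]).strip(),
        name=str(row[mapping["name"]]).strip().title(),
    )


card_row = {"Cust ID": 1042, "Cardholder": "JANE DOE"}
loan_row = {"borrower_id": "1042", "borrower_name": "jane doe "}

# Both systems now describe the same customer in the same way.
same_customer = to_canonical("credit_cards", card_row) == to_canonical("loans", loan_row)
```

Keeping the mappings in data rather than in code means a renamed column in one system changes one dictionary entry instead of every consumer of the customer record.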

One of the major problems that MetaMatrix helps solve is helping companies model what the data should look like and how it can be more easily consumed in an SOA environment. It bridges the gap between relational data and how it may need to be represented in XML or as a Web service. One of the things that we see is the need to create more flexibility within your application infrastructures. It's about flexibility. Generally when you create an application, whether you use a JBoss application server or not, you hardcode between the application server and the data source. So there's a database underneath, it could be Oracle or it could be DB2 from IBM, but you basically couple the logic to the database in many respects. One of the data issues out there now is how to get around that problem and get more flexibility, so that if a database changes it doesn't break my application, it doesn't break my service. That means separating the application logic from the data source, which allows you to swap out databases and swap out your data storage without breaking applications. Those are some of the issues we are seeing in data services.

How common do you think the issues are?
It is extremely common, those issues in particular. That is a common architectural problem, especially within service-oriented environments, because the whole idea of service-oriented architecture is that I don't need to know in advance how my service is constructed, it just provides me with something.

Is that why you are gearing towards open source, so that there are fewer problems with connectivity?
Well, open source is not necessarily the architectural construct; it's more of the delivery construct, which is how I'm making this software available. So, how am I creating it? It is done in an open source manner so I am not tied. I can see the code, I can manipulate the code, I can contribute and I'm not locked in to any particular vendor if at any time I need to move and do something different. It is more of a business construct and licensing construct than it is any sort of technology construct.

Are the architecture issues that you're seeing an epidemic of sorts?
I think everyone has encountered some of these issues for decades. The move towards SOA has exacerbated this. So, if you don't have a good understanding of your data you're not going to be able to move to a service-oriented architecture as well. You're not going to have canonical formats of a general service for accessing customer information if the customer information is foreign to the consumer of that service and it will do you no good. So trying to get rationalization of the data prior to implementing SOA helps a lot.

I'll give you some practical use cases. Some of the things people are trying to do might be creating reporting applications, or if they're doing business intelligence, they're trying to grab data from two or three different sources. They need to create a common format and use that same data without having to replicate it or create new databases. Data services and MetaMatrix help you do that. They provide the foundational layer that helps you get the data.

In your experience, who is responsible for fixing the issues?
Well, on the data side it is an interesting question, because when you're talking about ESBs and Java-based application servers, that is sort of the senior architects for the entire company and the development group. When you talk about data, it straddles. Certainly the more sophisticated companies with an architecture group are concerned with some of these data issues. But often there is a separate group called a data management group. They're responsible for warehouses and business intelligence tools; it's vague, and it is a whole data area. Even the database administrators that manage the databases in a company are different from the developers. So data is one area where it really straddles, and it is unclear where the true responsibility is.

In your opinion, who should it be?
Well, the senior vice president of architecture generally, from an architecture standpoint, has to be concerned with how you create applications and how you integrate them, as well as with the best method for making your data available. How should it be constructed? How many databases do you need? What should they be? That person should be ultimately responsible.

Just the one person and not a team of developers?
Well, certainly everyone contributes, but if you look at one area where you have an architectural basis, it is generally that team that starts to straddle both the application development world and the database and data management world.

Where does ETL software fall short of addressing these problems?
ETL is a technology for moving data: taking data out of one database, moving it to another database and performing transformation to take it from form A and make it look like form B. It does not deal with abstraction, so this real-time use of data from multiple databases, it doesn't do any of that in real time. It's really a batch process for loading a warehouse. Data services technology like MetaMatrix gives you the data in real time. It's not moving data to create a warehouse; it actually helps you avoid creating a lot of additional data marts and replicating data, because you can go to the original sources and service-enable your data from those original sources. If I have different applications and different application needs, ETL will help you make a new database twenty times, but now you have twenty different databases to manage with all that additional hardware and all that overhead.
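
To make the contrast with batch ETL concrete, here is a minimal Python sketch of reading from the original sources at request time instead of copying them into a new database. It is not MetaMatrix's API; the two in-memory SQLite databases, the table and column names, and the `customer_summary` function are all hypothetical stand-ins.

```python
import sqlite3

# Two independent in-memory databases stand in for separate source systems.
accounts_db = sqlite3.connect(":memory:")
accounts_db.execute("CREATE TABLE accounts (cust_id TEXT, balance REAL)")
accounts_db.execute("INSERT INTO accounts VALUES ('1042', 250.0)")

loans_db = sqlite3.connect(":memory:")
loans_db.execute("CREATE TABLE loans (borrower_id TEXT, principal REAL)")
loans_db.execute("INSERT INTO loans VALUES ('1042', 9000.0)")


def customer_summary(customer_id: str) -> dict:
    """Build a combined view by querying each source when the caller asks.

    Nothing is copied into a third database, so there is no replica to keep in sync.
    """
    balance = accounts_db.execute(
        "SELECT COALESCE(SUM(balance), 0) FROM accounts WHERE cust_id = ?",
        (customer_id,),
    ).fetchone()[0]
    principal = loans_db.execute(
        "SELECT COALESCE(SUM(principal), 0) FROM loans WHERE borrower_id = ?",
        (customer_id,),
    ).fetchone()[0]
    return {
        "customer_id": customer_id,
        "balance": balance,
        "loan_principal": principal,
    }


before = customer_summary("1042")
# A change in the source system is visible on the next request, with no batch reload.
accounts_db.execute("UPDATE accounts SET balance = 300.0 WHERE cust_id = '1042'")
after = customer_summary("1042")
```

The trade-off in a sketch like this is that every request touches the source systems, whereas an ETL-loaded warehouse answers from its own copy at the cost of staleness and extra databases to manage.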

What data services technology does is allow you to forget about copying the data over, put an abstraction layer over your initial data and make it look the way it needs to look for the application that uses that data, without ever having to create another copy. It is just the one data source.

Is there a timeline for when the MetaMatrix code is going to be published?
We made an announcement today. It is sort of part of that. There are three major components to MetaMatrix. There is a runtime, which helps you grab the data, helps perform transformations and helps run distributed queries. There is an IDE or a development tool that helps you model the data so you can perform these transformations at runtime. And the third piece is a metadata management system. The metadata management system is a place where you store your models and you store your data. What does the source data look like? The DNA project that we're talking about is the metadata management system. With MetaMatrix it is really only used for data models. What we are doing is taking that metadata management system and using it as the repository for everything to do with SOA. There is an incredible foundation of technology in that metadata management system, enough to use it well beyond what MetaMatrix technology was using it for. So the first piece that goes open source is the metadata management system. We'll be making announcements pretty shortly about the rest of the roadmap for when things go open source, but within the next 10-12 months those things will be open source as well.

How long do you think it will take businesses to understand the need for open source governance, given that the idea of governance has been around for some time now?
In the next couple of months we're going to have other people contribute to a definition.

Other vendors?
Yes, we've already talked to a number of vendors, including our partners, and everybody is interested. We're going to start a dialogue with other vendors and end users to define what SOA governance should be. What is the mission? And then from that a number of software initiatives will follow. That will probably take a couple of months to get everybody on board. We're moving forward with the DNA; that is going out. That will be a piece of how you define governance, and then after that there will be projects spun off. We need a registry project and a policy management project, and I don't know if it will be five projects or ten because it will be based on the definition.

So this will be the focus for how long?
Probably the next year or longer. We may have products that we introduce based on these projects, and as new projects come in we may expand the product. I think this will be a 2-3 year endeavor, but with real tangible pieces coming out of it, like the repository and probably the registry, and expanding beyond that.
