Modernizing EA models for big data

Should data be governed independently of other data processes? Expert Tom Nolle discusses how data-centric EA models might be necessary.

One of the fundamental changes that has emerged in IT is the move, driven both by the cloud and by big data, to separate data into an IT element that effectively stands alone. The question is how far to take this: Does the notion of data independent of process mean that big data technology should be governed separately from the rest of IT? For enterprise architects, answering that question starts with BPM trends -- addressing the way that big data is changing the business-to-IT relationship, evaluating a data-centric versus process-centric EA model approach and, finally, working out how to transform to a data-centric EA approach, if it seems necessary.

Business process management is, in many ways, an end product of a good enterprise architecture strategy. The goal of BPM is aligning applications -- and, indirectly, IT infrastructure in general -- with business goals. Historically, BPM has been exactly what its name suggests -- a process-centric vision of IT. Now, it seems likely to become a more data-centric vision, and that shift is perhaps the most important reason why EA models need to take another look at data centricity as an approach.

The primary driver of change in EA practices is increased demand for business agility, which in practice means IT support of business agility. The notion of workers aligning within a static set of IT processes suits the traditional way of planning and implementing applications, but as application planning becomes focused on point-of-activity empowerment of mobile workers, the neat structure of processes is harder to find. IT does what's needed at the moment, and the worker guides it rather than being guided by it.

Many believe that a process-centric vision of IT is limited because what's really coupling business activities from end to end is data. Modernists believe that, rather than modeling business processes, architects should model entities. Entity modeling is, in most cases, shorthand for entity-relationship (ER) modeling, and it starts with data elements and traces the way they are created and used. This approach places processes in a data context and aligns business processes based on the data elements that they share. ER modeling could be a subordinate part of BPM and EA or a new way of looking at both.
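The idea of tracing how data elements are created and used, and then aligning processes by the elements they share, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the process names, the entity names and the helper functions `entity_usage` and `shared_entities` are all hypothetical, and a real ER model would also capture attributes and relationship cardinality.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample data: each process lists the entities it creates and uses.
PROCESSES = {
    "order_entry": {"creates": {"Order"}, "uses": {"Customer", "Product"}},
    "billing": {"creates": {"Invoice"}, "uses": {"Order", "Customer"}},
    "shipping": {"creates": {"Shipment"}, "uses": {"Order"}},
    "marketing": {"creates": {"Campaign"}, "uses": {"Customer"}},
}


def entity_usage(processes):
    """Index processes by entity: which ones create it and which ones use it."""
    index = defaultdict(lambda: {"created_by": set(), "used_by": set()})
    for name, spec in processes.items():
        for entity in spec["creates"]:
            index[entity]["created_by"].add(name)
        for entity in spec["uses"]:
            index[entity]["used_by"].add(name)
    return dict(index)


def shared_entities(processes):
    """Map each pair of processes to the entities that both of them touch."""
    touched = {name: spec["creates"] | spec["uses"] for name, spec in processes.items()}
    links = {}
    for a, b in combinations(sorted(touched), 2):
        common = touched[a] & touched[b]
        if common:
            links[(a, b)] = common
    return links


if __name__ == "__main__":
    for pair, entities in shared_entities(PROCESSES).items():
        print(pair, sorted(entities))
```

In this toy data set, billing and shipping are linked only because both touch the Order entity, which is the kind of data-driven coupling a process diagram alone may not show.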

Big data -- the virtual or physical combination of enterprise data into a single repository that contains everything in both business scope and time -- is a strong driver for an entity-modeled vision of EA. ER modeling, based on diagrammatic depictions of process linkages via data derivations, could then become a logical alternative to process-based EA models. However, data-centric EA isn't for everyone, and where it doesn't work, it could truly create a mess.

A good opening question for the EA-ER combination is: How big is your data? Enterprises are split in how they visualize their databases. In some cases, they see data as being married to an application or family of applications. This view creates a big data model that's more abstract than real, because individual databases retain their identity. Others have started to think in repository terms, separating databases from applications and considering data as a whole. If your company is in the second camp or is clearly evolving to that position, then you are a candidate for an independent ER model and even for ER-driven EA.

Another aspect of size is volume and scope. Companies that make operational decisions based on data analytics across a wide range of business applications and that use both current and historical data in their analysis will benefit from ER planning and find it easier to justify ER centricity in their EA modeling and plan development.

If any of the "bigness" factors apply to your business, then ER is something you'll almost certainly need to apply in your EA practices. If all apply, then it might make sense to make your EA ER-centric. In the former case, there are ample tips on adding ER to EA/BPM. If you fall into the latter case, then the best path forward will depend on how much you've committed to an EA model already, how much you like the model and how much change in applications and data usage you expect down the line.
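As a rough illustration, the any-versus-all rule above could be written as a small helper. The three flags are one reading of the "bigness" factors discussed earlier (a repository view of data, analytics across many applications, and use of both current and historical data). The function name, the parameter names and the fallback result when no factor applies are assumptions made for this sketch, not part of any EA framework.

```python
def ea_recommendation(repository_view, cross_application_analytics, uses_historical_data):
    """Return a rough suggestion based on how many 'bigness' factors apply.

    All three parameters are hypothetical booleans chosen for this sketch.
    """
    factors = [repository_view, cross_application_analytics, uses_historical_data]
    if all(factors):
        # Every factor applies: ER-centric EA may be worth considering.
        return "consider ER-centric EA"
    if any(factors):
        # At least one factor applies: add ER to existing EA/BPM practices.
        return "add ER to existing EA/BPM"
    # Assumed fallback: no factor applies, so keep the process-driven model.
    return "stay with process-driven EA"


if __name__ == "__main__":
    print(ea_recommendation(True, False, False))
```

The result is only a starting point; as the text notes, the path forward also depends on how committed you are to your current EA model and how much change you expect.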

The best approach overall, for any of these groups, is probably to start by exploring just what can be done with ER modeling and practice management. Lucidchart offers a free online tool to test the waters, and StarUML has a free ER extension. TerraER is an open source tool for learning ER techniques, and of course, there are commercial products available as well, such as Idera. Try to pick an example big data application that involves a typical but manageable number of elements to start in order to gain some insight into how well ER will work for you.

If ER doesn't seem to add much insight to your business process analysis, then you should consider (or stay with) EA practices and tools that are traditionally process-driven. As noted above, you can still incorporate ER in most EA models. However, ER is most often based on the Chen model (named after Peter Chen, who introduced the approach), which is very different from typical EA models. Thus, it may be best to look at harmonizing the results rather than unifying the modeling.

Big data is a driver of change, but it's also driven by it. As worker-to-process relationships become more extemporaneous and variable, the data requirements to support the worker expand. What was once an application-specific, partitioned database framework increasingly becomes a generalized repository. The result is a kind of closed-loop positive feedback. Big data demands more ER involvement, but more ER value is also a symptom of a move to database planning centered on big data.
