Just how available should production data be?
By Aaron Goldberg
One of the more perplexing dichotomies in this age of eBusiness is just how much access end users and the Web community should have to your central production data. The "open it up" argument has merit. For example, if you have only one data set, users need real-time access to it, because many eBusiness applications cannot work without leveraging that central data. Yet the arguments to "lock it down" are just as powerful. There is the ever-present threat of hackers and even state- or corporate-sponsored intrusions, the loss of integrity from data entered by untrained users, and of course outright theft of information.
Further confounding the problem, organizations are so individual that no simple, off-the-shelf set of standards will fit everyone. So rather than try to set standards, here's a framework for developing policies that determine how "open" or "closed" your production data should be.
Before we get started, remember that basic IT operations standards still need to be in place. That means that viewing and editing privileges need to be handled differently, and that concurrent updating needs to be prevented. It's also important to realize that few eBusiness applications today can rationally justify a significant widening of the base of users who are allowed edit privileges.
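As a rough illustration of those two ground rules, here is a minimal Python sketch. It assumes a single in-memory record, and the privilege names, the Record class, and the version-check technique (often called optimistic locking) are choices made for this example, not something the framework prescribes.

```python
from dataclasses import dataclass

# Hypothetical privilege names; the article only separates viewing from editing.
VIEW = "view"
EDIT = "edit"


class StaleUpdateError(Exception):
    """Raised when a record changed after the caller last read it."""


@dataclass
class Record:
    value: str
    version: int = 0  # incremented on every successful update


def read(record, privileges):
    """Return (value, version) if the caller may view the record."""
    if VIEW not in privileges:
        raise PermissionError("view privilege required")
    return record.value, record.version


def update(record, privileges, new_value, seen_version):
    """Apply an edit only if the caller may edit and holds the latest version."""
    if EDIT not in privileges:
        raise PermissionError("edit privilege required")
    if seen_version != record.version:
        raise StaleUpdateError("record was changed by someone else")
    record.value = new_value
    record.version += 1
```

A caller reads the record, keeps the version number it saw, and passes that number back with its edit. If another editor got there first, the second edit is rejected instead of silently overwriting the first.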
One of the first activities to complete when developing a plan for production data access is to build a structure of the different levels of importance or restriction found in the data sets. For example, there is some data that should have almost no access and be highly secure. Think payroll records, HR information, check processing systems, and the like. Then there is other production data that might be more open: inventory levels, credit rankings or limits, shipping information, and similar data. This second category is still production data, but it is more widely used and does not have to be as tightly secured. It's wise to take each data set and assign it to a level of accessibility.
Typically, most organizations have three to four levels of availability for production data sets. The first level is production data that is made somewhat broadly available. This usually means a larger user population than might have been common in the pre-eBusiness days. The next level is for information that has a limited audience, but that may gain greater read-only visibility as a result of an eBusiness application. Some companies will have two distinct levels of access for this type of data: a broader set of data that is used in eBusiness, and a more secure level that is not made more broadly available even with eBusiness applications. Finally, there are highly secure, limited-access data sets.
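One way to write such a classification down is a simple lookup table. The Python sketch below is only an illustration: the level names, the data set names, and the specific assignments are hypothetical, and each organization would substitute its own.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Higher numbers mean tighter restrictions (names are illustrative)."""
    BROAD = 1              # somewhat broadly available
    LIMITED_EBUSINESS = 2  # limited audience, read-only through eBusiness apps
    LIMITED_INTERNAL = 3   # limited audience, not widened by eBusiness apps
    RESTRICTED = 4         # highly secure, limited access


# Hypothetical assignments, loosely following the examples in the text.
DATA_SET_LEVELS = {
    "shipping_information": AccessLevel.BROAD,
    "inventory_levels": AccessLevel.LIMITED_EBUSINESS,
    "credit_limits": AccessLevel.LIMITED_INTERNAL,
    "payroll_records": AccessLevel.RESTRICTED,
}


def can_read(clearance, data_set):
    """A user may read a data set if their clearance meets its level.

    Data sets that nobody has classified yet are treated as RESTRICTED.
    """
    level = DATA_SET_LEVELS.get(data_set, AccessLevel.RESTRICTED)
    return clearance >= level
```

Treating unclassified data sets as restricted is a design choice in this sketch: it means a new data set stays closed until someone deliberately assigns it a level.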
This plan for developing guidelines for data set access has to be done in conjunction with the plans for new applications. For example, if inventory information is a highly secure, secretive data set, planning to install an exchange is probably a bad idea because of the visibility and access to inventory information that is necessary. The same can be true for sales/contact information and CRM applications. The simple truth is that you must have a plan for what data you want to make available, and how it will impact the decision to implement new applications that might need new levels of access to that information.
There are of course some other approaches that can be used in addition to providing access to production data. These include:
Replica Data Bases - This involves creating replicas of the real production data. Pros: Strong protection for production data, no access to critical systems. Cons: Cost of supporting more on-line data sets, synchronization of data, planning when to produce replicas.
Co-Locating Data Bases - This is a fairly new approach that involves having two physical locations where the data is stored. One location may be broadly available, and the other less so. It is similar to the replica approach. Pros: Protection of one data set, can use co-location to provide higher bandwidth access, improved response times. Cons: Synchronization, cost of managing/owning two environments.
Subsets of Data Bases - This approach involves taking only pieces of production data and making them available to the broad user audience. Pros: Limits the risk of exposing larger data sets, easier to decide what should be available. Cons: May not have all the data necessary to drive the application, synchronization, process of developing subsets.
Frozen Data Sets - Some firms decide that production data can only be used from the last roll or publication of the data. This data is not real time, but generally has most of the information. Pros: Broad data sets are available, no access to production data, fairly easy to administer. Cons: Data is old, often drives telephone queries to those who have real-time access, mistakes from decisions based on non-current data.
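To make the subset and frozen approaches above more concrete, here is a small Python sketch. It works on in-memory rows rather than a real database, and the field names, sample rows, and function names are all invented for the example.

```python
import copy
from datetime import datetime, timezone

# Hypothetical production rows; field names are invented for the example.
production_orders = [
    {"order_id": 1, "status": "shipped", "customer": "Acme", "credit_limit": 50000},
    {"order_id": 2, "status": "pending", "customer": "Globex", "credit_limit": 20000},
]


def make_subset(rows, allowed_fields):
    """Subset approach: copy only the fields cleared for the broad audience."""
    return [{k: row[k] for k in allowed_fields if k in row} for row in rows]


def freeze(rows, as_of=None):
    """Frozen approach: publish a deep copy stamped with its publication time."""
    as_of = as_of or datetime.now(timezone.utc)
    return {"as_of": as_of, "rows": copy.deepcopy(rows)}


public_view = make_subset(production_orders, ["order_id", "status"])
snapshot = freeze(public_view)

# A later change in production does not reach the published snapshot.
production_orders[1]["status"] = "shipped"
```

The sketch also shows the main drawback named above: once order 2 ships, the snapshot still reports it as pending until the next publication, which is why the "as_of" stamp is worth showing to users.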
Finally, one of the most useful exercises in determining what level of production data needs to be available is to sit with end users and watch how they work with the application, or with mock-ups of it if it is not yet installed. In a large number of cases, it becomes clear that the end users have a different view of what is needed than those who manage the data sets. Much of this is due to very different perspectives: a salesperson, for example, looks at order status data quite differently than an IT professional does.
This activity is hugely important because it can often provide the kind of detailed input that leads to better decisions on what level and amount of production data the general user audience needs. This is not a replacement for the planning process, but rather an important input point that can help fine-tune the final decisions.
The entire notion of making a traditional firm into an "eBusiness" depends on the ability to provide critical information to those who need it. However, that simple statement creates a whole range of issues regarding the access to, and availability of, key production data sets. Having a plan, and then checking it against the real world's requirements, are two of the more important steps in answering the question of what kinds of information should be available, and to whom, in eBusiness.
Copyright 2001, availability.com. Reprinted by permission.