
Who moved my legacy cheese?

Software tools can unveil legacy code patterns and speed application modernization, according to a cryptologist who worked to provide a visual assessment of years of spaghetti code.

Today's technology decision makers are surrounded by a paralyzing array of questions and uncertainties when it comes to application modernization. What's equally paralyzing is how many applications can be categorized as "legacy" because they don't meet the needs of the business. Not coming to terms with your true legacy footprint is the primary reason why enterprises fail to modernize.

In order to come to terms, enterprises need a thorough assessment and clear understanding of legacy applications – something like a blood test that checks cholesterol. Once the doctor runs your blood work and sees how high your cholesterol is, it becomes a bit silly to deny all the cheeseburgers you've been eating because the evidence is there in the results. It is the same with legacy applications – once you break down the code and know where the issues are, you can determine the right path to modernization.

My job as an EDS legacy transformation analyst is to advise enterprise technology teams on methods of change and application modernization. As you'd imagine, selling "change" and a "brighter hope for business futures" resonates quite well with business leaders. But when it comes down to the real change agents – the technology professionals – concepts need to be articulated in tangible actions. To achieve that, I leveraged my years of experience as a cryptologist in the U.S. Navy and in systems architecture to invent tools that provide a visual assessment of legacy code.

Understanding the problem
Despite predictions of its demise over the years, COBOL is still alive, and there are billions of lines of COBOL code in active applications today. Written over the past 10 to 30 years, this code has been reused and rewritten time and time again, making it difficult for technology executives to understand exactly what their footprint encompasses. In my experience, when I ask clients how many lines of code they have, their estimates are 3 to 10 times higher than the actual figure. This is exacerbated by the fact that the applications' original developers are often no longer available to provide insight or guidance. Numerous articles, including pieces in The New York Times, InfoWorld and Dr. Dobb's Journal, discuss the increasing demand for and shrinking supply of COBOL experts.

A thorough assessment of the current environment is needed to determine the right path for moving forward.

The first step – Seeing the x-ray 
Presenting an accurate picture of legacy applications was a problem I spent countless hours trying to solve. By using Social Network Analysis, a science that has been used for decades to study social relationships, and by leveraging the human mind's ability to recognize patterns and colors, I invented a set of tools called the HP Visual Intelligence Tools. For me, it was the visual element that proved critical to rendering the problem in a compelling manner.

With the HP Visual Intelligence Tools, we can analyze millions of lines of code within minutes and model data that is then rendered in graphical tools such as Miner3D and GUESS, a graph exploration tool originally created by HP Labs. The result is an enlightening visual analysis of legacy source code that reveals patterns of similarity and uncovers the unintended designs of over 30 years of copy-and-paste reuse.

You might think of these graphic representations as x-rays or perhaps infrared photography. They give the doctor or scientist a vision of otherwise hidden patterns. In the case of large, opaque and monolithic legacy applications, these patterns provide a means to devise transformation approaches that exploit the similarity, avoid duplication of effort, and attain economies of size and scope. The patterns also identify cross-cutting code (security, logging, monitoring) that may not be relevant when reengineered to frameworks that readily provide such functionality without using hand-written code.

Fig. 1 - A legacy x-ray: This randomly arranged graph of nodes represents 10 million lines of code comprised of 7,500 legacy COBOL modules. Each dot is a node and the lines between them represent the amount of cloned code shared between the modules.

At first glance, little value is revealed from the above image. It looks more like modern art than legacy code. Social Network Analysis, with its many layout algorithms, allows these hidden patterns to be exposed.
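To make the idea of a clone graph concrete, here is a minimal sketch in Python of how the nodes and weighted lines behind a picture like Fig. 1 could be derived. It is not the HP Visual Intelligence Tools' method, which the article does not describe. The function names, the three-line matching window and the sample module names are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def normalize(line: str) -> str:
    """Collapse whitespace and case so trivially edited copies still match."""
    return " ".join(line.split()).upper()


def shared_clone_edges(modules: dict[str, list[str]], window: int = 3) -> dict[tuple[str, str], int]:
    """Return {(module_a, module_b): count of identical line windows they share}.

    Assumption: a "clone" is any run of `window` consecutive non-blank lines
    that appears, after normalization, in more than one module.
    """
    owners = defaultdict(set)  # window of lines -> names of modules containing it
    for name, lines in modules.items():
        cleaned = [normalize(line) for line in lines if line.strip()]
        for i in range(len(cleaned) - window + 1):
            owners[tuple(cleaned[i:i + window])].add(name)

    edges = defaultdict(int)
    for names in owners.values():
        # Every pair of modules sharing this window gets a heavier edge.
        for a, b in combinations(sorted(names), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Hypothetical modules, invented for this example.
modules = {
    "POLCALC1": ["MOVE A TO B", "ADD 1 TO B", "COMPUTE C = B * 2", "DISPLAY C"],
    "POLCALC2": ["move a to b", "add 1 to b", "compute c = b * 2", "STOP RUN"],
    "REPORT01": ["OPEN INPUT F", "READ F", "CLOSE F"],
}
edges = shared_clone_edges(modules)
print(edges)
```

Each key in the result is a line between two dots, and the count is the weight of that line. A layout algorithm then decides only where the dots are drawn.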

Fig. 2 - Legacy DNA: Using a layout pattern called GEM (Generalized Expectation-Maximization), a portion of the image in Fig. 1 reveals definitive patterns, exposing hundreds of clusters ranging in size from 2 to 30 or more.

Once the patterns have been revealed to legacy subject matter experts, discussions shift to the meaning of the patterns. What do the clusters of similarity tell us about the unintended design? Is there an opportunity to view these clusters' constituent tasks? Could these tasks form the basis for a transformation strategy?
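For readers who want the clusters as a list as well as a picture, the sketch below groups modules programmatically. It does not reproduce a GEM layout. It swaps in a simpler technique, connected components over a thresholded graph, computed with union-find. The edge weights, module names and the `min_shared` threshold are hypothetical.

```python
from collections import defaultdict


def clusters(edges: dict[tuple[str, str], int], min_shared: int = 1) -> list[list[str]]:
    """Group modules linked by edges of weight >= min_shared (connected components)."""
    parent: dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for (a, b), weight in edges.items():
        if weight >= min_shared:
            parent[find(a)] = find(b)  # merge the two groups

    groups = defaultdict(list)
    for node in list(parent):
        groups[find(node)].append(node)
    # Largest clusters first, then alphabetical for stable output.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


# Hypothetical weights: amount of cloned code shared by each pair of modules.
sample_edges = {("A", "B"): 40, ("B", "C"): 25, ("D", "E"): 30, ("E", "F"): 2}
result = clusters(sample_edges, min_shared=10)
print(result)  # [['A', 'B', 'C'], ['D', 'E']]
```

Raising `min_shared` breaks weak links, so a module that shares only a few lines with a cluster (F in this example) drops out of it. Where to set that threshold is a judgment call for the subject matter experts.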

Fig. 3 - Color and patterns: The colors and patterns of the GEM layouts provide enormous insight into complex relationships between legacy code.

These tools separate code that supports critical business functions while identifying code that may be replaced by modern frameworks and tools. Seemingly insurmountable transformation tasks come more clearly into view.

Legacy clones – creatures of expediency
Cloned code has, at its roots, the very human behaviors that created it. The code has multiplied over the years by cut-and-paste reuse. Imagine it's a Friday afternoon and Harry needs to write a new subroutine for an insurance policy calculation. He turns to Bob, knowing that he recently wrote similar code and asks to borrow the subroutine's code: cut, paste, modify a line here, add a line there and that's that. Multiply this action year after year for three decades. The code grows, the duplication continues.

There is often more than one way to visualize data. The HP Clone Set Visualization provides another dimension, allowing us to see the spectrum of cloned code: reuse-driven code and cross-cutting code. This visualization identifies duplicate code and arranges it by frequency of occurrence and total size of each cloned code set, which helps avoid repeating modernization efforts. Organizations must be aware of the cloned code that exists throughout their environments to contain costs and modernize applications more effectively.
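Arranging clone sets by frequency and total size can be sketched in a few lines of Python. This is an illustration only, not the HP Clone Set Visualization itself. The `CloneSet` type, the exact-match rule and the sample COBOL fragments are assumptions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CloneSet:
    block: tuple[str, ...]  # the duplicated lines
    occurrences: int        # how many times the block appears
    total_lines: int        # occurrences * block length


def clone_sets(blocks: list[list[str]]) -> list[CloneSet]:
    """Group identical blocks and rank them by frequency, then by total size."""
    counts = Counter(tuple(block) for block in blocks)
    sets = [
        CloneSet(block, n, n * len(block))
        for block, n in counts.items()
        if n > 1  # a block that appears once is not a clone
    ]
    return sorted(sets, key=lambda s: (-s.occurrences, -s.total_lines))


# Hypothetical blocks extracted from legacy modules.
logging_block = ["DISPLAY 'ENTER'", "PERFORM LOG-IT"]
premium_block = ["MOVE RATE TO W-RATE", "MULTIPLY W-RATE BY BASE",
                 "ADD FEES TO BASE", "MOVE BASE TO PREMIUM"]
unique_block = ["STOP RUN"]

ranked = clone_sets([logging_block] * 5 + [premium_block] * 2 + [unique_block])
for s in ranked:
    print(s.occurrences, s.total_lines)
```

In this invented data, the short block repeated five times resembles cross-cutting code such as logging, while the longer block copied twice resembles reuse-driven business logic. A real analysis would need fuzzier matching than exact equality.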

Fig. 4 - Cut and paste code: Just as the previous views revealed clusters of similarity, this view depicts the two major types of cut-and-paste clones: reuse-driven and cross-cutting code.

The business case for change – and choosing the right change 
These inscrutable legacy applications often force organizations to rely on applications that no longer meet their business needs. These applications have resisted change for decades. Uncertainty is the enemy of any worthy business cause, and when it comes to modernization, uncertainty almost always wins out. It can be conquered the same way doctors and scientists conquer it: by peering beneath the surface before acting, taking the X-ray and seeing what's truly underneath.

But an X-ray alone only describes the problem – not the solution. With further assessment, the doctor can develop a health plan to solve the problem. In the case of legacy applications, the health plan is a modernization roadmap. Developing a roadmap requires a thorough assessment to select the right modernization strategy, and the HP Visual Intelligence Tools are an important part of that process. One size does not fit all. In many cases, clients require multiple modernization strategies to adequately address risk levels, benefit potentials, and business priorities of each application.

When modernizing, it's important to understand cloned code: its magnitude, its patterns, its similarities and its unintended design. Why rewrite code when it can be more effectively implemented on more appropriate frameworks and tools? No technology department would set out to reinvent a report writer or data integration tool. However, this is precisely the outcome when legacy code is converted in its entirety to yet more hand-written or machine-translated code.

Woolsey and Swanson once wrote that people would rather live with a problem they cannot solve than accept a solution they cannot understand. This simple quote lies at the heart of the legacy dilemma. For decades, legacy applications have been the problem that cannot be solved. Compelling visualizations can bridge the gap between unsolvable problems and tangible solutions.

About Steve Woods
Steve Woods, inventor of the HP Visual Intelligence Tools, is a Legacy Transformation Analyst for the Application Modernization and Migration practice at EDS, an HP company. After a 12-year career working in the National Security Agency and the White House Communications Agency as a cryptologist, he worked as a program administrator in high-tech manufacturing before moving into the IT industry.

Steve has designed highly scalable architectures that are currently providing mission-critical services at corporations worldwide. His architectural designs include the incorporation of graph theory, model-driven design, generative programming, aspect-oriented programming, and meta-programming. More recently, he has been instrumental in the formulation of HP's mainframe modernization methodologies, including the adaptation of architectural trade-off analysis and software product lines. Steve has created several legacy analysis and visualization techniques.

Steve holds a Master of Science in Systems Management from the University of Southern California, and a Bachelor of Science in Management from the University of Maryland.

