A Plan for Successful Legacy Data Conversion
As information technology systems continue to grow and evolve, so too does the number of legacy conversion projects. Advances in software and hardware technology, along with newly developed standards, make it important for businesses to keep their systems updated. If a legacy system can no longer conform to or comply with new standards, it becomes a hindrance to the business. Moving the system to a newer language or new hardware presents some of the challenges of retiring a legacy system. The biggest challenge, however, is often migrating the legacy data into the new environment, because the legacy system must be removed while the required business data is retained. It is important to plan the migration of the legacy data carefully. This article outlines a plan to help ensure a smooth data conversion from a legacy system to a new environment.
The first priority of any legacy conversion project must be to extract the existing data. Rarely is the extraction a case of migrating data from the source to the destination without having to perform data corrections or transformations. Conversions to a new system generally require mapping old data elements to new elements in a new data structure. The term data structure here refers to whatever serves as the data store for the new application, such as an RDBMS, an XML schema or an Excel spreadsheet. The table structures that existed in the legacy system may no longer be valid in the new system. Therefore, mappings and transformations may be required in order to load existing legacy data into the new data structure.
Legacy Data Extraction
Companies make two common mistakes when dealing with legacy data extraction. One is that not all of the known elements in the legacy system are extracted in the initial export. The other is the failure to move data out of the legacy system and into a universal data store for ease of use.
Always extract the data from the legacy system in its raw, as-stored form to a delimited flat file format, and extract every field. This approach has two benefits. First, a delimited flat file can be read by virtually any technology through an import program. If the destination data structure is not yet known, writing to a flat file also lets the extraction side of the project move forward while the data structure for the new system is being decided. Once the new system's data structure has been chosen, it can be loaded from the data files without writing any further code against the legacy system, because the process for getting data out of the legacy system was completed when the flat file extraction was developed. Second, every field from the legacy system should be extracted regardless of whether it is currently deemed a useful business element. By extracting all known data elements up front, there is no need to rewrite the extraction code to include fields that were initially overlooked. This saves time both in coding the extraction and in rerunning it against the legacy system to pick up the new fields.
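As a rough illustration of this approach, the sketch below dumps every column of a table to a pipe-delimited file without transforming any values. The `customer` table, its columns, and the helper name `extract_table_raw` are hypothetical, and an in-memory SQLite database stands in for the legacy system only so that the example runs on its own; a real legacy platform would need its own access method.

```python
import csv
import io
import sqlite3


def extract_table_raw(conn, table, out, delimiter="|"):
    """Write every column of `table` to `out` as stored; return the row count."""
    # A table name cannot be bound as a query parameter, so quote it as an identifier.
    quoted = '"{}"'.format(table.replace('"', '""'))
    cursor = conn.execute("SELECT * FROM {}".format(quoted))
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    # The header comes from the cursor, so no column is left out by hand.
    writer.writerow([col[0] for col in cursor.description])
    count = 0
    for row in cursor:
        writer.writerow(row)  # no transformation is applied to the values
        count += 1
    return count


# Hypothetical legacy table, created here only to make the sketch self-contained.
legacy = sqlite3.connect(":memory:")
legacy.execute(
    "CREATE TABLE customer (cust_id INTEGER, name TEXT, gender INTEGER, old_flag TEXT)"
)
legacy.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [(1, "Ann Lee", 2, "Y"), (2, "Bob Ray", 1, None), (3, "Pat Doe", 0, "N")],
)

buffer = io.StringIO()
rows_written = extract_table_raw(legacy, "customer", buffer)
flat_file_text = buffer.getvalue()
```

Note that `csv.writer` writes a NULL as an empty field, so this sketch does not distinguish NULL from an empty string. If that distinction matters for later audits, a sentinel value would have to be agreed on and documented.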
Extracting the data in its raw form makes it easier to audit. The raw form is simply the way the data appears in the legacy system, with no transformations applied during extraction. For example, suppose the legacy system stores gender as 1, 2 and 0, and the new system stores gender as M, F and NA. If any data conversions were applied during the extraction to the flat files, auditing the legacy data would become more difficult: to determine where a data problem exists, you would have to go back to the legacy system as the source to verify any audit. When the flat files hold the data exactly as it looks in the legacy system, however, audits can be performed against the flat files instead of the legacy system.
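One way to picture this is to apply the conversion only when the flat file is loaded, while counting the raw values so the load can be audited against the file. The sketch below assumes that 1, 2 and 0 correspond to M, F and NA in that order, which the example above implies but does not state; the file contents and the function name `load_with_mapping` are made up for illustration.

```python
import csv
import io
from collections import Counter

# Assumed correspondence between legacy codes and new-system values.
GENDER_MAP = {"1": "M", "2": "F", "0": "NA"}

# Hypothetical raw extract; the code 9 is deliberately not in the mapping.
RAW_FLAT_FILE = "cust_id|gender\n1|2\n2|1\n3|0\n4|9\n"


def load_with_mapping(flat_file_text):
    """Return (converted_rows, rejected_rows, raw_counts) for a raw extract."""
    reader = csv.DictReader(io.StringIO(flat_file_text), delimiter="|")
    converted, rejected = [], []
    raw_counts = Counter()
    for row in reader:
        raw_counts[row["gender"]] += 1  # tally the value as stored in the legacy system
        if row["gender"] in GENDER_MAP:
            converted.append({**row, "gender": GENDER_MAP[row["gender"]]})
        else:
            rejected.append(row)  # kept untouched for follow-up, not silently dropped
    return converted, rejected, raw_counts


converted, rejected, raw_counts = load_with_mapping(RAW_FLAT_FILE)
```

Because the flat file is never modified, `raw_counts` and `rejected` can be compared with the file at any time without going back to the legacy system.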
Documentation of Legacy Fields
Each data element in the legacy system should have defined meta data as well as documented valid values that apply to specific time periods. For instance, suppose field "X" was used from 1974 to 1980 as one business element and was defined one way. Then in 1980, because of a business decision, that element was no longer needed. Due to space constraints, the same table column was reused and populated with data from a different business attribute with a different definition. This scenario makes migration of legacy data very difficult without a full understanding of how the data source was used over time. Collecting meta data is tedious and time-consuming; however, it makes the legacy data conversion much easier and saves time, money and resources.
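A minimal sketch of how such time-dependent definitions might be recorded is shown below. The `FieldDefinition` class, the meanings, and the exact cutover date at the start of 1980 are all assumptions made for illustration; the real values would come from the documentation effort described above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class FieldDefinition:
    field: str
    meaning: str
    valid_from: date
    valid_to: Optional[date]  # None means the definition is still in effect


# Hypothetical history for field "X"; meanings and cutover date are invented.
FIELD_X_HISTORY = [
    FieldDefinition("X", "original business element", date(1974, 1, 1), date(1979, 12, 31)),
    FieldDefinition("X", "reused for a different attribute", date(1980, 1, 1), None),
]


def definition_for(history: List[FieldDefinition], record_date: date) -> Optional[FieldDefinition]:
    """Return the definition in effect on record_date, or None if none applies."""
    for definition in history:
        ends_ok = definition.valid_to is None or record_date <= definition.valid_to
        if definition.valid_from <= record_date and ends_ok:
            return definition
    return None
```

With a lookup like this, the conversion can choose the right interpretation of a column from the date on each record, and records that fall outside every documented period are flagged rather than guessed at.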
Data Merge Rules
Once the legacy data has been documented and extracted, business rules must be applied to determine which source has the best data quality. In order to have a successful migration of legacy data into a new data structure, transformation rules must be determined. If the legacy system has two tables that both contain a record for a customer, and each table contains a field for customer gender, rules must be documented to determine which table is more accurate and how to merge the records together. Some questions that must be answered, depending upon your environment, are:
- What is a transaction? Is it time driven or event driven?
- What are the valid values allowed in the new system?
- How should invalid or null data be handled in the new system?
- What determines when a record should be considered historical?
- What is the retention period of data in the new system?
- Is this field used for reporting?
- What is the data quality of this field?
The more accurately and completely these questions are answered, the greater the overall success of the legacy data conversion project.
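To make the customer gender case concrete, a documented merge rule can be expressed as a small function. Everything in the sketch below is hypothetical: the table names `billing` and `orders`, the choice to trust `billing` first, and the set of valid codes are stand-ins for whatever the business actually decides.

```python
# Hypothetical valid raw codes and source precedence, most trusted table first.
VALID_GENDER_CODES = {"0", "1", "2"}
SOURCE_PRIORITY = ["billing", "orders"]


def merge_gender(values_by_source):
    """Return (value, source) from the most trusted table holding a valid code."""
    for source in SOURCE_PRIORITY:
        value = values_by_source.get(source)
        if value in VALID_GENDER_CODES:
            return value, source
    # No table had a usable value; the caller decides how to handle the gap.
    return None, None
```

For example, `merge_gender({"billing": "", "orders": "1"})` falls back to the `orders` value because the `billing` field is empty. Returning the source alongside the value keeps a record of which rule fired, which helps when the merged data is audited later.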
Order of Elements
Over the years, organizations have modified and added to their existing systems. Unfortunately, organizations rarely removed old data elements from a legacy environment. Because of the sheer number of elements that are available in most organizations, it is important to plan the documentation process.
All of the elements will exist in the extracted flat files, which can leave a large number of elements to research. Therefore, focus first on the elements that are most relevant to your organization and the new system. These elements are a good place to start defining meta data and documenting merge rules. Because they are the system's most important elements, it will be easier to get the ball rolling on defining requirements. Business users will be better able to answer questions about commonly used elements. Performing data quality audits will also be easier because of the high profile of these elements: users will know what data to expect in them. Once the business users get a feel for the data that must migrate to the new system, they will be more inclined to research the less commonly used data elements.
Tools of the Trade
There are many tools available today to help speed the development of data migration, meta data management and data auditing. As part of the legacy conversion planning, map out the path the data will travel. Then determine what types of tools will help facilitate the movement of data between various points.
It is important to understand what technology can help achieve your goals along the way to data conversion and integration. Some companies may require a data integration tool, an audit tool and an ETL tool, while others need only an auditing tool for a successful migration. Set up a pilot program around selected organizational key performance indicators to test each tool that is required. Download evaluation versions of each type of tool. Test each tool in your existing environment, and track which functions work well and which need improvement, to help decide which tool to purchase.
Once the tools for the legacy conversion have been chosen, build a complete prototype that begins with the exported data in the flat files and ends with the data loaded into a data structure. You will build confidence in how to move the data correctly from the legacy system to the new system, and you will gain a better understanding of how to use the tools to make the legacy conversion as quick as possible.
Reporting
Often overlooked in a legacy conversion project is the area of reporting. An analysis of which reports and elements are produced from the legacy system must be performed, along with an analysis of how the business wants to report on the data in the new system. For example, a report that is generated in the legacy system may not be fully reproducible in the new system. One reason is that the data structure changes during the migration: an old way of tracking history in the legacy system may no longer exist in the new system. Therefore, the business must be involved and must document how it needs to see the data from a reporting standpoint in the new system.
The process of removing a legacy system from a production environment is a complex task. In order to ensure a smooth transition from a legacy system to the new system, detailed planning must take place. Having a plan and understanding where your legacy data is going is just the beginning. Before trying to tackle the entire legacy project, test out prototypes and experiment with what works within your organization's requirements.
Derek Wilson is a database and business intelligence consultant.