4th Principle of Modern Data Integration
Posted By Jonathan Wu
09 May 2016
This post is the fourth of five dedicated to explaining the principles of modern data integration, which is an optimal approach to addressing modern and big data needs.
The first principle of modern data integration is to “take the processing to where the data lives.” Its objective is to use host systems for specific processing, creating efficiency by preparing and moving only the data that is needed. The second principle is to “fully leverage all platforms based on what they were designed to do well.” It was defined to create an optimal balance of processing and workload by using the source and target platforms for the capabilities they were designed to provide. The third principle is to “move data point-to-point to avoid single server bottlenecks.” Its objective is to move data in the fastest and most efficient manner possible. The fourth principle is to “manage all of the business rules and data logic centrally.” Its objective is to provide end-to-end visibility of data lineage and efficiency in managing and reusing business rules.
Regulations such as Basel III, the Comprehensive Capital Analysis and Review (CCAR), the Dodd-Frank Act, the Sarbanes-Oxley Act and others are forcing organizations to be able to substantiate the information they report. The resulting data lineage requirements demand a high level of accountability and integrity for that information, which can be very challenging to implement. Tracing information from a report back to the original source or system of record can be an extremely complex process, amplified by the number and type of transformation, integration and blending activities that have occurred. Another significant factor that compounds the problem is the use of multiple data integration technologies and installations, because the business rules and metadata are often stored in standalone repositories unless there is some form of aggregation or unification effort.
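To make the lineage requirement concrete, here is a minimal sketch of recording each transformation a value undergoes so that a reported figure can be traced back to its system of record. The record structure, system names, and operations are illustrative assumptions, not part of any particular product:

```python
from dataclasses import dataclass, field

@dataclass
class LineageStep:
    """One transformation applied to a data element."""
    system: str      # platform where the step ran
    operation: str   # what was done (e.g., "extract", "aggregate")

@dataclass
class TrackedValue:
    """A reported value together with its full lineage."""
    value: float
    lineage: list = field(default_factory=list)

    def apply(self, system, operation, fn):
        """Apply a transformation and record it in the lineage."""
        self.value = fn(self.value)
        self.lineage.append(LineageStep(system, operation))
        return self

# A reported figure accumulates lineage as it moves through systems,
# so it can be traced back to its system of record.
exposure = TrackedValue(100.0, [LineageStep("core-banking", "extract")])
exposure.apply("risk-engine", "apply_haircut", lambda v: v * 0.9)
exposure.apply("reporting", "round", lambda v: round(v, 2))

for step in exposure.lineage:
    print(f"{step.system}: {step.operation}")
```

With every transformation captured alongside the value, answering an auditor’s “where did this number come from?” becomes a walk back through the lineage list rather than a forensic reconstruction.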
Operating dispersed integration technologies creates chaos in today’s changing data ecosystems and complicates compliance with regulations. In a dispersed environment, business rules and data logic are created and maintained separately in each traditional data integration technology; there is no sharing or reuse, only headaches from inconsistent definitions. New data and modern data platforms must be managed centrally, with all business rules and data logic in a single design studio. This can only be achieved in an architecture where a central design studio is completely separate from the local processing agents that use native functions. Managing all business rules and data logic centrally provides complete transparency and accessible lineage to aid compliance with regulations. In addition, a centralized architecture enables maximum reuse of business rules and data logic. For example, the business rules associated with data validation can be defined once and then easily applied and maintained everywhere through the reusable central architecture that modern data integration provides.
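The “define once, apply everywhere” idea for validation rules can be sketched as a central rule registry that any pipeline consults. The registry, rule name, and record fields below are hypothetical examples, not a specific vendor API:

```python
# Central registry of business rules: each rule is defined once
# and reused by every pipeline, instead of being re-implemented
# separately in each integration technology.
RULES = {}

def rule(name):
    """Register a validation rule under a central name."""
    def register(fn):
        RULES[name] = fn
        return fn
    return register

@rule("non_negative_balance")
def non_negative_balance(record):
    return record.get("balance", 0) >= 0

def validate(record, rule_names):
    """Apply centrally managed rules to a record from any pipeline."""
    return {name: RULES[name](record) for name in rule_names}

# Two different pipelines reuse the same central rule definition.
print(validate({"balance": 250}, ["non_negative_balance"]))  # loans feed
print(validate({"balance": -10}, ["non_negative_balance"]))  # deposits feed
```

Because every pipeline resolves rules by name from the same registry, changing a rule in one place changes it everywhere, and the registry itself doubles as a single, auditable inventory of the business logic in force.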