5th Principle of Modern Data Integration
Posted By Jonathan Wu
20 May 2016
This is the fifth and final post in a series dedicated to explaining the principles of modern data integration, an optimal approach to addressing modern and big data needs.
The first principle of modern data integration is to “take the processing to where the data lives.” The objective of the 1st principle is to use host systems for specific processing, creating efficiency by preparing and moving only the data that is needed. The second principle is to “fully leverage all platforms based on what they were designed to do well.” The 2nd principle was defined to create an optimal balance of processing and workload by using the source and target platforms for the capabilities they were built to provide. The third principle is to “move data point-to-point to avoid single server bottlenecks.” The objective of the 3rd principle is to move data in the fastest and most efficient manner possible. The fourth principle is to “manage all of the business rules and data logic centrally.” The objective of the 4th principle is to provide end-to-end visibility of data lineage and the efficiency that comes from managing and reusing business rules in one place. The fifth principle is to “make changes using existing rules and logic.” The objective of the 5th principle is to adapt quickly to new data and new platforms by minimizing the amount of effort required.
Many organizations that have established information management environments are looking to modernize their infrastructure to handle ever-increasing sources of data and information demands, while bringing down the total cost of ownership. Data storage and processing technologies such as Hadoop are replacing RDBMS (relational database management system) and MPP (massively parallel processing) platforms because of their capabilities and cost savings. If you look back in time before Hadoop, MPP displaced RDBMS as the technology for data storage and processing. Before MPP, RDBMS displaced mainframe file storage technology. Data storage and processing technologies are ever-evolving. Today it’s Hadoop, but what is next? Every time there is an advancement in data storage and processing technologies, new data integration processes and/or technologies must be implemented. Most of the time, this effort is a tremendously manual exercise of rewriting data movement and transformation scripts from one platform for another. It doesn’t have to be this way.
Imagine an architecture for data movement and transformation where the business rules and logic are translated into the instructions for the corresponding platform. The translation capabilities contain the native language of each platform and the ability to convert from one native language to another. That’s the essence of the 5th principle of modern data integration: the ability to separate business rules and logic from native platform language, as well as the ability to convert from one native platform language to another.
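To make the idea concrete, here is a minimal sketch in Python of what separating a business rule from native platform language could look like. It is an illustration only, not Diyotta’s implementation: the FilterRule class, the TEMPLATES table, the render function, and the table and column names are all hypothetical, and the two SQL templates are simplified examples of platform-specific syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterRule:
    """A platform-neutral business rule (hypothetical): copy rows from
    source to target where <column> <op> <value> holds."""
    source: str
    target: str
    column: str
    op: str
    value: int


# Hypothetical per-platform templates. The rule itself never changes;
# only the template used to express it in a platform's language does.
TEMPLATES = {
    "hive": (
        "INSERT INTO TABLE {target} "
        "SELECT * FROM {source} WHERE {column} {op} {value}"
    ),
    "postgres": (
        'INSERT INTO "{target}" '
        'SELECT * FROM "{source}" WHERE "{column}" {op} {value}'
    ),
}

# Only these comparison operators are accepted by this sketch.
ALLOWED_OPS = {"=", "<>", "<", "<=", ">", ">="}


def render(rule: FilterRule, platform: str) -> str:
    """Translate a platform-neutral rule into one platform's statement."""
    if rule.op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {rule.op}")
    if platform not in TEMPLATES:
        raise ValueError(f"no template for platform: {platform}")
    return TEMPLATES[platform].format(
        source=rule.source,
        target=rule.target,
        column=rule.column,
        op=rule.op,
        value=rule.value,
    )


if __name__ == "__main__":
    # One rule, defined once, rendered for every known platform.
    rule = FilterRule("orders", "large_orders", "amount", ">", 1000)
    for platform in TEMPLATES:
        print(platform, "->", render(rule, platform))
```

In this sketch, supporting a new platform means adding one entry to TEMPLATES; the existing rules are reused as they are, which is the point of the 5th principle.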
When Sprint decided to implement Hadoop, they chose Diyotta’s technology because of the five principles of modern data integration. As a result, Sprint was able to implement their first Hadoop project three times faster, and at a cost 75% lower, than the contemplated approach of manual scripting. No other technology came close to Diyotta’s capabilities.
To learn more about Diyotta and the benefits that Diyotta’s technology has provided to Sprint, please visit our website: www.diyotta.com.