Data consolidation is the process of capturing data from multiple sources and integrating it into a single, persistent data store for use by subscribers across an enterprise. It is one of three data integration techniques; the other two are data propagation and data federation. Data propagation replicates data from different sources across different locations, while data federation provides a virtual, unified view of source data files. Data consolidation technologies include ELT and ETL.
Extract, Load, Transform (ELT) is a process in which the system transforms data in bulk after it has been loaded into a target database. Once the raw data is loaded, it is transformed and written to tables where individual end users can access it. ELT systems are categorized as "pull systems" because the data transformation is initiated on demand, either by end-user instructions or by predefined publication schedules. Users therefore work on transformed and published data that is "pulled" after the loading cycle.
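The load-then-transform order described above can be sketched as follows. This is a minimal illustration, not a reference implementation: it uses Python's built-in sqlite3 module as a stand-in for the target database, and the table names (raw_orders, sales_by_region), columns and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical raw records; names and values are illustrative assumptions.
RAW_ORDERS = [
    ("2024-01-05", "north", "120.50"),
    ("2024-01-05", "south", "80.00"),
    ("2024-01-06", "north", "30.25"),
]


def run_elt(conn):
    # Load: copy the raw rows into the target database unchanged
    # (amounts are still text at this point).
    conn.execute(
        "CREATE TABLE raw_orders (order_date TEXT, region TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", RAW_ORDERS)

    # Transform: runs inside the target database, after loading,
    # and publishes the result to a table end users can query.
    conn.execute(
        """
        CREATE TABLE sales_by_region AS
        SELECT region, SUM(CAST(amount AS REAL)) AS total
        FROM raw_orders
        GROUP BY region
        """
    )
    conn.commit()
    return conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()


if __name__ == "__main__":
    print(run_elt(sqlite3.connect(":memory:")))
    # [('north', 150.75), ('south', 80.0)]
```

In this sketch the raw table stays in the target database after the transformation, so further transformations can be run against it later on demand.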
Extract, Transform, Load (ETL) is another data consolidation technique that extracts data from one or more sources, transforms the data according to prescribed rules and then loads the resulting data into target systems or specified file formats. ETL, as distinct from ELT, transforms the data before the loading cycle, meaning that the data is streamlined, reformatted, standardized, aggregated or subjected to any other data manipulation rule laid down by management, before being fed to an end-user interface.
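A minimal sketch of the transform-before-load order follows. It assumes an in-memory SQLite database as the target system, and the source rows, the sales_by_country table and the standardization rules are hypothetical examples of "prescribed rules", not rules taken from any particular product.

```python
import sqlite3

# Hypothetical source rows with inconsistent formatting.
SOURCE_ROWS = [
    {"customer": " Alice ", "country": "us", "amount": "10.50"},
    {"customer": "BOB", "country": "US", "amount": "4.50"},
    {"customer": "carol", "country": "de", "amount": "7.00"},
]


def transform(rows):
    # Standardize country codes, convert amounts from text to numbers,
    # and aggregate per country, all before anything reaches the target.
    totals = {}
    for row in rows:
        country = row["country"].strip().upper()
        totals[country] = totals.get(country, 0.0) + float(row["amount"])
    return sorted(totals.items())


def load(conn, records):
    # Load: the target only ever receives the processed records.
    conn.execute(
        "CREATE TABLE sales_by_country (country TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany("INSERT INTO sales_by_country VALUES (?, ?)", records)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(conn, transform(SOURCE_ROWS))
    print(conn.execute("SELECT * FROM sales_by_country ORDER BY country").fetchall())
    # [('DE', 7.0), ('US', 15.0)]
```

Here the raw, inconsistently formatted rows never appear in the target database; only the aggregated result is stored.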
Extraction is the first stage in both data consolidation technologies. It may involve very high volumes of data drawn from multiple heterogeneous sources, including relational, hierarchical and object databases, RFID systems, XML documents, web services and packaged applications such as SAP and PeopleSoft. Extraction may also be carried out on files containing structured or unstructured information, and external data purchased from outside sources may be included, depending on the industry and the relevance of the data.
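As a rough illustration of extracting from heterogeneous sources, the sketch below reads the same kind of record from a CSV file and from an XML document and returns both in one common shape. The sample data, field names and function names are hypothetical; real sources such as databases or web services would need their own extractors.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical source contents standing in for real files or feeds.
CSV_SOURCE = "id,name\n1,Widget\n2,Gadget\n"
XML_SOURCE = "<items><item id='3'><name>Gizmo</name></item></items>"


def extract_csv(text):
    # Each CSV row becomes a dict keyed by the header line.
    reader = csv.DictReader(io.StringIO(text))
    return [{"id": int(row["id"]), "name": row["name"]} for row in reader]


def extract_xml(text):
    # Each <item> element becomes a dict with the same keys as the CSV rows.
    root = ET.fromstring(text)
    return [
        {"id": int(item.get("id")), "name": item.findtext("name")}
        for item in root.iter("item")
    ]


def extract_all():
    # Combine both sources into one list of uniformly shaped records.
    return extract_csv(CSV_SOURCE) + extract_xml(XML_SOURCE)


if __name__ == "__main__":
    for record in extract_all():
        print(record)
```

Normalizing records to a common shape at extraction time is one possible design choice; an ELT pipeline might instead keep each source's native format and defer this step until after loading.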
In ELT and ETL systems, the transformation stage varies widely, ranging from simple procedures, such as file or type conversions, to complex operations, such as logic-based manipulation and integration. Data transformation is a central capability of both systems, enabling businesses to forward timely, relevant information to their managers and supervisors for better decision-making. It allows enterprises to customize data and produce tailor-made information for internal use. Depending on the industry and on the scope and volume of business, companies may streamline the transformation process or carry out comprehensive operations on the collected data.
Loading refers to transferring data to a target application. In ELT, the data is loaded unprocessed, whereas in ETL it is loaded after being processed. The loading stage in both systems may be modified based on data acceptance metrics, allowing either bulk or trickle mode for each data element in each loading cycle. The choice between bulk and trickle mode depends on how much the integration process emphasizes data latency: trickle mode is preferred when minimizing latency matters more.
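The difference between the two loading modes can be sketched as below. This is a simplified illustration that assumes an in-memory SQLite target and a hypothetical readings table; in practice, trickle loading is usually driven by a stream or queue rather than a list.

```python
import sqlite3


def make_target():
    # Hypothetical target table for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
    return conn


def bulk_load(conn, rows):
    # Bulk mode: one batch and one commit, so all rows become
    # available together at the end of the loading cycle.
    with conn:
        conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    return 1  # number of commits performed


def trickle_load(conn, rows):
    # Trickle mode: each row is committed as soon as it arrives,
    # trading throughput for lower latency.
    commits = 0
    for row in rows:
        with conn:
            conn.execute("INSERT INTO readings VALUES (?, ?)", row)
        commits += 1
    return commits


if __name__ == "__main__":
    rows = [("a", 1.0), ("b", 2.0), ("c", 3.0)]
    print(bulk_load(make_target(), rows))     # 1
    print(trickle_load(make_target(), rows))  # 3
```

Both functions leave the same rows in the target; what differs is how soon each row becomes visible to readers and how many commits the target has to absorb.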