An enterprise data warehouse (EDW) is central to the BI and analytics needs of an enterprise. With huge chunks of information generated from disparate sources, an EDW acts as a nerve centre of any business that wants to hear, study, and get insights on the social and online data generated by their targeted customers.
Enterprise data architecture of today
A standard enterprise data architecture comprises the following steps:
Step 1 – Extracting and collecting data from different source systems such as ERP, CRM, or OLTP
Step 2 – A staging process that stores raw data from these disparate sources into an Operational Data Store (ODS)
Step 3 – Transitioning the integrated data into a data warehouse or data mart
Step 4 – Segregating the data into dimensions and adding an access layer for reporting tools to pick up the data
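The four steps above can be sketched in a few lines of Python. This is only an illustrative toy, not a production design: the source rows, field names, and helper functions (`stage`, `load_warehouse`, `sales_by_region`) are all hypothetical, and in-memory lists stand in for the ODS and the warehouse.

```python
from collections import defaultdict

# Step 1: hypothetical extracts from two source systems (all values are invented).
crm_rows = [
    {"customer_id": 1, "name": "Asha", "region": "EMEA"},
    {"customer_id": 2, "name": "Ben", "region": "APAC"},
]
oltp_rows = [
    {"order_id": 10, "customer_id": 1, "amount": 120},
    {"order_id": 11, "customer_id": 2, "amount": 80},
    {"order_id": 12, "customer_id": 1, "amount": 50},
]


def stage(source_name, rows):
    """Step 2: land raw rows in the ODS, tagged with their source system."""
    return [{"source": source_name, **row} for row in rows]


def load_warehouse(ods):
    """Step 3: integrate staged rows into a customer dimension and an order fact table."""
    dim_customer = {r["customer_id"]: r for r in ods if r["source"] == "crm"}
    fact_orders = [r for r in ods if r["source"] == "oltp"]
    return dim_customer, fact_orders


def sales_by_region(dim_customer, fact_orders):
    """Step 4: an access-layer query that a reporting tool might run."""
    totals = defaultdict(int)
    for order in fact_orders:
        region = dim_customer[order["customer_id"]]["region"]
        totals[region] += order["amount"]
    return dict(totals)


ods = stage("crm", crm_rows) + stage("oltp", oltp_rows)
dim_customer, fact_orders = load_warehouse(ods)
report = sales_by_region(dim_customer, fact_orders)
print(report)
```

In a real deployment each step would typically be a separate system or job; the sketch only shows how data changes shape as it moves from raw extracts to a reportable dimension.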
Is this EDW future proof?
Today's EDW needs to be scalable and ready for future demand and load. Key needs to be met include:
- High volume of information fed in
- Agile application rollouts
- Efficient processes at both the operational and analytics levels
- Near real time and zero downtime
- Global yet local flavour
- Scale up at lower TCO
It is clear that today's EDW will fall short of meeting tomorrow's demands because of the following challenges:
- Inability to handle unstructured data
- ETL conflicts with analytics layer
- Less than optimized use of storage
- Expensive and inefficient backups
- Expensive disaster recovery
The answer to these challenges is to introduce Hadoop into your EDW architecture. The open source Apache project is a leading big data platform for large-scale data storage and processing, with the following advantages:
- Popular enterprise framework for large scale data processing
- Uses commodity servers
- Massive scalability option
- Distributed and fault-tolerant
Hadoop’s USP
Augmenting your EDW with Hadoop offers the following advantages:
- Hadoop lowers costs while offering faster recovery
- Hadoop reduces the load on the primary EDW
- Hadoop operates at just 10% of the cost of a proprietary EDW and yet offers better utilisation of storage resources
In addition to the disparate data sources, you can now also introduce unstructured data on the Hadoop clusters. This can then be used for big data analytics or passed through a Teradata EDW for advanced BI and reporting.
Hadoop Use Cases
- Offers a 360° view of stakeholders and customers
- Allows churn analysis, fraud detection, risk analysis, and operational analysis
- Provides powerful search capabilities and stores massive volumes of raw data as a data lake
- Drives targeted marketing for better business success
Hadoop EDW integration Success Story
Challenges and objectives
CIGNEX is proud to have helped a major US-based bank transition from a proprietary EDW to a Hadoop architecture. The transition was driven by the following key challenges:
- Inability to scale up to support new use cases
- Costly upgrades that would run to hundreds of millions of dollars
Keeping these challenges in mind, the client expected two major outcomes from the transition:
- Eliminate costly upgrades permanently and reduce maintenance costs to a minimum
- Improve the scalability of the data architecture as per business needs and use cases
Solution Architecture
The proposed solution placed a Hadoop integration layer between the mainframe and the Teradata EDW. All ETL processes, including data integrity, data movement, and data verification, were carried out within the Hadoop layer by our data scientists and engineers. The processed data was then made available to data analysts for analysis.
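As a rough illustration of what the data verification step in such an ETL layer might involve, the sketch below compares a source extract with its loaded copy by row count and per-row checksum. It is an assumption-laden toy, not the method used in this project: the `row_fingerprint` and `verify_load` helpers, the `account_id` key, and the sample rows are all hypothetical.

```python
import hashlib


def row_fingerprint(row):
    """Hash a row's fields in sorted-key order so that dict key order does not matter."""
    canonical = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_load(source_rows, target_rows, key):
    """Compare source and target by row count and per-row fingerprint."""
    source = {row[key]: row_fingerprint(row) for row in source_rows}
    target = {row[key]: row_fingerprint(row) for row in target_rows}
    shared_keys = source.keys() & target.keys()
    return {
        "counts_match": len(source_rows) == len(target_rows),
        "missing_in_target": sorted(source.keys() - target.keys()),
        "unexpected_in_target": sorted(target.keys() - source.keys()),
        "mismatched": sorted(k for k in shared_keys if source[k] != target[k]),
    }


# Hypothetical source extract and its copy after the load.
source_rows = [
    {"account_id": 1, "balance": "100.00"},
    {"account_id": 2, "balance": "250.50"},
    {"account_id": 3, "balance": "75.25"},
]
target_rows = [
    {"account_id": 1, "balance": "100.00"},
    {"account_id": 2, "balance": "250.05"},  # value altered during the load
]

result = verify_load(source_rows, target_rows, key="account_id")
print(result)
```

Here the check flags account 3 as missing from the target and account 2 as mismatched. On a cluster the same comparison would normally be expressed as a distributed job rather than in-memory dictionaries.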
Solution Approach
- Determining objectives and use cases
- Determining the right solution architecture blended with the most appropriate technology
- Carrying out strategic execution steps – implement, deploy, verify and test
Key Design to Production Challenges
From a business operations point of view, some of the challenges encountered were –
- Discovering and assessing the right use cases
- Deriving the best value from data
- Devising an optimum project management framework
- Finalizing a right-fit technology strategy
- Getting the right talent for the various responsibilities
- Designing and enforcing an industry-standard process
It also brought to the fore some critical technical issues, such as:
- Getting the Data Structuring right
- Functional Gaps
- Floating Point Computation
- Source of Truth
- Key Management
- Verification, Integrity & Quality
- Getting the right data architecture
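The article does not say what the floating point issue was in this project. As one general illustration of why floating point computation tends to matter when financial data moves between platforms, the sketch below sums the same two amounts as binary floats and as exact decimals. The sample amounts are invented.

```python
from decimal import Decimal

# Hypothetical monetary amounts, held as strings so no precision is lost on input.
amounts = ["0.10", "0.20"]

# Binary floats cannot represent 0.10 or 0.20 exactly, so the sum drifts.
float_total = sum(float(a) for a in amounts)

# Decimal keeps exact base-10 values, which is usually what ledgers expect.
decimal_total = sum(Decimal(a) for a in amounts)

print(float_total == 0.3)                # False
print(decimal_total == Decimal("0.30"))  # True
```

If two systems compute the same aggregate with different numeric types, reconciliation checks can fail even though no data was lost, which is one reason the numeric representation is worth fixing early in a migration.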
Advantages offered by the transition
Integrating Hadoop with the company's EDW provided the following benefits to the client:
- Performance – Enhanced SLA for multiple workloads
- Capacity – Improved capacity to accommodate a much higher volume of use cases
- Advantage Open Source – The solution also integrated multiple open source technologies to further boost the efficiency of the entire system
- Amazing TCO – The elimination of licensing costs and the reduction of regular maintenance costs helped the company reap rich dividends in the form of a much lower TCO.
Interested in using the power of Hadoop EDW Integration? Find out more about our proficiencies in this domain.