About the Client
A leading legal costs insurance provider wanted to make more rational decisions by becoming more data-driven. To support this strategy, the company set out to build an enterprise platform for data analytics and self-service business intelligence.
Problem Statement
The existing on-premises solution, built on a relational database, no longer supported this use case for several reasons:
- The existing platform did not support GDPR requirements
- The solution had scalability issues
- Semi-structured data, such as emails and call notes, could not be processed efficiently
- The solution was limited to on-premises deployment and did not integrate well with cloud services
The Challenge
- Evaluate several technical setups across different cloud providers
- Calculate and compare the total cost of ownership (TCO)
- Identify the right technology stack for the new data warehouse
- Confirm that the preferred Data Vault 2.0 approach is applicable
That’s why they were looking for a vendor-independent consulting partner with deep knowledge of the selected data warehouse approach and several technology stacks.
The Solution
- Jointly developed a GDPR-compliant data lake on Amazon S3, with HDFS as an alternative
- Built solutions on three different database technologies, including Snowflake DB
- Implemented all use cases of the initial proof of concept document using the Data Vault 2.0 approach
- Exceeded the performance requirements on all evaluated technology platforms
- Calculated and compared the TCO of each solution
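To illustrate the Data Vault 2.0 approach mentioned above, here is a minimal sketch of how a hub row is commonly keyed: the business key is normalized and hashed so that the same key always maps to the same hub entry, regardless of the source system or platform. The function names, the MD5 choice, the delimiter, and the sample values (`C-1001`, `crm`) are assumptions for illustration; the source does not describe the hashing rules actually used in the project.

```python
import hashlib
from datetime import datetime, timezone


def hash_key(*business_keys: str, delimiter: str = ";") -> str:
    """Derive a deterministic hash key from one or more business keys.

    Assumed convention: trim whitespace, upper-case, join composite
    keys with a delimiter, then hash with MD5.
    """
    normalized = delimiter.join(key.strip().upper() for key in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest().upper()


def hub_row(record_source: str, *business_keys: str) -> dict:
    """Build a hub row: hash key, load timestamp, record source, business key(s)."""
    return {
        "hash_key": hash_key(*business_keys),
        "load_date": datetime.now(timezone.utc),
        "record_source": record_source,
        "business_keys": tuple(key.strip() for key in business_keys),
    }


# Hypothetical customer number arriving from a hypothetical CRM source.
row = hub_row("crm", " c-1001 ")
```

Because the key is normalized before hashing, `" c-1001 "` and `"C-1001"` resolve to the same hub entry, which is what allows data from different sources to be integrated on the business key.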
Tangible Results for the Client
The project delivered two important assets to the client:
- A confirmation of the technical feasibility of each potential solution
- Decision guidance based on the individual TCOs and other aspects of the solutions
The client used both inputs to select and procure Snowflake on AWS as the final solution. Scalefree assisted the client during the introduction of the technology stack and went on to implement the subsequent enterprise data warehouse on this stack.
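A TCO comparison of this kind can be sketched in a few lines. The cost model below (one-time setup plus recurring infrastructure and operations costs over a planning horizon) is a simplifying assumption, and the option names and figures are placeholders, not numbers from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformCosts:
    """Cost inputs for one candidate platform (all values are placeholders)."""
    name: str
    one_time_setup: float
    annual_infrastructure: float
    annual_operations: float

    def tco(self, years: int) -> float:
        """Total cost of ownership over the given number of years."""
        recurring = self.annual_infrastructure + self.annual_operations
        return self.one_time_setup + years * recurring


def rank_by_tco(options: list[PlatformCosts], years: int) -> list[PlatformCosts]:
    """Return the options ordered from lowest to highest TCO."""
    return sorted(options, key=lambda option: option.tco(years))


# Hypothetical candidates with made-up figures.
options = [
    PlatformCosts("Option A", 10_000, 5_000, 2_000),
    PlatformCosts("Option B", 2_000, 8_000, 3_000),
    PlatformCosts("Option C", 20_000, 3_000, 1_000),
]

for option in rank_by_tco(options, years=3):
    print(f"{option.name}: {option.tco(3):,.0f}")
```

The ranking depends on the planning horizon: an option with high setup costs but low running costs can overtake a cheaper-to-start option as the number of years grows, which is why TCO is only one input alongside the other aspects of each solution.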
Technologies Used
- Snowflake DB including Snowpipe
- AWS, S3, Redshift, Kinesis
- Cloudera Hadoop, Impala, HDFS
- Python
- VaultSpeed