Managing and processing large volumes of daily incoming data for a £1bn-turnover consumer media organisation, making it available in a data warehouse for their analysts to query.
Our client, A&NY Media, is part of dmg::media – a large consumer media organisation with an annual turnover of over £1 billion. They manage the advertising for a number of major web properties, and their analysts needed up-to-date, detailed data in order to manage that advertising effectively.
We used our Kixi cloud-based system to create customised data processing pipelines, which bring together, scrub, and summarise over 10 billion data points from 18 different sources each night.
The system uses Hadoop and other open source technologies to reliably and efficiently process large volumes of data, so that the results are ready for analysts to use each morning. The post-processed data is available for in-house analysts to mine through an Amazon Redshift data warehouse, which they can query with SQL.
This project embodied three key elements of our approach:
The system contains:
Having built the original system, we also operate it day to day, including:
The infrastructure allows our client to join data inputs that weren't previously joined, then roll the results up to make big data small again. As a result, they can see new trends that were previously invisible, while still being able to pivot the information in Excel, where their analysts extract the most value from it.
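The join-then-roll-up pattern can be sketched in miniature as a SQL query. This is an illustrative example only, using Python's built-in sqlite3 as a stand-in for the Redshift warehouse; the table and column names are hypothetical, and the real pipeline operates on billions of rows via Hadoop.

```python
import sqlite3

# In-memory database standing in for the Redshift warehouse (illustration only).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two data inputs that were not previously joined: impressions and clicks.
cur.execute("CREATE TABLE impressions (campaign TEXT, day TEXT, count INTEGER)")
cur.execute("CREATE TABLE clicks (campaign TEXT, day TEXT, count INTEGER)")
cur.executemany("INSERT INTO impressions VALUES (?, ?, ?)", [
    ("spring_sale", "2024-01-01", 1000),
    ("spring_sale", "2024-01-02", 1200),
    ("newsletter", "2024-01-01", 500),
])
cur.executemany("INSERT INTO clicks VALUES (?, ?, ?)", [
    ("spring_sale", "2024-01-01", 50),
    ("spring_sale", "2024-01-02", 90),
    ("newsletter", "2024-01-01", 5),
])

# Join the two inputs, then roll up per campaign — "making big data small",
# producing a summary compact enough to pivot in Excel.
rollup = cur.execute("""
    SELECT i.campaign,
           SUM(i.count) AS impressions,
           SUM(c.count) AS clicks
    FROM impressions i
    JOIN clicks c ON c.campaign = i.campaign AND c.day = i.day
    GROUP BY i.campaign
""").fetchall()

for campaign, imps, clks in rollup:
    print(campaign, imps, clks)
```

The same shape of query — join sources, aggregate by a business dimension — is what analysts run against the warehouse each morning, just over far larger inputs.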
The result of this insight has been a 360% increase in usage of their targeting products for improving campaign performance.
It has also enabled true 'big data' analysis, using MATLAB to analyse granular data points such as fraudulent IP addresses. This has allowed our client to reduce the fraudulent behaviour they see by more than tenfold.
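One simple form this kind of granular analysis can take is flagging IP addresses whose request volume is far above the norm. The sketch below is a hypothetical illustration in Python, not the client's actual MATLAB models, and the IP addresses and threshold are invented for the example.

```python
from collections import Counter
from statistics import median

# Hypothetical request log: one IP generating suspiciously heavy traffic
# alongside ordinary visitors (all addresses are documentation examples).
requests = (
    ["203.0.113.9"] * 500
    + ["198.51.100.23"] * 7
    + ["192.0.2.41"] * 4
)

counts = Counter(requests)

# Flag any IP whose volume exceeds 10x the median request count — a crude
# threshold chosen purely for illustration; real fraud models are far richer.
threshold = 10 * median(counts.values())
suspects = {ip for ip, n in counts.items() if n > threshold}
print(suspects)
```

At warehouse scale the same idea runs as an aggregate query over the full granular data rather than an in-memory counter.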