The Challenge
GoDaddy is the world’s largest domain name registrar, with more than 16 million domain name registrations and 13 million customers. GoDaddy collects marketing-related information for its customers to help them throughout the customer lifecycle.
This marketing data is highly messy because it comes from many sources, such as Google AdWords, Yahoo, and Bing, so getting insights from it was a huge challenge for the GoDaddy team.
Why Amazon Web Services
GoDaddy stores information on Amazon Simple Storage Service (Amazon S3) and processes data in parallel with Amazon Elastic MapReduce (Amazon EMR). EMR decouples compute and storage, so the team can scale each independently and take advantage of the tiered storage of Amazon S3. With EMR, the team can provision one, hundreds, or thousands of compute instances or containers to process data at any scale. The number of instances can be increased or decreased automatically using Auto Scaling, which manages cluster size based on utilization, and GoDaddy pays only for what it uses.
Running Critical Applications on AWS
Salesforce data is extracted through the API into Amazon Simple Storage Service (Amazon S3) in CSV format. Amazon EMR runs a PySpark framework that automates the entire process of transforming the data, cleaning it up, and loading it into the S3 ADS layer. An Amazon Redshift table is created on top of this S3 ADS layer. Tableau connects to this Redshift table via the Redshift connector, which allows analysts to analyze the data and generate useful business insights.
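The cleanup stage described above can be illustrated with a minimal sketch. It uses plain Python with the standard csv module as a stand-in for the PySpark job, so it shows the kind of work involved rather than GoDaddy's actual code. The column names, header aliases, and the functions clean_rows and to_csv are all hypothetical, since the source does not describe the schema.

```python
import csv
import io

# Hypothetical canonical schema for the cleaned layer (not given in the source).
CANONICAL_COLUMNS = ["source", "campaign", "clicks", "cost"]

# Hypothetical mapping from inconsistent source headers to canonical names.
HEADER_ALIASES = {
    "campaign": "campaign",
    "campaign name": "campaign",
    "clicks": "clicks",
    "click count": "clicks",
    "cost": "cost",
    "spend": "cost",
}


def clean_rows(raw_csv, source):
    """Normalize headers, trim values, drop invalid rows, and deduplicate."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    seen = set()
    cleaned = []
    for row in reader:
        record = {"source": source}
        for header, value in row.items():
            if header is None:
                continue  # surplus cells beyond the header row
            key = HEADER_ALIASES.get(header.strip().lower())
            if key:
                record[key] = (value or "").strip()
        # Rows without a campaign name cannot be reported on, so skip them.
        if not record.get("campaign"):
            continue
        # Skip rows whose numeric fields do not parse.
        try:
            record["clicks"] = int(record.get("clicks") or 0)
            record["cost"] = float(record.get("cost") or 0.0)
        except ValueError:
            continue
        fingerprint = tuple(record[c] for c in CANONICAL_COLUMNS)
        if fingerprint in seen:
            continue  # exact duplicate of an earlier row
        seen.add(fingerprint)
        cleaned.append(record)
    return cleaned


def to_csv(records):
    """Serialize cleaned records as CSV text in the canonical column order."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=CANONICAL_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()
```

In a setup like the one described, the equivalent logic would run on EMR against files in S3, and the cleaned output would be written back to S3 for Redshift to read. Here everything stays in memory to keep the sketch self-contained.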
The Benefits
The primary benefit to GoDaddy of moving to AWS is that enough resources are available to serve customers of all sizes and to onboard those customers in days. To generate deeper insights with a more effective reporting process, GoDaddy turned to Virtue Tech to help deliver greater value for its client base by shifting time away from manual data wrangling and toward better analysis and insights.