As a Principal Data Engineer, you should be an expert in data warehousing technical components (e.g. data modeling, ETL, and reporting), infrastructure (e.g. hardware and software), and their integration. You should have a deep understanding of the architecture of enterprise-level data warehouse solutions across multiple platforms (RDBMS, distributed, columnar, cloud). You should be an expert in the design, creation, management, and business use of extremely large datasets. You should have excellent business and communication skills so that you can work with business owners to define key business questions and build datasets that answer them. You are expected to build efficient, flexible, extensible, and scalable ETL and reporting solutions. You should be enthusiastic about learning new technologies and able to implement solutions with them to provide new functionality to users, scale the existing platform, and help drive engineering and operational excellence. Excellent written and verbal communication skills are required, as you will work very closely with diverse teams. Strong analytical skills are a plus. Above all, you should be passionate about working with huge datasets and love bringing datasets together to answer business questions and drive change.
Our ideal candidate thrives in a fast-paced environment, enjoys the challenge of highly complex business contexts (that are typically being defined in real time), and, above all, is passionate about data and analytics. In this role you will join a team of engineers building a platform that drives deep insights into GoDaddy’s business to help our customers succeed in theirs.
- Interface with our business analytics and data science teams, gathering requirements and delivering complete BI solutions.
- Mentor junior engineers.
- Model data and metadata to support discovery, ad-hoc and pre-built reporting.
- Design and implement data pipelines using Hadoop, Spark, and AWS services such as S3, RDS, Kinesis, Glue, Redshift, and EMR.
- Partner with security, privacy, and legal teams to deliver solutions that comply with GoDaddy security and privacy policies.
- Own the design, development, and maintenance of datasets our BA teams will use to drive key business decisions.
- Develop and promote best practices in data engineering, including scalability, reusability, maintainability, and usability.
- Tune and ensure compute performance by optimizing queries, databases, files, tables, and processes.
- Ensure data & report SLAs are met.
- Analyze and solve problems at their root, stepping back to understand the broader context.
- Own continuous engineering operational excellence of the datasets that drive key business decisions.
- Learn and understand a broad range of GoDaddy’s data resources, and know when and how to use each one and when not to.
- Keep up to date with advances in big data technologies and run pilots to design a data architecture that scales with increasing data volume on AWS.
- Continually improve ongoing reporting and analysis processes, automating or simplifying self-service support for datasets.
- Triage many possible courses of action in a high-ambiguity environment, making use of both quantitative analysis and business judgment.
- Bachelor’s degree in CS or related technical field.
- 10+ years of experience in data architecture and business intelligence.
- 3+ years of experience developing solutions in distributed technologies such as Hadoop, Hive, and Spark.
- Experience delivering end-to-end solutions using AWS services: S3, RDS, Kinesis, Glue, Redshift, and EMR.
- Experience programming in Python, Java, or Scala.
- Expert in data modeling, metadata management, and data quality.
- SQL performance tuning.
- Strong organizational and multitasking skills with ability to balance competing priorities.
- Excellent verbal and written communication and interpersonal skills, with the ability to communicate effectively with both business and technical teams.
- Ability to work in a fast-paced, ambiguous environment where continuous innovation is occurring.