Responsibilities:
- Work with data and analytics experts to improve the functionality of our data systems; install Apache Airflow from scratch, then configure, maintain, and administer it.
- Work independently, keep up to date with enhancements from the Airflow open-source community, and coordinate planning with other team members so that nothing is disrupted by the installation or configuration of Airflow.
- Create and maintain optimal data pipeline architecture (a minimal DAG sketch follows this list).
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Work with stakeholders, including the Product, Data, and Analytics teams, to assist with data-related technical issues and support their data infrastructure needs.
- Implement CI/CD pipelines.
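
For context on the pipeline work above, the sketch below shows a minimal daily ETL DAG using the Apache Airflow 2.x TaskFlow API. The DAG name, schedule, and inline data are illustrative placeholders, not a description of our production systems.

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_daily_etl():
    @task
    def extract() -> list[dict]:
        # Placeholder: a real pipeline would pull from S3, RDS, or an API.
        return [{"id": 1, "value": 10}, {"id": 2, "value": 20}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Placeholder business logic: double each value.
        return [{**r, "value": r["value"] * 2} for r in rows]

    @task
    def load(rows: list[dict]) -> None:
        # Placeholder: production code would COPY into Redshift or write to S3.
        print(f"Loading {len(rows)} rows")

    load(transform(extract()))


example_daily_etl()
```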
Requirements:
- Experience building data pipelines using EC2, EMR, RDS, Redshift, Spark, and Python (see the PySpark sketch after this list)
- Understands data, with the ability to discuss requirements with analysts and translate them into material that data engineers can easily understand
- Understands data modeling, with the ability to develop data models for both traditional relational and distributed technologies
- Experience with SQL and AWS
- Experience with engineering processes (SDLC, the data engineering lifecycle, DataOps, DevOps)
- Effective at driving meetings and discussions
- Strong project planning and execution skills
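
As a rough illustration of the pipeline experience listed above, the PySpark sketch below aggregates daily revenue from raw order events, as might run on EMR. The S3 paths, column names, and job name are hypothetical placeholders.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily_revenue").getOrCreate()

# Extract: read raw order events from S3 (placeholder path).
orders = spark.read.parquet("s3://example-bucket/raw/orders/")

# Transform: aggregate revenue per customer per day.
daily_revenue = (
    orders
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("customer_id", "order_date")
    .agg(F.sum("amount").alias("revenue"))
)

# Load: write partitioned output back to S3; in production a Redshift
# COPY or JDBC write would typically follow.
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/marts/daily_revenue/"
)
```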