We are currently looking for a DataOps/Python Engineer to join our partner's growing team in Budapest. In this position you will play a key role in the Enterprise Data Platform (EDP) Operations organization, providing business continuity for critical business processes, IT systems, and IT solutions through project implementations, enhancements, documentation, and operational support.
You will be responsible for supporting validated and non-validated production systems, resolving access, incident, and change requests in a timely manner and enhancing the user experience.
What will you do?
- Own and deliver enhancements to data platform solutions.
- Maintain and enhance scalable data pipelines and build out new API integrations to support continuing growth in data volume and complexity.
- Enhance and support solutions using PySpark/EMR, SQL and databases, AWS Athena, S3, Redshift, AWS API Gateway, Lambda, Glue, and other data engineering technologies.
- Write complex queries and edit them as required to implement ETL and data solutions.
- Implement solutions using AWS and supporting tooling, including GitHub, Jenkins, Terraform, Jira, and Confluence.
- Deliver solutions and product features using agile development methodologies and DevOps, DataOps, and DevSecOps practices.
- Propose and continuously implement data load optimizations to improve the performance of data loads.
- Identify, design, and implement internal process improvements (automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.)
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep data separated and secure across multiple data centers and AWS regions.
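The responsibilities above center on building and maintaining ETL pipelines. As a rough illustration only (the column names and functions here are hypothetical, and in this role the pipelines would run on PySpark/EMR and AWS services rather than plain Python), an extract-transform-load step might look like:

```python
import csv
import io
import json


def extract(raw_csv: str) -> list:
    """Parse raw CSV text into row dicts (stands in for reading from S3)."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows: list) -> list:
    """Drop incomplete rows and cast the amount column to a number."""
    cleaned = []
    for row in rows:
        if not row.get("customer_id") or not row.get("amount"):
            continue  # skip rows missing required fields
        cleaned.append({"customer_id": row["customer_id"],
                        "amount": float(row["amount"])})
    return cleaned


def load(rows: list) -> str:
    """Serialize to JSON Lines (stands in for writing to the data lake)."""
    return "\n".join(json.dumps(r) for r in rows)


raw = "customer_id,amount\nC1,10.5\n,3.0\nC2,7\n"
print(load(transform(extract(raw))))
```

The same extract/transform/load structure carries over to PySpark, where the transform step becomes DataFrame operations distributed across an EMR cluster.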
What will you need?
- Bachelor’s degree in Computer Science
- At least 5 years of data engineering experience using AWS services and PySpark/EMR.
- 4+ years of experience in data lake, data analytics, and business intelligence solutions, including at least 2 years as an AWS data engineer.
- Full life cycle project implementation experience in AWS using PySpark/EMR, Athena, S3, Redshift, AWS API Gateway, Lambda, Glue, and other managed services.
- Strong experience building ETL data pipelines using PySpark on EMR.
- Hands-on experience with S3, AWS Glue jobs, S3 copy operations, Lambda, and API Gateway.
- Working SQL experience (ability to troubleshoot SQL code).
- Strong DevOps and CI/CD experience with Git and Jenkins, plus experience with infrastructure-as-code templates such as CloudFormation and ARM.
- Experience with Python and Python ML libraries for data analysis, wrangling, and insight generation.
- Experience using Jira for task prioritization and Confluence and other tools for documentation.
- Experience with source control systems such as Git and Bitbucket, and with Jenkins for builds and continuous integration.
- Strong understanding of AWS data lakes and Databricks.
- Exposure to Kafka, Redshift, and SageMaker is an added advantage.
- Exposure to data visualization tools such as Power BI and Tableau.
Knowledge & abilities:
- Experience with agile development methodologies, following DevOps, DataOps, and DevSecOps practices.
- Excellent written, verbal, interpersonal, and stakeholder communication skills.
- Ability to work with cross-functional teams across multiple regions and time zones, effectively leveraging multiple forms of communication (email, MS Teams voice and chat, meetings).
- Excellent prioritization and problem-solving skills.
- Functional knowledge in the areas of Sales & Distribution, Material Management, Finance, and Production Planning.
- Certifications such as AWS Certified Data Analytics, CCA Spark and Hadoop Developer, or CCP Data Engineer.
- Redshift knowledge.
What can we offer?
- Long-term career path
- Challenging projects
- Multinational environment
- Friendly work atmosphere
- Career growth
Place of work: Budapest