Job Description
Responsibilities:
- Design, develop, and maintain scalable and efficient data pipelines to process large volumes of data.
- Implement ETL processes to acquire, validate, and process incoming data from diverse sources.
- Collaborate with cross-functional teams, including data scientists, analysts, and software engineers, to understand data requirements and translate them into technical solutions.
- Implement data ingestion, transformation, and integration processes to ensure data quality, accuracy, and consistency.
- Optimize Spark jobs and data processing workflows for performance, scalability, and reliability.
- Troubleshoot and resolve issues related to data pipelines, data processing, and performance bottlenecks.
- Conduct code reviews and provide constructive feedback to junior team members to ensure code quality and best practices adherence.
- Stay updated with the latest advancements in Spark and related technologies and evaluate their potential for enhancing existing data engineering processes.
- Develop and maintain documentation, including technical specifications, data models, and system architecture diagrams.
- Stay abreast of emerging trends and technologies in the data engineering and big data space and propose innovative solutions to enhance data processing capabilities.
What We’re Looking For:
- 5+ years of experience in data engineering or a related field.
- Strong experience in Python programming with expertise in building data-intensive applications.
- Proven hands-on experience with distributed data processing frameworks such as Spark.
- Solid understanding of distributed computing concepts, parallel processing, and cluster computing frameworks.
- Proficiency in data modeling, data warehousing, and ETL techniques.
- Experience with workflow management platforms, preferably Airflow.
- Familiarity with big data technologies such as Hadoop, Hive, or HBase.
- Strong knowledge of SQL and experience with relational databases.
- Hands-on experience with the AWS cloud data platform.
- Strong problem-solving and troubleshooting skills, with the ability to analyze complex data engineering issues and provide effective solutions.
- Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.
- Nice to have: experience with Databricks.
Basic Qualifications: Bachelor’s degree in Information Technology, Computer Information Systems, Computer Engineering, Computer Science, or another technical discipline.