Job description

We are seeking an experienced Data Engineer (PySpark) to design, build, optimize, and maintain scalable data pipelines for production environments. The role requires strong hands-on experience in big data processing, pipeline optimization, and deployment using modern data engineering tools and frameworks.


Key Responsibilities
  • Design, develop, and maintain robust, scalable data pipelines using Python and PySpark
  • Perform data ingestion, transformation, cleansing, and validation across structured and unstructured datasets
  • Conduct Exploratory Data Analysis (EDA) to identify data patterns, anomalies, and quality issues
  • Apply data imputation techniques, data linking, and cleansing to ensure high data quality
  • Implement feature engineering pipelines to support analytics and downstream use cases
  • Optimize Spark jobs for performance, scalability, and cost efficiency
  • Deploy and tune production-grade data pipelines, ensuring reliability and performance
  • Automate workflows using Apache Airflow and/or Jenkins
  • Collaborate with cross-functional teams to integrate data solutions into production systems
  • Write and maintain unit tests to ensure code quality and reliability
  • Manage source code, CI/CD, and deployments using Git, GitHub, and GitHub Actions

Requirements

To be considered for this role, you need to meet the following criteria:

Required Technical Skills
  • Strong proficiency in Python
  • Extensive hands-on experience with Apache Spark (PySpark)
  • Experience working with Jupyter Notebooks
  • Strong knowledge of SQL and NoSQL databases
  • Proven experience with Git for version control and CI/CD
  • Hands-on experience with Apache Airflow and/or Jenkins for scheduling and automation
  • Solid understanding of data engineering best practices in production environments
  • Demonstrated experience in Spark performance tuning and optimization
  • Ability to write clean, testable, and maintainable Python code

Mandatory Requirement
  • Previous production experience is a MUST, specifically in deploying, tuning, and maintaining data pipelines in production environments

Preferred Qualifications
  • Experience working in high-volume or big data environments
  • Strong problem-solving and analytical skills
  • Ability to work independently in a fast-paced environment

Why Join?
  • Competitive salary package
  • Opportunity to work on production-scale data platforms
  • Exposure to modern data engineering tools and practices
  • Dubai-based role with a dynamic and collaborative work environment

To view our other requirements, please visit our website: www.blackpearlconsult.com

This job post has been translated by AI and may contain minor differences or errors.
