Lead Data Architect – Cloud Lakehouse (Azure | Databricks | Spark) Job: Abu Dhabi area, UAE/Dubai, IT/Tech

Position: Lead Data Architect – Cloud Lakehouse (Azure | Databricks | Spark)

Urgent requirement: a Lead Data Architect – Cloud Lakehouse (Azure | Databricks | Spark) is required for our client in Abu Dhabi, UAE.

Must-Have Experience

Strong experience in end-to-end architecture of a production data platform (lakehouse / warehouse / analytics)
Strong experience in advanced PySpark optimization (joins, shuffles, skew handling, caching, AQE)
Strong experience with Databricks on Azure
Strong experience implementing lineage, metadata, and observability
Strong experience building CI/CD pipelines for data using Jenkins or GitLab CI/CD

Core Responsibilities

Own the end-to-end data architecture for cloud-native analytical platforms, from ingestion to consumption, with zero tolerance for brittle or over-engineered designs
Design and evolve enterprise-grade data lake and warehouse architectures on Azure that scale to billions of records and multiple consumption patterns (BI, ML, analytics)
Make irreversible architectural decisions around storage formats, partitioning strategies, schema evolution, and data modeling — and stand behind them
Define and enforce non-negotiable architectural standards for performance, cost efficiency, reliability, and security

Advanced Data Engineering Leadership

Architect and optimize high-throughput, low-latency data pipelines using Databricks, PySpark, and Azure-native services
Set the technical bar for ETL/ELT frameworks, orchestration, dependency management, and failure recovery patterns
Personally review and challenge pipeline designs, Spark jobs, and SQL logic — no rubber‑stamp approvals
Lead the transition from ad-hoc pipelines to fully productionized, observable, and automated data workflows

Data Quality, Governance & Observability

Design and implement enterprise-grade data quality frameworks (validation, anomaly detection, reconciliation)
Establish data lineage, metadata management, and monitoring as first-class architectural components
Ensure datasets are audit-ready, reproducible, and trustworthy for executive, regulatory, and ML use cases

CI/CD & Engineering Excellence

Architect CI/CD pipelines for data using Git-based workflows and tools such as Jenkins or GitLab CI/CD
Enforce automated testing strategies for data (unit, integration, data quality checks)
Eliminate manual deployments and fragile handoffs across environments

Cross-Functional & Strategic Influence

Translate ambiguous business requirements into clear, scalable data architectures
Partner deeply with ML engineers, analysts, product leaders, and executives to design data assets that directly enable business outcomes
Act as the final technical authority in data architecture discussions, tradeoffs, and escalations

Team Enablement (Without Micromanagement)

Mentor senior data engineers and technical leads, pushing them toward architectural thinking and ownership
Set expectations for engineering rigor, documentation, and decision-making clarity
Raise the technical maturity of the organization, not just deliver projects

Hard Requirements (Non-Negotiable)

7+ years in Data Engineering / Data Architecture, with proven ownership of large-scale production data platforms
3+ years making architectural decisions, not just implementing someone else's design
Deep, hands-on expertise with Databricks + PySpark in real-world, high-volume environments
Strong command of Microsoft Azure data services and cloud-native architecture patterns
Expert-level Python and strong Spark optimization skills (partitioning, joins, shuffles, skew handling, caching, AQE tuning)
Proven experience designing fault-tolerant, highly available, cost-efficient data systems
Strong Git-based development practices and experience enforcing engineering standards
Demonstrated success implementing CI/CD for data pipelines
Ability to explain complex architectural tradeoffs clearly to both engineers and senior stakeholders

Skills:

architecture, cloud lakehouse, data, spark
