
Data Architect AIG

Job in Gurugram, Haryana, India
Listing for: Confidential
Full Time position
Listed on 2026-02-03
Job specializations:
  • IT/Tech: Data Engineer, Big Data
Job Description
Data Architect

About the Role

We are seeking a Data Architect with deep Big Data Engineering expertise to design and modernize large-scale, cloud-native data platforms. This role emphasizes distributed data processing, real-time pipelines, data platform automation, and GenAI enablement on top of strong Big Data foundations.

Key Responsibilities

Architect and govern enterprise Big Data platforms (data lake, lakehouse, warehouse, real-time).
Design high-volume, high-velocity data pipelines using batch and streaming frameworks.
Lead implementation of distributed processing architectures (Spark, PySpark, EMR).
Build event-driven and real-time streaming solutions (Kafka, Kinesis, Flink); a minimal streaming sketch follows this list.
Define ETL/ELT patterns, metadata-driven pipelines, and reusable ingestion frameworks; see the config-driven sketch after this list.
Drive data platform automation (Airflow/Step Functions, CI/CD, data quality, observability).
Optimize performance, scalability, fault tolerance, and cost across Big Data workloads.
Integrate GenAI architectures (LLMs, embeddings, vector databases, RAG) with enterprise data lakes.
Ensure security, governance, lineage, and compliance across data platforms.
Provide hands-on leadership and technical mentoring to data engineering teams.
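
To give a concrete flavor of the streaming work described above, here is a minimal PySpark Structured Streaming sketch that reads JSON events from a Kafka topic and appends them to a partitioned data-lake table. The broker address, topic name, event schema, and S3 paths are illustrative assumptions, not details from the posting, and the spark-sql-kafka connector is assumed to be on the classpath.

```python
# Minimal sketch: Kafka -> parsed events -> partitioned Parquet on S3.
# All names and paths below are invented for illustration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("orders-stream").getOrCreate()

# Hypothetical schema for events on an "orders" topic.
event_schema = StructType([
    StructField("order_id", StringType()),
    StructField("status", StringType()),
    StructField("event_time", TimestampType()),
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker
    .option("subscribe", "orders")                     # assumed topic
    .load()
)

# Kafka delivers raw bytes; parse the JSON payload into typed columns
# and derive a date column to partition the lake table by.
events = (
    raw.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
    .withColumn("event_date", F.to_date("event_time"))
)

query = (
    events.writeStream.format("parquet")
    .option("path", "s3://example-lake/orders/")                    # assumed path
    .option("checkpointLocation", "s3://example-lake/_chk/orders/") # fault tolerance
    .partitionBy("event_date")
    .trigger(processingTime="1 minute")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```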
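And a minimal sketch of the metadata-driven ingestion pattern the responsibilities mention: each source is described by a small config record, and one generic loader handles all of them. Source names, formats, and paths are invented for illustration; in practice the metadata would live in a control table or a YAML/JSON file.

```python
# Minimal sketch: metadata-driven ingestion with one generic loader.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("metadata-ingest").getOrCreate()

# Invented source registry; real metadata would live in a control table.
SOURCES = [
    {"name": "customers", "format": "csv",
     "path": "s3://example-raw/customers/", "target": "s3://example-lake/customers/"},
    {"name": "payments", "format": "json",
     "path": "s3://example-raw/payments/", "target": "s3://example-lake/payments/"},
]

def ingest(source: dict) -> None:
    """Generic loader: read one source as described by its metadata record,
    stamp it with an ingest date, and append it to the lake partition."""
    df = (
        spark.read.format(source["format"])
        .option("header", "true")  # used by CSV, ignored by JSON
        .load(source["path"])
        .withColumn("ingest_date", F.current_date())
    )
    df.write.mode("append").partitionBy("ingest_date").parquet(source["target"])

for src in SOURCES:
    ingest(src)
```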

Required Technical Skills & Experience

12+ years in Big Data Engineering / Data Architecture roles.
Expert-level experience with Spark, PySpark, SQL, and distributed compute engines.
Strong knowledge of the AWS Big Data stack: S3, EMR, Glue, Athena, Redshift, Lambda, Step Functions.
Hands-on experience with Snowflake (performance tuning, data sharing, optimization).
Expertise in streaming platforms: Kafka, Kinesis, Flink, or Spark Streaming.
Strong experience with data modeling (dimensional, Data Vault 2.0).
Proficiency in Python, schema evolution, partitioning, and data versioning (see the schema-evolution sketch after this list).
Experience with orchestration and automation tools (Airflow, Dagster, CI/CD); an Airflow sketch follows this list.
Working knowledge of GenAI data integration (feature stores, vector DBs, RAG pipelines); a retrieval sketch follows this list.
Experience with Agile delivery and leading globally distributed engineering teams.
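
A short sketch of the schema-evolution handling referenced above, for a partitioned Parquet dataset whose newer partitions carry extra columns; the path is an assumption. Spark's mergeSchema option reconciles the column sets across partitions.

```python
# Minimal sketch (the path is invented): rows from older partitions get
# NULL for columns that were introduced in later partitions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("schema-evolution").getOrCreate()

df = spark.read.option("mergeSchema", "true").parquet("s3://example-lake/orders/")
df.printSchema()  # union of all per-partition schemas
```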
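A minimal Airflow (2.x TaskFlow API) sketch of the kind of orchestration the posting asks for: a daily ingest → validate → publish pipeline. The DAG id, schedule, and task bodies are placeholders, not details from the posting.

```python
# Minimal sketch: a three-step daily pipeline expressed with the TaskFlow API.
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_platform_pipeline():
    @task
    def ingest() -> str:
        # e.g. trigger an EMR or Glue job and return its run id
        return "run-0001"

    @task
    def validate(run_id: str) -> str:
        # e.g. run data-quality checks against the newly landed partition
        return run_id

    @task
    def publish(run_id: str) -> None:
        # e.g. update the catalog to expose the validated data
        print(f"published {run_id}")

    # Dependencies are inferred from the data flow: ingest -> validate -> publish.
    publish(validate(ingest()))

daily_platform_pipeline()
```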
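Finally, a minimal sketch of the retrieval step in a RAG pipeline: embed a query, rank stored document chunks by cosine similarity, and return the top hits as grounding context for an LLM. Here embed() is a hypothetical stand-in for a real embedding model, and the chunk texts are invented; a production pipeline would store the vectors in a vector database rather than an in-memory array.

```python
# Minimal sketch of vector retrieval; everything here is a stand-in.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Deterministic fake embedding (unit vector), for demonstration only."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(8)
    return v / np.linalg.norm(v)

# Pretend these chunks were extracted from documents in the data lake.
chunks = ["refund policy ...", "shipping SLA ...", "pricing tiers ..."]
index = np.stack([embed(c) for c in chunks])  # shape: (n_chunks, dim)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank chunks by cosine similarity (dot product of unit vectors)."""
    scores = index @ embed(query)
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

# The returned chunks would be passed to an LLM as grounding context.
print(retrieve("what is the refund policy?"))
```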
