The Role
The AI Architect is responsible for defining, designing, and governing end‑to‑end artificial intelligence system architectures that align with business objectives, data strategies, and enterprise technology standards. This role provides technical leadership across AI solution life cycles, from ideation to production, ensuring scalability, security, interoperability, and regulatory compliance.
Competency Focus: AI infrastructure and architecture design, cloud-native architecture, model governance, large‑scale distributed systems
Keywords: HPC Architect, HPC Architecture and System Design
Responsibilities:
Architect, deploy, and operate large-scale accelerator clusters, including NVIDIA DGX platforms, discrete NVIDIA and AMD GPUs, and TPU-based systems, ensuring high availability, scalability, and performance.
Design high‑bandwidth, low‑latency interconnect architectures, leveraging technologies such as InfiniBand, NVLink, and RoCE to support distributed AI training and inference workloads.
Architect and design end‑to‑end AI training and inference platforms across on‑premises and public cloud environments (Azure, AWS, GCP), incorporating elastic GPU resource orchestration and automated scaling mechanisms.
Architect and engineer high‑performance, large‑scale data delivery and storage solutions, including petabyte‑scale object storage and distributed file systems (e.g., VAST Data, WekaIO, DDN) optimized for AI and high‑throughput workloads.
Design and architect streaming and batch data ingestion pipelines optimized for AI/ML workflows, enabling efficient data preprocessing, feature ingestion, and model training.
Architect and enforce secure GPU and compute isolation mechanisms, utilizing Kubernetes primitives such as RBAC, namespace isolation, and network policies to ensure multi‑tenant security, governance, and compliance.
Evaluate, benchmark, and qualify emerging AI hardware platforms and software frameworks, conducting performance, scalability, and cost‑efficiency assessments to inform technology adoption decisions.
Mentor engineers in AI infrastructure best practices, observability, and capacity management.
Define the reference architecture for enterprise-wide AI adoption.
Understanding of Sovereign AI.
Qualifications & Experience
B.Tech/B.E. in Computer Science, Artificial Intelligence, Data Science, or a related discipline; M.Tech/MS preferred
12+ years in infrastructure/cloud engineering, with 4+ years focused purely on AI/ML systems.
Deep expertise in GPU cluster management, distributed compute, and container orchestration.
Hands-on experience with Kubernetes for AI workloads, GPU scheduling, and Ray/Kubeflow pipelines.
Basic understanding of LLM training, fine-tuning, quantization, and model optimization.
Certifications
Required:
NVIDIA Certified Associate – AI Infrastructure
NVIDIA Professional Certification for AI Networking and AI Infrastructure
Certified Kubernetes Administrator
Cloud Certification (AWS, Azure, GCP)