Technical Architect

Job in 201301, Noida, Uttar Pradesh, India
Listing for: Impetus
Full Time position
Listed on 2026-02-17
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning/ML Engineer
Job Description
Responsibilities

Core AI/ML Fundamentals

Solid understanding of AI/ML concepts including:

Classification, regression, neural networks
OCR and transcription systems
Audio/Video processing and multimodal learning

OCR, Transcription & Audio/Video Intelligence

Implement specialized models for:

High‑accuracy document OCR
Real‑time audio transcription

Architect deep learning pipelines for audio/video analysis and generation.
Integrate multimodal models (e.g., LLaVA, Whisper) into broader GenAI systems.

Generative AI & LLM Expertise

Strong understanding of:

Generative AI techniques
Transformer architectures
RAG (Retrieval-Augmented Generation) pipelines
Modern LLM ecosystems

Hands-on experience with:

LLM parameter handling and model selection
Scaling strategies and performance optimization

Expertise in:

Prompt engineering and instruction tuning
Prompt tuning and optimization for high-quality outputs

Familiarity with evaluation frameworks covering:

Quality, grounding, accuracy, safety
Latency and cost analysis
Governance and compliance requirements

Agentic AI Systems

Experience designing and building agentic AI systems including:

Multi‑agent orchestration
Tool‑use workflows
Autonomous task execution

Design long‑term memory architectures:

Vector-based memory systems
Graph-based memory for complex, persistent context

Model Training, Fine‑Tuning & Optimization

Knowledge of fine‑tuning approaches:

LoRA, QLoRA, supervised fine-tuning

Experience with model compression techniques:

Quantization, distillation

Familiarity with low-level performance tooling:

CUDA, Triton, or specialized custom kernels

Design AI systems capable of efficiently handling and routing multiple user requests simultaneously, including:

Scalable request handling
Load‑balanced inference
Multi‑tenant model utilization
Caching and prioritization strategies

Infrastructure, Pipelines & Deployment

Oversee:

Data pipeline integration
Training workflows
ML CI/CD processes

Strong understanding of:

GPU/compute requirements
Cost‑efficient deployment strategies

Experience designing and managing production-grade inference servers using:

vLLM
Text Generation Inference (TGI)
SGLang

Ability to collaborate with engineering teams to integrate LLMs into production systems:

APIs, microservices, cloud architectures

Research, Evaluation & Continuous Innovation

Stay current with advancements in AI, ML, and LLM ecosystems.
Evaluate new tools, frameworks, and platform technologies to continuously enhance system architecture.