Technical Architect - AI, GCP

Job in Santa Clara, Santa Clara County, California, 95053, USA
Listing for: Winwire Technologies
Full Time position
Listed on 2026-01-12
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning/ ML Engineer
Salary/Wage Range or Industry Benchmark: 200000 - 250000 USD Yearly
Job Description & How to Apply Below
  • As a Technical Architect specializing in LLMs and Agentic AI, you will own the architecture, strategy, and delivery of Enterprise‑grade AI solutions. Work with cross‑functional teams and customers to define the AI roadmap, design scalable solutions, and ensure responsible deployment of Generative AI across the organization.
  • Architect Scalable GenAI Solutions – Lead the design of enterprise architectures for LLM and multi‑agent systems, ensuring scalability, resilience, and security across Azure and GCP platforms
  • Technology Strategy & Guidance – Provide strategic technical leadership to customers and internal teams, aligning GenAI projects with business outcomes
  • LLM & RAG Applications – Architect and guide development of LLM‑powered applications, assistants, and RAG pipelines for structured and unstructured data
  • Agentic AI Frameworks – Define and implement agentic AI architectures leveraging frameworks like LangGraph, AutoGen, DSPy, and cloud‑native orchestration tools
  • Integration & APIs – Oversee integration of OpenAI, Azure OpenAI, and GCP Vertex AI models into enterprise systems, including MuleSoft and Apigee connectors
  • LLMOps & Governance – Establish LLMOps practices (CI/CD, monitoring, optimization, cost control) and enforce responsible AI guardrails (bias detection, prompt injection protection, hallucination reduction)
  • Enterprise Governance – Lead architecture reviews, governance boards, and technical design authority for all LLM initiatives
  • Collaboration – Partner with data scientists, engineers, and business teams to translate use cases into scalable, secure solutions
  • Documentation & Standards – Define and maintain best practices, playbooks, and technical documentation for enterprise adoption
  • Monitoring & Observability – Guide implementation of AgentOps dashboards for usage, adoption, ingestion health, and platform performance visibility
  • Innovation & Research – Stay ahead of advancements in OpenAI, Azure AI, and GCP Vertex AI, evaluating new features and approaches for enterprise adoption
  • Ecosystem Expertise – Remain current on Azure AI services (Cognitive Search, AI Studio, Cognitive Services) and GCP AI stack (Vertex AI, BigQuery, Matching Engine)
  • Business Alignment – Collaborate with product and business leadership to prioritize high‑value AI initiatives with measurable outcomes
  • Mentorship – Coach engineering teams on LLM solution design, performance tuning, and evaluation techniques
  • Proof of Concepts – Lead or sponsor PoCs to validate feasibility, ROI, and technical fit for new AI capabilities
  • Experience – 10+ years in AI/ML‑related roles, with a strong focus on LLMs and Agentic AI technology
  • Generative AI Solution Architecture (2-3 years) – Proven experience in designing and architecting GenAI applications, including Retrieval‑Augmented Generation (RAG), LLM orchestration (LangChain, LangGraph), and advanced prompt design strategies
  • Backend & Integration Expertise (5+ years) – Strong background in architecting Python‑based microservices, APIs, and orchestration layers that enable tool invocation, context management, and task decomposition across cloud‑native environments (Azure Functions, GCP Cloud Functions, Kubernetes)
  • Enterprise LLM Architecture (2-3 years) – Hands‑on experience in architecting end‑to‑end LLM solutions using Azure OpenAI, Azure AI Studio, Hugging Face models, and GCP Vertex AI, ensuring scalability, security, and performance
  • RAG & Data Pipeline Design (2-3 years) – Expertise in designing and optimizing RAG pipelines, including enterprise data ingestion, embedding generation, and vector search using Azure Cognitive Search, Pinecone, Weaviate, FAISS, or GCP Vertex AI Matching Engine
  • LLM Optimization & Adaptation (2-3 years) – Experience in implementing fine‑tuning and parameter‑efficient tuning approaches (LoRA, QLoRA, PEFT) and integrating memory modules (long‑term, short‑term, episodic) to enhance agent intelligence
  • Multi‑Agent Orchestration (2-3 years) – Skilled in designing multi‑agent frameworks and orchestration pipelines with LangChain, AutoGen, or DSPy, enabling goal‑driven planning, task decomposition, and tool/API invocation
  • Performance Engineering…