Principal Engineer Job, San Diego area, California, USA, IT/Tech

Overview

Insight Global is seeking a Principal Software Engineer with AI experience for a fully remote, direct-hire opportunity in the US. You will join a team improving AI governance and compliance platforms that help organizations manage and monitor AI systems securely and transparently. You will drive the end-to-end technical strategy, architecture, and productionization of the client's machine learning systems, large language model (LLM) capabilities, and AI infrastructure.

Own how models, evaluation pipelines, data workflows, and observability components are designed, deployed, monitored, and continuously improved to meet reliability, quality, safety, and cost goals. Provide deep AI/ML expertise and leadership across engineering teams, guiding model integration, AI/ML platform decisions, and scalable distributed systems that support enterprise-grade GenAI workloads.

Responsibilities

Define and own the architecture for scalable AI/ML systems, including training, fine-tuning, inference, evaluation, and monitoring pipelines.
Translate ambiguous business and product requirements into robust AI/ML system designs and staged delivery plans.
Make strategic decisions on model selection, LLM integrations, evaluation frameworks, model gateways, guardrails, and safety mechanisms.
Lead design reviews, architecture forums, and technical decision-making across teams.
Build and deploy production-grade AI/ML/LLM models, transformers, and generative AI features—from initial concept through production rollout.
Establish standards for model readiness, evaluation gates, rollout/rollback, drift detection, observability, and ongoing performance management.
Partner with engineering teams to integrate models into distributed systems with clear SLOs, telemetry, and error-budget mechanisms.
Design and improve data pipelines, feature stores, and data quality/lineage workflows supporting model training and inference.
Develop scalable MLOps/AIOps practices for automation of training, testing, deployment, and monitoring.
Evaluate and implement AI/ML workflow orchestration platforms (e.g., MLflow, Kubeflow, Vertex AI) and CI/CD for AI/ML.
Own evaluation pipelines—latency, accuracy, cost, hallucination metrics, prompt versioning, and model performance insights.
Instrument tracing and model observability using best-practice frameworks and telemetry standards.
Implement guardrails and safety systems to ensure consistent, controlled behaviour of LLM-powered features.
Partner closely with product, engineering, and leadership to shape platform strategy and AI feature roadmap.
Provide trade-off analyses that incorporate model performance, security, compliance, scalability, and long-term maintainability.
Write clear technical documents, proposals, and mechanism-based recommendations to guide executive decision-making.
Mentor senior/junior engineers in AI/ML best practices, distributed systems, experimentation, and model governance.
Support hiring, leveling, performance feedback, and the growth of a high-calibre engineering team.

Equal Opportunity

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances.

If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy:

Qualifications

10+ years of software engineering experience
5+ years of hands-on AI/ML development experience
Full-stack development with Java and C++
Bachelor’s Degree in Computer Science or related field
Proven experience deploying AI/ML or LLM systems to production at scale (not prototypes)
Extensive experience with Python programming
Experience with cloud platforms (AWS/GCP/Azure) and Kubernetes
Experience with AI/ML workflow platforms: Kubeflow, Vertex AI, SageMaker, or similar
Expertise with LLM productionization, including fine-tuning, retrieval-augmented generation (RAG), safety/guardrails, and evaluation
Master's or PhD in Computer Science, Machine Learning, or a related field
Cloud platform experience with deploying AI/ML workloads at scale
Contributions to AIOps/MLOps platforms
Previous experience with AI observability and troubleshooting
