AI Data Scientist – RAG, SLM & Distributed Data; Spanner
Listed on 2026-03-02
Software Development
AI Engineer, Software Engineer
Job Description
We are looking for a mid‑level AI Engineer with hands‑on experience in Retrieval‑Augmented Generation (RAG) systems, Small Language Models (SLMs), and distributed databases such as Google Cloud Spanner.
You will work closely with senior engineers and product teams to build scalable AI systems that integrate retrieval pipelines, language models, and distributed transactional infrastructure. This role is ideal for someone who has already built AI features in production and wants to deepen their expertise in applied GenAI systems.
Contract:
Through the end of the year
- Production RAG features.
- Distributed knowledge storage backed by Spanner.
- AI‑powered APIs and services.
- Retrieval optimization and evaluation.
- Model cost/latency optimization.
- AI: RAG pipelines, embeddings, prompt engineering
- Models: SLM/LLM integration
- Database: Spanner schema design, SQL optimization
- Backend: Python, APIs
- Cloud: GCP
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances.
If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy:
- 3–5 years of software engineering experience.
- 1–2 years working with LLM or RAG‑based systems.
- Strong proficiency in Python.
- Experience with:
  - Embedding models and vector search
  - LangChain, LlamaIndex, or similar frameworks
  - API development (FastAPI/Flask)
- Experience working with Google Cloud Spanner or similar distributed SQL databases.
- Solid understanding of distributed systems fundamentals.
- Comfortable working in cloud environments (GCP preferred).
- Experience fine‑tuning or quantizing small language models.
- Familiarity with evaluation metrics for retrieval systems (Recall@K, etc.).
- Knowledge of:
  - Vertex AI
  - Pub/Sub
  - Dataflow
- Experience optimizing AI inference for cost and latency.
- Exposure to CI/CD pipelines.