LLM Engineer, New York, New York, USA (Software Development)

Location: New York

Saroe, Inc. is a technology consulting and staff-augmentation firm helping small, mid-size, and enterprise organizations build scalable, modern software solutions. We partner closely with our clients to deliver high-impact work across cloud, AI, automation, and application development.

We are currently supporting one of our clients in hiring a skilled LLM Engineer to work on next-generation AI-powered applications.

About the Role

We are seeking a highly skilled LLM Engineer to design, develop, and optimize applications powered by Large Language Models (LLMs). In this role, you’ll work on cutting-edge AI systems involving tool-using agents, vector search, and modern orchestration frameworks.

You will collaborate with engineers, architects, and product stakeholders to translate real-world business needs into scalable, secure AI solutions.

Key Responsibilities

Design and develop LLM-powered applications using APIs such as OpenAI, Claude, Gemini, and similar platforms
Build AI workflows using frameworks like LangGraph, DSPy, and tool-use architectures
Implement MCP server integrations to extend LLM capabilities
Develop and integrate vector search solutions using Qdrant, Milvus, or pgvector
Optimize prompts, orchestration logic, and tool chaining for accuracy and performance
Collaborate with cross-functional teams to deliver AI-enabled solutions
Ensure best practices around security, scalability, and maintainability
Use Azure DevOps for source control, CI/CD, and deployments
Participate in Agile ceremonies including sprint planning and reviews

Required Skills & Qualifications

Bachelor’s or Master’s degree in Computer Science, AI/ML, Engineering, or a related field
4+ years of software development experience, including 2+ years working with LLM-based systems
Strong hands-on experience with LLM APIs (OpenAI, Claude, Gemini, etc.)
Experience with LangGraph, DSPy, and tool-use patterns
Familiarity with MCP server integration
Solid experience with vector databases (Qdrant, Milvus, pgvector)
Experience working in Agile teams and using Azure DevOps

Nice to Have

Experience building RAG (Retrieval-Augmented Generation) pipelines
Exposure to MLOps practices and AI model deployment
Experience in multi-cloud environments (Azure, AWS, GCP)
Familiarity with Docker and Kubernetes

Why Join Through Saroe?

Work on real-world, production-grade AI systems
Opportunity to collaborate with strong engineering teams and modern tech stacks
Hybrid work model with flexibility
Saroe acts as a long-term partner, not just a placement firm

Location: New York, Princeton, or Chicago
