Sr. AI Platform Engineer
Job in New York, New York County, New York, 10261, USA
Listed on 2026-03-01
Listing for: TechWize
Full Time position
Job specializations:
- Software Development: AI Engineer, Cloud Engineer - Software, Machine Learning / ML Engineer
Job Description
Location
501, Fifth Avenue, Suite 805 New York, NY 10017
Minimum Qualifications
- 8+ years of experience as a Platform Engineer (Site Reliability / DevOps), with at least 3 years in AI/ML platform development (MLOps).
- Deep expertise in Python, with strong design and debugging skills.
- Ability to work independently and lead complex projects with excellent problem‑solving, analytical, and communication skills.
- Proficiency with cloud platforms such as AWS, GCP, or Azure; familiarity with MLOps/AI DevOps tools like MLflow or Kubeflow; and proficiency in CI/CD and infrastructure as code (Terraform / CloudFormation).
- Hands‑on expertise with CI/CD pipelines, model observability, and incident response for AI/ML services.
- Experience implementing and optimizing platforms supporting large language model (LLM) pipelines with frameworks such as LangChain, LlamaIndex, Hugging Face Transformers, or similar.
- Hands‑on knowledge of setting up and scaling vector database platforms such as Qdrant (or other vector DBs like Pinecone or Weaviate) for semantic search and embeddings management.
- Exposure to MLOps tools such as Ray.io, Anyscale, or other distributed orchestration & inference frameworks.
- Experience with developing and deploying containerized applications using Docker and Kubernetes, including Helm charts and automated scaling.
- Understanding of LLMOps patterns — model registry, prompt versioning, and feedback loops.
Responsibilities
- Platform Design and Architecture: build and operate a highly available, scalable, modular AI platform using technologies such as Qdrant, Anyscale, and Ray to support LLM orchestration, vector search, and multi‑agent frameworks.
- Core Infrastructure Development: build essential APIs and infrastructure to power conversational applications, AI agents, and analytics tools.
- LLM Operational Solutions: implement workflows for large language models, including inference pipelines, fine‑tuning, caching, and evaluation for open‑weight and hosted models.
- Deployment & Performance Optimization: deploy AI services on AWS with Kubernetes (EKS), Lambda, and ECS, ensuring scalability and resilience while optimizing vector databases and model runtimes for cost and performance.
- Collaboration, Governance, & Mentorship: partner with engineering and research teams to deliver production‑grade, self‑healing, performance‑optimized services for AI/RAG pipelines; establish governance and security standards; and mentor junior engineers in AI infrastructure best practices and reviews.