
Reliability and Observability Lead

Job in Charlotte, Mecklenburg County, North Carolina, 28245, USA
Listing for: Vanguard
Full Time position
Listed on 2026-01-12
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer, AI Engineer
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below

Reliability and Observability Lead – Vanguard

At Vanguard, we are committed to delivering an exceptional client experience for all investors. The systems powering this experience operate within a complex and rapidly evolving resiliency landscape. As an Application Engineer within the ChAI (Chat & AI) team, you will contribute directly to building, enhancing, and supporting the conversational AI platform within the Chief Data & Analytics Office. This platform powers voice AI agents, chatbots, and Natural Language Processing capabilities that optimize client interactions across our Personal Investor and Workplace Solutions businesses.

In this role, you will design, build, and support application‑level capabilities that improve reliability, performance, and observability for AI and Generative AI workloads. You will also play a key role in migrating the Kore.ai platform from on‑premises hosting to the Kore.ai SaaS offering, ensuring a secure, seamless, scalable, and well‑observed transition. The role blends hands‑on engineering, automated testing, resiliency design, and close collaboration with platform partners and SaaS vendors.

Responsibilities
  • Develop, enhance, and maintain application components that improve system reliability, observability, and performance.
  • Implement application‑level instrumentation and telemetry to close observability gaps and strengthen monitoring coverage.
  • Collaborate with platform, AI/ML, and infrastructure teams to evaluate system health, performance, and failure patterns.
  • Build automation and tooling that improves deployment repeatability, enhances resiliency, and reduces operational toil.
  • Develop and maintain automated testing suites and regression test beds to validate functionality, resiliency, and performance.
  • Participate in incident management, troubleshooting, and root‑cause analyses, and contribute to recovery and prevention strategies.
  • Contribute to architectural discussions and design reviews, influencing decisions related to scalability, fault tolerance, and non‑functional requirements.
Kore.ai SaaS Migration Responsibilities
  • Support and contribute to the migration of the Kore.ai conversational AI platform from on‑prem to the Kore.ai SaaS offering.
  • Partner with vendor and Vanguard engineering teams to analyze platform gaps, data flows, integrations, security requirements, and service dependencies related to the migration.
  • Assist in the design and execution of migration test plans, including functional, resiliency, and performance validation in the SaaS environment.
  • Develop and enhance automation, telemetry, and regression tests specific to the Kore.ai SaaS platform.
  • Support cutover planning, environment readiness, UAT coordination, and post‑migration stability monitoring.
  • Contribute to documentation, runbooks, and operational readiness deliverables for the SaaS environment.
Qualifications
  • Minimum of eight years of related experience, with at least two years of development experience.
  • Undergraduate degree or equivalent combination of training and experience. Graduate degree preferred.
  • Strong proficiency in Java or Node.js; experience with APIs, multithreaded applications, and GraphQL.
  • Experience building automated testing frameworks (unit, integration, resiliency, performance) and maintaining regression test beds.
  • Experience with observability frameworks/tools such as OpenTelemetry, CloudWatch, Grafana, and Splunk.
  • Familiarity with SaaS platforms, including designing integrations and implementing observability for SaaS‑based products.
  • Experience with containerized and microservices architectures (e.g., Docker) and distributed systems.
  • Working knowledge of AWS networking, application services, IAM concepts, and cloud‑native patterns.
  • Comfort with *nix environments, scripting, and command‑line tooling.
  • Strong ability to diagnose system issues in high‑throughput, mission‑critical applications.
  • Excellent communication and documentation skills.
  • Experience with the Kore.ai platform or similar conversational AI platforms; migration or SaaS enablement experience preferred.
Sponsorship

Vanguard is not offering visa sponsorship for this position.

About Vanguard

At Vanguard, we don't just…
