Software Engineer - Principal Member of Technical Staff (PMTS) - Availability, Bellevue area, Washington, USA, IT/Tech

Position: Software Engineer - Principal Member of Technical Staff (PMTS) - Availability

To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

Job Category

Software Engineering

Job Details

About Salesforce

Salesforce is the #1 AI CRM, where humans with agents drive customer success together. Here, ambition meets action. Tech meets trust. And innovation isn't a buzzword - it's a way of life. The world of work as we know it is changing and we're looking for Trailblazers who are passionate about bettering business and the world through AI, driving innovation, and keeping Salesforce's core values at the heart of it all.

Ready to level-up your career at the company leading workforce transformation in the agentic era? You're in the right place! Agentforce is the future of AI, and you are the future of Salesforce.

Role Description

The Availability Standards team is part of the overall Salesforce technology organization. We manage the high-level frameworks used to measure platform uptime and performance, bridging the gap between centralized reporting and the individual engineering teams that own specific services. We follow a consultative engineering approach where our experts partner with service owners to build a deep understanding of service health, telemetry, and automated testing.

This level of expertise allows our team to advocate for the customer and influence the product roadmap by ensuring that every service team has the visibility they need to maintain world-class availability.

The Engineering Availability Standards position is a critical role designed for a seasoned engineering veteran who has experience managing, leading, or coordinating with high-scale cloud services. Your mission is to transform how we calculate, visualize, and act upon platform health data. You will serve as the technical bridge between our global availability standards and the distributed engineering teams that power our infrastructure.

You will be responsible for shifting our monitoring strategy from simple reporting to active, high-fidelity signals that engineering teams use for real-time alerting and incident response. This role requires the ability to influence technical roadmaps across different product families and to automate the integration of reliability testing and observability into standard software development life cycles.

Job Responsibilities

Utilize software engineering skills and production experience to provide input into long-range platform requirements and operational guidelines, with a focus on making health data actionable for service owners.
Analyze and understand how service teams manage their telemetry, and help drive continuous improvement of health signals based on knowledge of specific service architectures.
Partner with internal engineering teams to integrate global availability standards into their existing monitoring pipelines, dashboards, and automated alerting flows.
Identify and mitigate friction in the onboarding process by leveraging existing automated test suites to create high-quality, streamlined reliability signals with minimal manual effort.
Serve as a technical subject matter expert to ensure that centralized infrastructure services (logging, monitoring, and data platforms) are optimized to support the needs of individual service owners.
Lead the integration of failure signals into standard engineering workflows, ensuring that detected issues result in automated work items and proactive investigations.
Deliver presentations highlighting availability metrics, reliability trends, and success stories to diverse engineering and leadership audiences.

Required Skills

A related technical degree is required.
5+ years of proven experience in production environments (this could include previous experience as a software engineer, systems engineer, service owner, or lead developer).
Fluency in Java or a similar object-oriented language (Python, C++, etc.) to provide input on platform requirements and automation.
Deep understanding of telemetry systems and experience building or managing production monitoring and alerting frameworks.
Experience using Linux environments and the…

