Site Reliability Engineer - Remote (California, Missouri, USA) - IT/Tech

Location: California

At PayNearMe, we’re on a mission to make paying and getting paid as simple as possible. We build innovative technology that transforms the way businesses and their customers experience payments. Our industry-leading platform, PayXM™, is the first of its kind, designed to manage the entire payment experience from start to finish. Every click, swipe, or tap is seamless, fast, and secure, helping non-commerce businesses boost customer satisfaction, accelerate payments, and reduce costs.

Our single platform handles it all: cards, ACH, digital wallets such as PayPal, Venmo, Cash App Pay, Apple Pay, and Google Pay, and even cash at more than 62,000 retail locations nationwide. Today, thousands of businesses across consumer lending, iGaming and online sports betting, property management, and toll…

In September 2025, we raised a $50 million Series E positi…

We’re a team of 200+ employees across 41 states, headquartered in Silicon Valley with satellite offices in Dallas, TX and Holmd…

Join us and be part of a team that’s shaping the future of payments—one experience at a time.

Job Description

As our Site Reliability Engineer, you will design, build, and maintain the systems and infrastructure that power our applications, ensuring their reliability, scalability, and performance. You will bring a software engineering approach to operations, automating processes and continuously improving the infrastructure and tools that support our business needs.

Responsibilities

Infrastructure Management: Design, implement, and maintain scalable and resilient infrastructure using Terraform for infrastructure as code, ensuring high availability and performance
Kubernetes and Containers: Deploy, manage, and optimize Kubernetes clusters and containerized applications using Docker. Implement best practices for container orchestration and management
Systems and Application Monitoring/Observability: Develop and maintain comprehensive monitoring and observability solutions using Datadog. Ensure detailed visibility into system performance and application health
SLO and SLA Management: Define, monitor, and maintain Service Level Objectives (SLOs) and Service Level Agreements (SLAs) to ensure reliable and consistent service delivery
Incident Response and Troubleshooting: Respond to incidents, perform root cause analysis, and implement solutions to prevent recurrence. Participate in post-incident reviews and contribute to blameless postmortems
Reliability and Production Environment Management: Ensure the reliability and stability of our production environments. Continuously assess and improve system reliability, identifying and addressing potential points of failure
Automation and Scripting: Develop automation scripts and tools to reduce manual intervention and improve system reliability using Python, Bash, or Go. Implement and improve CI/CD pipelines
CI/CD Pipeline Management: Enhance and maintain continuous integration and continuous deployment pipelines using GitLab CI. Ensure seamless and reliable deployment processes
Capacity Planning and Scaling: Assist in capacity planning and ensure that systems are scalable to meet future demands. Implement auto-scaling strategies where applicable
Security and Compliance: Implement security best practices and ensure compliance with industry standards. Regularly review and update security policies and procedures
Collaboration and Support: Work closely with development teams to ensure reliability and scalability of new features and services. Provide technical support and guidance on infrastructure-related issues
Software Engineering for Operations: Develop and maintain internal tools and services that enhance the efficiency and reliability of our operations
On-Call Rotation: Participate in an on-call rotation to address production issues and collaborate in incident response efforts

Qualifications

3+ years of experience in SRE, DevOps, or a similar role
Cloud Platform Experience: Proficient with cloud platforms such as AWS, GCP…
Kubernetes and Containers: Strong experience with Kubernetes and Docker, including deployment, scaling, and management of containerized applications
Infrast…

