Senior Site Reliability Engineer
Listed on 2026-01-14
IT/Tech
Cloud Computing, Systems Engineer, IT Support, Cybersecurity
About Us
Founded in 2014, we offer the industry’s first and only cloud‑based, fully‑customisable, end‑to‑end software solution to automate securities‑based lending from origination through the life of the loan. By combining thought leadership in suitability and risk management with industry‑leading education and the latest technology, Supernova enables advisors to deliver holistic, goals‑based advice and to help their clients achieve financial wellness. We partner with the industry’s largest banks, most prominent insurance companies and leading online brokerages to democratise access to securities‑based lending and better the entire financial ecosystem.
Why Join Supernova?
At Supernova Technology, we believe that the best results come from a team that is passionate, driven, and supported in all aspects of their professional lives. Here, you’ll work alongside talented and innovative individuals who are committed to driving the future of securities‑based lending technology. We foster a culture of collaboration, continuous learning, and growth, where each person’s contributions make a real impact.
Job Description
The Senior Site Reliability Engineer will own the reliability, scalability, and performance of our production systems. This role bridges engineering, platform, and security teams to ensure infrastructure meets strict uptime, compliance, and client experience requirements. This position will lead the design and implementation of observability tools, incident response processes, and resilience strategies, shifting the organization from reactive to proactive reliability practice.
Responsibilities
- Ensure systems meet high‑availability targets through well‑defined SLAs, SLOs, and SLIs
- Own and optimise the monitoring, logging, and alerting stack to ensure actionable alerts
- Lead incident response and post‑mortem processes, driving remediation and prevention
- Plan capacity and optimise performance to address bottlenecks before they impact customers
- Automate operational tasks to reduce manual intervention
- Collaborate with DevOps to improve CI/CD reliability and with Platform Engineering to ensure infrastructure scalability
- Implement reliability controls required for SOC 2 and other regulatory standards
Qualifications
- 5–8 years in SRE, operations, or performance engineering roles
- Bachelor’s Degree in Computer Science or related fields
- Advanced expertise with monitoring and alerting tools
- Proficiency in at least one programming or scripting language such as Python, Go, or Bash
- Strong background in AWS cloud environments
- Experience with container orchestration using AWS ECS
- Proven track record in leading high‑severity incident response calmly and effectively
- Familiarity with ITIL, post‑mortem processes, and change management controls
- Demonstrated ability to work cross‑functionally with development, platform, and security teams
- Reliability‑focused mindset with an emphasis on uptime and recovery speed
- Analytical problem‑solving skills supported by metrics and data
- Calm and effective performance in high‑pressure situations
- Technical depth to diagnose and resolve complex system issues
- Proactive leadership in anticipating and addressing reliability risks
Benefits
- Medical, Dental, and Vision Insurance: multiple plans with coverage for employees and dependents
- HSA and FSA Accounts: tax‑advantaged accounts for health and dependent care expenses
- Life and Disability Insurance: employer‑paid basic coverage with options for additional voluntary coverage
- Compensation: $130,000 – $170,000 per year
- Retirement Savings: 401(k) plan with employer contributions
- Employee Assistance Program (EAP): confidential support services, including free therapy sessions
- Paid Time Off: flexible PTO policies
- Additional Perks: commuter benefits, pet insurance, continuing education assistance, and more
Our Values
- Form, execute, and communicate new ideas that add value to our employees and customers
- Strive through obstacles and failures
- Follow through on promises or commitments to others, accept responsibility, and answer for actions and decisions
- Listen to, understand, and support our employees and customers
- Act with speed, positive attitude, and flexibility
- Exceed expectations and surpass ourselves every day; we embrace a sense of pride and never stop growing
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analysing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans.
Seniority Level
- Not Applicable
Employment Type
- Full‑time
Job Function
- Engineering and Information Technology
Industries
- Transportation, Logistics, Supply Chain and Storage