
Senior Observability Engineer

Job in Nashville, Davidson County, Tennessee, 37247, USA
Listing for: Universal Music Group
Full Time position
Listed on 2026-01-12
Job specializations:
  • IT/Tech
    Systems Engineer, IT Support, Cybersecurity, Cloud Computing
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below


We are UMG, the Universal Music Group, the world’s leading music company. We are committed to artistry, innovation and entrepreneurship. We own and operate a broad array of businesses engaged in recorded music, music publishing, merchandising, and audiovisual content in more than 60 countries. We identify and develop recording artists and songwriters, and we produce, distribute and promote the most critically acclaimed and commercially successful music to delight and entertain fans around the world.

How You'll LEAD

As a Senior Observability Engineer within UMG’s IT Technology Services team, you will drive the reliability, performance, and stability of our global technology ecosystem. You’ll own the design and evolution of our observability platform, ensuring visibility across systems, applications, and services. This role is both hands‑on and strategic — ideal for an engineer passionate about building scalable, automated, and data‑driven monitoring solutions that empower teams to deliver high-performing, resilient systems.

You’ll partner with DevOps, Infrastructure, and Application teams to lead observability best practices and shape a culture of proactive system insight across UMG.

How You'll CREATE

Observability Architecture & Implementation
  • Design, implement, and maintain end-to-end observability solutions across infrastructure, applications, and services.
  • Select, configure, and integrate industry‑leading monitoring and telemetry tools (e.g., Prometheus, Grafana, ELK, Dynatrace, Datadog).
  • Develop automation and integrations to streamline metrics, logging, and tracing pipelines.
Monitoring & Incident Response
  • Establish effective alerting frameworks and SLO/SLA-driven dashboards for real‑time visibility.
  • Partner with incident‑response and SRE teams to diagnose, remediate, and prevent production issues.
  • Conduct root‑cause analysis and proactively identify performance bottlenecks and capacity needs.
Collaboration & Leadership
  • Partner with development, security, and operations teams to embed observability into system design.
  • Lead cross‑functional initiatives to standardize monitoring practices and enhance operational maturity.
  • Mentor peers and provide training on observability tools and best practices.
Continuous Improvement
  • Evaluate emerging technologies to evolve UMG’s observability strategy.
  • Drive automation and process improvements that strengthen system performance, resiliency, and insight quality.
  • Integrate observability with security monitoring and compliance workflows.
Data Analysis & Reporting
  • Analyze metrics, logs, and traces to surface insights into system behavior and performance trends.
  • Deliver reports and visualizations tailored for both technical and business stakeholders.
Required Skills & Experience
  • 7+ years of professional experience in information technology, including 3+ years specializing in observability, monitoring, or SRE engineering.
  • Deep knowledge of monitoring toolsets such as Prometheus, Grafana, ELK, Splunk, Dynatrace, Datadog, or equivalent.
  • Proficiency in Python, Go, or Java for automation and tool development.
  • Hands‑on experience with Kubernetes, Docker, and cloud platforms (AWS, GCP, or Azure).
  • Strong understanding of networking, infrastructure, and performance optimization.
  • Familiarity with configuration management tools (Ansible, Chef, Puppet) and CI/CD integration.
  • Proven track record designing and delivering dashboards, alerts, and performance reports for multiple audiences.
  • Excellent communication skills, with the ability to translate technical insights into actionable recommendations.
Preferred Certifications (Highly Desirable)
  • Prometheus Certified Admin
  • Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
  • Grafana Certified Observer
  • Dynatrace Associate
  • Splunk Core Certified Power User/Admin
  • Elastic Certified Engineer
  • DevOps Engineer Certification (AWS and/or Google)
Perks Playlist
  • Be part of an entrepreneurial, global organization that values authenticity, drive, creativity, relationships, and a competitive spirit
  • Comprehensive medical, dental, vision, and FSA options, as well as:
    • 100% coverage for out‑patient mental…
Position Requirements
10+ Years work experience