Senior Observability Architect (Datadog)
Job in
Greenville, Greenville County, South Carolina, 29610, USA
Listed on 2026-01-12
Listing for:
KCM Technical
Full Time position
Job specializations:
- IT/Tech
Systems Engineer, IT Support, Cloud Computing
Job Description & How to Apply Below
Senior Observability Architect (Datadog)
Greenville, SC
Architecture & Design
- Design end-to-end observability architecture using Datadog across Azure cloud, containers, Kubernetes, and on-prem workloads.
- Define monitoring standards, SLIs/SLOs, dashboards, alerting strategy, and tagging governance.
- Design and architect an end-to-end solution to integrate mainframe platforms.
- Architect log ingestion pipelines, retention policies, and cost-optimized indexing strategies.
- Build scalable APM instrumentation patterns for microservices, serverless, and distributed environments.
- Deploy Datadog agents, integrations, and custom checks across large-scale infrastructure.
- Configure APM, RUM, Logs, SIEM, Synthetics, Network Performance Monitoring, and CI/CD Observability.
- Work closely with DevOps, SRE, Cloud, and Application teams to instrument services and ensure visibility.
- Analyze and optimize Datadog costs: usage, retention settings, indexing, and billing insights.
- Establish organization-wide tagging standards, dashboards, alerting guardrails, and onboarding processes.
- Create reusable templates, Terraform modules, and automation scripts for Datadog deployment.
- Ensure compliance with security and observability best practices.
- Mentor teams on Datadog usage, training engineers on dashboards, logs, traces, and alerts.
- Lead RCA investigations using Datadog metrics, traces, logs, and correlated events.
- Collaborate with engineering teams to improve system reliability, resilience, and performance.
- Identify gaps in observability and propose improvements across the stack.
- 6 years in Observability, Monitoring, SRE, DevOps, or Cloud Engineering.
- 3+ years of hands-on experience with Datadog.
- Strong understanding of distributed systems, microservices, and cloud-native architectures.
- Expertise with Kubernetes, Docker, AWS/Azure/GCP cloud services.
- Experience with Infrastructure as Code (Terraform preferred).
- Strong knowledge of APM, Metrics, Logs, RUM, Synthetics, and Security Monitoring.
- Deep experience with Datadog dashboards, alerting, monitors, service maps, event correlation, and notebooks.
- Proficiency with Python, Bash, or similar scripting languages.
- Strong analytical, communication, and problem-solving skills.
- Datadog Certifications (Datadog Fundamentals, APM, Log Management, or Observability).
- Experience with observability tools in the retail industry.
- CI/CD observability experience (GitHub Actions, Jenkins, GitLab CI, etc.).
- Background in Performance Engineering, Reliability Engineering, or Platform Engineering.
Position Requirements
10+ years of work experience