
QA Engineer

Job in Santa Clara, Santa Clara County, California, 95053, USA
Listing for: InfraCloud Technologies
Full Time position
Listed on 2026-01-17
Job specializations:
  • IT/Tech
    SRE/Site Reliability, IT Support

Responsibilities

  • Test product‑specific use cases and validate end‑to‑end alerting workflows across monitoring systems.
  • Simulate incidents and test scenarios that trigger alerts in tools like Datadog, Prometheus, or similar monitoring platforms.
  • Verify that alerts raised in monitoring tools are correctly consumed and acted upon by downstream systems or automated workflows.
  • Understand alert rules well enough to design, execute, debug, and maintain test cases (Developers/SREs handle alert configuration, but QA must understand it).
  • Collaborate closely with engineering teams (Developers, SREs/DevOps) to improve detection, investigation, and automated incident response.
  • Analyse alert behaviour, validate incident pipelines, and ensure seamless integration across all monitoring and automation tools.
  • Identify gaps in monitoring, logging, and alert workflows and provide clear, actionable QA feedback.
  • Document test scenarios, alert behaviour, and monitoring workflows in a clear and reproducible manner.

Requirements

  • Monitoring Tools Expertise:
    Hands‑on experience with at least one major monitoring system (Datadog or Prometheus), including working with alerts, dashboards, and troubleshooting.
  • Alert Simulation and Validation:
    Ability to trigger, simulate, and validate alert events end‑to‑end.
  • Incident Workflow Understanding:
    Strong understanding of how alerts propagate through monitoring systems and how automated systems respond to them.
  • Automation Mindset:
    Ability to use or write simple scripts (Python, Shell, etc.) to simulate workloads or events that trigger alerts.
  • Communication and Problem Solving:
    Ability to collaborate effectively with Developers and SRE/DevOps teams to ensure monitoring accuracy.
  • Experience with automated incident investigation or remediation tools.
  • Familiarity with CI/CD pipelines and integrating monitoring validation into pipelines.
  • Understanding of observability fundamentals: metrics, logs, and traces.
  • Exposure to infrastructure or SRE environments.
  • Basic knowledge of Kubernetes, Docker, or cloud platforms (AWS/GCP/Azure).