Job Description & How to Apply Below
Location: Hyderabad, Noida, Gurgaon, Chennai & Kolkata
Experience: 8-14 years
Immediate Joiners Preferred.
Mandatory Skills: Datadog and automation tools, ServiceNow, and Terraform
If you are an immediate joiner, kindly share your resume with the subject line "IT Monitoring" and include your notice period.
The ideal candidate is methodical, detail-oriented, and demonstrates strong ownership in building resilient observability solutions that deliver reliability insights, enable proactive incident management, and ensure operational excellence.
Responsibilities
1. System and Network Monitoring
• Continuously monitor servers, networks, applications, and databases to ensure availability and performance.
• Utilize monitoring tools to track system health, resource usage, and network traffic patterns.
• Implement and configure Datadog agents across different environments (servers, containers, cloud services).
• Set up alerting mechanisms to notify the relevant teams of performance issues or outages.
• Monitor and proactively manage operational alerts, incidents, and infrastructure issues, promptly addressing problems to maintain optimal uptime and service reliability.
2. Incident Detection and Response
• Identify and analyze incidents reported by monitoring tools or users.
• Respond to alerts promptly and coordinate with IT support to escalate and resolve issues.
• Document incidents, including root causes and remedies, to help inform future prevention strategies.
• Actively manage and coordinate incident responses using text messages and ServiceNow.
3. Performance Analysis and Reporting
• Generate reports on system performance, availability, and incidents to provide insights to management.
• Analyze trends to predict potential issues and advise on capacity planning and resource allocation.
• Monitor adherence to service level agreements (SLAs) and report on compliance.
• Establish Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to enhance application performance.
4. Tool Management and Optimization
• Maintain and optimize monitoring tools and systems to ensure effective performance.
• Implement new monitoring solutions as needed to enhance coverage and functionality.
• Stay updated on the latest monitoring technologies to improve monitoring capabilities.
• Automation and Optimization
• Maintain and ensure consistent availability and performance of Datadog observability solutions, including Application Performance Monitoring (APM), Infrastructure Monitoring, Log Management, Synthetic Monitoring, and Real User Monitoring (RUM).
• Utilize Terraform to manage infrastructure efficiently and securely, ensuring stability and compliance.
• Reduce manual effort by automating tasks with PowerShell and Python.
• Extend observability coverage to Kubernetes and Docker, ensuring comprehensive monitoring and performance optimization of containerized environments.
• Create, update, and manage Datadog monitors and dashboards based on incoming requests, ensuring they accurately reflect operational requirements and KPIs.
• Customize and fine-tune monitoring solutions to meet specific team or application needs, enhancing visibility and system health awareness.
5. Collaboration and Communication
• Work collaboratively with IT teams to understand monitoring requirements and expectations.
• Communicate findings and performance metrics effectively to stakeholders, ensuring clarity and understanding.
• Provide knowledge sharing sessions to educate teams on monitoring practices and tool usage.
Qualifications we seek in you!
Minimum Qualifications
• Bachelor's/Master's in CS/IT or equivalent practical experience.
Preferred Qualifications / Skills
• Stay updated on the latest monitoring technologies to improve monitoring capabilities.
• Automation and Optimization
• Maintain and ensure consistent availability and performance of Datadog observability solutions, including Application Performance Monitoring (APM), Infrastructure Monitoring, Log Management, Synthetic Monitoring, and Real User Monitoring (RUM).
Why join Genpact?
• Lead AI-first transformation – Build and scale AI solutions that redefine industries
• Make an impact –…