Site Infrastructure Management and Operations
Experience:
5 to 10 Years
Employment Type:
Full-Time
Location:
Chennai (On-site)
Industry: Healthcare
Role Overview:
The Site Reliability Engineer / Systems Administrator will be responsible for managing, automating, and optimizing enterprise infrastructure across on-premises and cloud environments. The role blends traditional systems and network administration with Site Reliability Engineering and DevOps practices, supporting global customer engagements delivered from India-based IT services and GCC environments.
Key Responsibilities
Endpoint, Network & Infrastructure Management
- Manage and support Windows desktops, laptops, servers, and mobile/tablet devices
- Administer endpoint security, patching, configuration baselines, and device compliance
- Manage network devices including routers, switches, firewalls, and VPNs
- Operate and maintain wireless infrastructure (WAPs, controllers, authentication)
- Monitor and troubleshoot LAN, WAN, and internet connectivity issues
Identity & Enterprise Authentication Frameworks
- Administer and support hybrid identity environments, including On-Prem Active Directory, Azure AD (Entra ID), and Azure AD Connect, ensuring reliable user synchronization and access.
- Manage and troubleshoot authentication and access, including SAML/SSO integrations, identity attributes (UPN, SMTP, Immutable ID), and sign-in issues across enterprise applications.
- Execute and maintain PowerShell automation for bulk identity, Exchange Online, and Microsoft 365 tenant operations with proper validation, logging, and rollback awareness.
- Administer Microsoft 365 and Exchange Online, including tenant configuration, mail flow, shared mailboxes, aliases, and identity-related email settings.
- Perform and validate DNS configurations (SPF, DKIM, DMARC and related records) and resolve email, authentication, and access issues during migrations and steady-state operations.
Cloud & Platform Operations
- Provision, manage, and optimize cloud infrastructure resources (compute, storage, networking)
- Ensure secure access, identity controls, and least-privilege configurations
- Monitor cloud costs, capacity, and performance trends
Reliability Engineering & Availability
- Define, implement, and continuously improve SLIs, SLOs, and SLAs
- Ensure high availability, fault tolerance, and resilience across systems
- Design and test business continuity and disaster recovery strategies
- Conduct regular failover, backup restore, and resilience testing
Backup, Recovery & Data Protection
- Define and manage backup and recovery policies for endpoints, servers, and cloud workloads
- Ensure backups meet RPO/RTO objectives and compliance requirements
- Periodically validate backup integrity and recovery readiness
Observability & Monitoring
- Implement and enhance logging, monitoring, and alerting across infrastructure and applications
- Reduce alert fatigue through intelligent thresholds and actionable alerts
- Use observability data to drive proactive reliability improvements
Incident & Problem Management
- Define and manage incidents following ITIL/SRE best practices
- Act as incident responder and coordinator during outages and degradations
- Perform root cause analysis (RCA) and document post-incident reviews
- Drive permanent fixes and reliability improvements from incident learnings
Operational Excellence & Documentation
- Create, maintain, and improve operational documentation, runbooks, and SOPs
- Standardize operational procedures to reduce risk and human error
- Support audits, compliance reviews, and internal controls through accurate documentation
- Collaborate with security, engineering, and business teams on operational readiness
Required Skills & Qualifications
Technical Skills
- Strong hands-on experience with Windows OS administration (desktop & server)
- Strong hands-on expertise in hybrid identity platforms, including On-Prem Active Directory, Azure AD (Entra ID), and Azure AD Connect, with a clear understanding of synchronization and identity consistency.
- Proven ability to automate and operate at scale using PowerShell, including bulk identity updates, Microsoft 365 administration, and safe execution of high-impact changes.
- Deep operational knowledge of Microsoft 365 and Exchange Online, including tenant configuration, mail flow, mailboxes, aliases, and identity-related email dependencies.
- Solid understanding of authentication and access protocols, including SAML/SSO, identity attributes (UPN, SMTP, Immutable ID), and troubleshooting sign-in failures.
- Strong fundamentals in DNS and email authentication, with hands-on experience managing and validating SPF, DKIM, and DMARC to ensure reliable mail delivery and secure access.
- Solid understanding of networking fundamentals (TCP/IP, DNS, DHCP, VLANs, VPNs, firewalls)
- Experience with endpoint management and security tools
- Practical experience with cloud platforms (IaaS, PaaS fundamentals)
- Knowledge of backup, DR, and business…