Production Support and Site Reliability Engineer (SRE)
Responsibilities
- Manage day-to-day production support activities for both web and mobile applications (Android and iOS).
- Maintain overall health, stability, and safety of production systems and applications.
- Identify operational risks and recommend mitigation strategies.
- Improve application instrumentation, logging, alerting, and monitoring capabilities.
- Perform change management activities across test and production environments.
- Execute code deployments while adhering to source code management, release management, and compliance policies.
- Ensure proper governance and documentation of all deployment activities.
- Work closely with development teams and business partners to recommend solutions that combine internal development, integration with other applications, and vendor platforms.
- Thrive in an agile environment by contributing to sprint activities and collaborative planning.
- Communicate effectively with team members, management, infrastructure teams, and other interface groups throughout the project lifecycle.
- Develop a strong understanding of business processes and enterprise systems.
- Provide coaching, expertise, and continuous feedback to help build the team’s capability.
- Share technical knowledge to support onboarding and skill growth.
- Participate in occasional weekend and after‑hours support for critical issues or deployments.
Technical Troubleshooting & Monitoring
- Hands‑on experience troubleshooting application and database issues using:
- Elastic / Kibana
- MongoDB services running on Linux
- IIS web servers on Windows
- Kafka (basic to intermediate knowledge)
- Strong proficiency with the following database and application technologies:
- Solid ability to write, read, and troubleshoot SQL queries.
- Knowledge of SQL database architecture, performance monitoring, and optimization.
- Good understanding of the following operating systems:
- Windows
- Linux
Automation & Infrastructure
- Experience automating routine database or infrastructure operations.
- Proficiency working with cloud‑hosted applications and services.
DevOps & SRE Practices
- Experience with DevOps and Site Reliability Engineering tools such as Helios, UrbanCode Deploy (UCD), Jenkins, and Ansible.
- Knowledge of CI/CD pipelines, release workflows, and automation strategies.
- Experience with monitoring tools such as Catchpoint and Aternity.
Productivity & Support Tools
- Jira and Confluence for project/task management.
- Firebase, Google Play Console, and Google Analytics for Android apps.
- Apple App Store experience for iOS application operations.
Soft Skills & Frameworks
- Strong analytical, problem‑solving, and decision‑making skills.
- Solid understanding of ITIL service management practices.
- Experience using ServiceNow for incident, problem, and change management.