Job Description:
Site Reliability Engineer
For this position, we're looking for talented & experienced engineers who have a passion for infrastructure & automation.
As a Site Reliability Engineer (SRE), you will work within the development team to combine software and systems engineering and run large-scale distributed systems. You will also maintain the client's systems' capacity and performance.
Responsibilities
Taking part in architecture-level discussions, design, planning, and implementation.
Researching alternatives to ensure that what we are building is the best path forward.
Documenting each project to facilitate integration for users.
Driving proofs of concept and minimum viable products for demonstration.
Designing and delivering Infrastructure as Code.
Developing and implementing automation for routine tasks, including alerting, system monitoring, and response mechanisms.
Developing and maintaining dashboards for monitoring and observability.
Supporting multiple services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
Managing incidents and participating in the on-call rotation.
Education And Experience
To succeed in this role, candidates must have strong foundational knowledge of, and demonstrated proficiency in, Linux/Unix (Talos).
At least five years of SRE or similar experience as a DevOps or Software Engineer.
At least two years of programming experience in a conventional programming language.
Kubernetes knowledge is required.
Experience with bare metal / non-managed Kubernetes would be a plus.
Experience in Python and other scripting languages.
Experience with infrastructure-as-code and configuration management tools (e.g., Terraform, Ansible, Helm, Puppet, or Chef).
Networking and cloud computing platform experience.
Proficiency in scripting and programming languages (e.g., Bash, Python, Go, Node, Java, or similar).
Familiarity with monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK Stack, or similar).
Experience with Grafana Mimir.
Familiarity with CI/CD tools and SDLC practices.
You have strong problem-solving skills and excellent communication skills.
You can work independently as well as collaboratively in a remote team environment.
You are friendly, collaborative, humble, honest, and always s