Site Reliability Engineer
Job in Atlanta, Fulton County, Georgia, 30301, USA
Listed on 2026-03-03
Listing for: ACL Digital
Full Time position
Job specializations:
IT/Tech
Cloud Computing, SRE/Site Reliability
Job Description & How to Apply Below
Site Reliability Engineer
Atlanta, GA
Duration: 12 months
Site Reliability Engineer (SRE) with AWS Cloud and Application Monitoring Experience
We are seeking a skilled Site Reliability Engineer (SRE) with expertise in AWS cloud infrastructure and robust application monitoring capabilities.
As an integral part of our team, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based systems and applications.
Responsibilities:
* Implement and improve monitoring, alerting, and logging solutions to detect and respond to incidents.
* Collaborate closely with the development team to deploy applications and services and ensure they meet reliability and performance standards.
* Automate deployment, configuration management, and troubleshooting processes to streamline operations.
* Participate in the on-call rotation, triage production incidents, lead root cause analyses (RCAs), and implement preventive actions.
* Conduct capacity planning and performance analysis to handle growing user traffic and data volume effectively.
* Establish and enforce best practices for security, monitoring, and disaster recovery.
* Continuously evaluate and implement new technologies to optimize infrastructure efficiency and reliability.
Requirements:
* Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent work experience.
* Proven experience as a Site Reliability Engineer or in a similar role, with a strong focus on AWS cloud infrastructure.
* Deep understanding of AWS services (Lambda, S3, SQS, IAM, Route 53, etc.) and proficiency in infrastructure as code (e.g., Terraform, CloudFormation).
* Hands-on experience with monitoring tools such as CloudWatch, Sumo Logic, Dynatrace, Grafana, or similar for application performance monitoring and alerting.
* Proficiency in scripting and automation (e.g., Python, Bash) to build and maintain deployment pipelines and infrastructure.
* Strong analytical and troubleshooting skills to diagnose and resolve complex infrastructure, application, and data issues.
* Experience with containerization (Docker, Kubernetes) and serverless architecture (AWS Lambda).
* Familiarity with CI/CD pipelines and version control systems (Git) for continuous integration and deployment.
* Excellent communication skills and ability to collaborate effectively with cross-functional teams.
AWS certification is a plus.