Site Reliability Engineer
Listed on 2026-02-28
IT/Tech
Systems Engineer, Cloud Computing, Cybersecurity, IT Support
Job# 3020213
Job Description:
Irving, Texas
As part of a technology modernization and cloud migration effort, digital applications are being migrated to the Azure cloud environment. These applications need to be performance tested and tuned so that they are sufficiently resilient.
Responsibilities:
- Performance Testing:
  - Design and execute performance tests to evaluate the system's responsiveness, stability, scalability, and resource usage.
  - Identify performance bottlenecks and provide recommendations for improvements.
  - Analyze test results and generate detailed performance reports.
- Resiliency Testing:
  - Conduct resiliency tests to ensure the system can handle failures and recover gracefully.
  - Implement and test failure scenarios to validate the system's fault tolerance.
  - Recommend and validate resiliency patterns such as circuit breakers, bulkheads, and retries.
- Performance Monitoring:
  - Set up and maintain performance monitoring tools to continuously track system performance.
  - Analyze performance metrics and logs to detect and diagnose performance issues in real time.
- Capacity Planning:
  - Perform capacity planning to ensure the system can handle expected and peak loads.
  - Provide recommendations for scaling resources based on performance data and future growth projections.
- Performance Optimization:
  - Collaborate with development and operations teams to optimize code, database queries, and infrastructure configurations.
  - Recommend best practices for performance tuning and optimization.
- Kubernetes Performance Parameters:
  - Recommend and configure performance parameters for Kubernetes clusters, such as resource limits, requests, and autoscaling policies.
  - Ensure optimal performance of containerized applications running in Kubernetes environments.
- Resiliency Patterns:
  - Recommend and implement resiliency patterns like circuit breakers, rate limiters, and fallback mechanisms to enhance system reliability.
  - Validate the effectiveness of these patterns through testing and monitoring.
- Documentation and Training:
  - Document performance testing methodologies, tools, and best practices.
  - Provide training and support to development and operations teams on performance and resiliency best practices.
- Continuous Improvement:
  - Continuously evaluate and improve performance testing and monitoring processes, staying updated with the latest performance engineering tools, techniques, and industry trends.
Qualifications:
- Experience with containerization technologies like Docker.
- Strong scripting skills in languages such as Bash and Python.
- Effective problem‑solving and analytical skills.
- Familiarity with observability and APM tools such as Splunk, ELK, and AppDynamics.
- Good understanding of architecture patterns and resiliency.
- Programming experience in Java and Spring Boot.
- Strong microservices application support experience.
- Proficient understanding of algorithms, data structures, architectural design patterns, and best practices.
- Experience with Cloud is required.
- Experience working with applications running on the Kubernetes platform.
- Understanding of networking concepts, including DNS, load balancing, firewalls, and VPNs.
EEO Employer. Apex Systems is an equal opportunity employer. We do not discriminate or allow discrimination on the basis of race, color, religion, creed, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), age, sexual orientation, gender identity, national origin, ancestry, citizenship, genetic information, registered domestic partner status, marital status, disability, status as a crime victim, protected veteran status, political affiliation, union membership, or any other characteristic protected by law.
Apex will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law.
Apex offers a range of supplemental benefits, including medical, dental, vision, life, disability, and other insurance plans that offer an optional layer of financial protection. We offer an ESPP (employee stock purchase program) and a 401(k) program, which typically allows you to begin contributing within 30 days of starting, with a company match after 12 months of tenure. Apex also offers an HSA and a SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions.
In terms of professional development, Apex hosts an on‑demand training program, provides access to certification prep and a library of technical and leadership courses once you have 6+ months of tenure, and offers certification discounts and other perks to associations that include CompTIA and IIBA.
Apex Systems is part of the Commercial Segment of ASGN Incorporated.
NYSE: ASGN
4400 Cox Road
Suite 200
Glen Allen, Virginia 23060