Job Description
As a software engineer on the Emergency Call Management site reliability engineering (ECM-SRE) team, you will join talented software engineers who work directly with product and engineering teams to continuously improve reliability across our suite of public safety products.
Your responsibilities will include:
Architecture and implementation of monitoring and observability objectives, including maintenance of alert response playbooks.
Creation and reinforcement of the HA and reliability strategy.
Triage of customer-reported incidents and problems to the proper software team, requiring troubleshooting and problem management skills.
Maintenance and reporting of SLOs and error budget.
Facilitation of Chaos Engineering activities with multiple engineering teams.
Development of the SRE culture and sharing of best practices across Motorola Solutions’ Emergency Call Management organization.
On-call support alongside multiple engineering teams for products and services in production. The role centers on Incident Command, keeping the incident process focused and on course, which is essential to meeting regulatory reporting requirements.
Assistance to Motorola Solutions’ customer support teams in creating customer-facing communication documents, requiring strong communication skills.
Facilitation of Failure Mode and Effects Analysis with multiple engineering teams.
The right individual will have a passion for observability, reliability, automation, incident response, and enabling innovation.
Qualifications:
BS in Computer Engineering (or equivalent degree)
4+ years of professional software development
Excellent communication skills
Experience developing cloud-based applications
Experience developing REST-based APIs and implementing microservice principles and architectures
Experience with modern DevOps tooling (including CI/CD pipelines)
Familiarity with the concepts involved in designing a high availability architecture
Familiarity with observability and monitoring
Familiarity with automated testing
Creativity and persistence when solving complex problems
Enthusiasm for learning key technologies, architectures, processes, and best practices
Preferred Skills:
Familiarity with SRE or DevOps
Familiarity with container deployment and orchestration technologies at scale
Familiarity with SLOs and SLIs
Familiarity with incident response, disaster recovery, root cause analysis, and postmortems
Familiarity with IaC
Familiarity with chaos engineering
Familiarity with redundancy and failovers
Familiarity with capacity planning and load balancing
Familiarity with service mesh
Familiarity with feature flags, canary releases, or blue/green deployments
Familiarity with hybrid cloud architecture
Familiarity with developing cloud-based applications with a multi-tenant database architecture
Familiarity with systems programming (network stack, file system, OS services) and networking (L2 vs. L3, network architecture, VLANs, etc.)
Experience working in Agile teams leveraging Scrum, Kanban, or other methodologies and/or understanding of Agile development concepts
Experience being on-call for a product in production