Lead, Technical Operations Center
Listed on 2026-01-12
IT/Tech
Cloud Computing, IT Project Manager, Systems Administrator
#LI-Onsite
Who We Are
Founded in 2005, the 2K label includes some of the most talented game development studios in the world today, including Firaxis Games, Visual Concepts, Hangar 13, 2K Czech and Cat Daddy Games. Our world-class team of engineers, developers, graphic artists and publishing professionals are stewards of a growing library of critically‑acclaimed franchises such as Battleborn, BioShock, Borderlands, The Darkness, Mafia, NBA 2K, Sid Meier’s Civilization, WWE 2K, and XCOM. 2K is headquartered in Novato, California and is a wholly owned label of Take‑Two Interactive Software, Inc. (NASDAQ: TTWO).
2K develops and publishes interactive entertainment globally for console systems, handheld gaming systems and personal computers, including smartphones and tablets, which are delivered through physical retail, digital download, online platforms, and cloud streaming services. 2K publishes titles in today’s most popular gaming genres, including shooters, action, role‑playing, strategy, sports, casual, and family entertainment.
Our vision at 2K is to create a diverse and inclusive environment to “Come as You are and Feel Equipped to do Your Best Work!” We are dedicated to promoting diversity, multiculturalism, and equality in all that we do. Our communities are focused on increased access and personal growth, and their greatness depends on a diversity of race, gender, sexual orientation, religion, ethnicity, national origin, and perspective.
We're an equal opportunity employer, and we're excited to build the future of co‑living with the world's most hardworking and passionate people.
We are seeking a highly motivated and experienced Technical Operations Center Lead to manage and mentor our 24/7 Technical Operations Center team. This role is the lynchpin of our live service operations, critical for maintaining the high availability, performance, and reliability of our global game infrastructure.
The ideal candidate is a composed, decisive leader with deep technical expertise in incident, problem, and service request management. They must be adept at balancing immediate, high‑pressure incident response with strategic, long‑term process improvements to optimize all operational workflows and service delivery.
What You Will Do
- Lead the daily operations of the 24/7 TOC team, including prioritization and execution of work, emergency response, and ad hoc duties.
- Serve as the primary Incident Commander during major production outages, owning the incident life cycle from detection and triage to resolution and executive notification.
- Manage Service Request fulfillment within the TOC, ensuring that internal requests (e.g., service restarts, access grants, environmental data refreshes) are prioritized, documented, and executed efficiently by the team.
- Champion the Problem Management process by analyzing trends in recurring incidents, driving Root Cause Analysis, and tracking permanent corrective actions to resolution across engineering teams.
- Develop, maintain, and facilitate operational procedures, escalation matrices, and comprehensive runbooks for all critical game services and infrastructure.
- Oversee and optimize our monitoring, alerting, and logging platforms (e.g., Datadog, Checkmk) to ensure effective coverage and minimize alert fatigue.
- Collaborate with SRE, Development, and QA teams to integrate new services into the TOC's operational scope and improve the observability of services.
- Mentor and train TOC Engineers and Analysts in advanced troubleshooting techniques, cloud infrastructure fundamentals, and effective service management principles.
- 5+ years of experience in a Technical Operations Center (TOC), Network Operations Center (NOC), Site Reliability Engineering (SRE), or similar operational role.
- 2+ years of demonstrated leadership or management experience overseeing a 24/7 team.
- Deep technical understanding and proven application of IT Service Management (ITSM) concepts, including Incident, Request, and Problem Management.
- Expertise in formal incident management methodologies (e.g., ITIL, SRE Incident Response).
- Deep technical understanding of cloud…