IC Net Engineer, Operations/Reliability
Listed on 2026-03-01
IT/Tech
Systems Engineer, IT Support, Network Engineer, Cloud Computing
Location: Remote, with a preference for candidates based near major data infrastructure hubs in Texas or New York
Industry: Infrastructure Technology / High-Performance Computing
Job Type: Full-Time
About the Company:
Our client is a cutting-edge technology innovator enabling the next generation of artificial intelligence infrastructure. Their mission is to support the world’s leading research institutions and enterprise clients in accelerating machine learning workloads and optimizing compute. With operations expanding across the U.S. and beyond, they are scaling their network and datacenter operations team to ensure performance and reliability across all deployments.
This is a rare opportunity to get in on the ground floor of a transformative industry player driving progress in AI, machine learning, and large-scale systems engineering.
Position Overview:
This role is tailor-made for hands-on network operators who want both strategic ownership and real-world execution in the field. As the IC Net Engineer focused on Operations and Reliability, you will be the go-to expert for a designated region, acting as a key node between the core network operations team and on-the-ground data infrastructure. You’ll support incident response, coordinate hardware repairs, validate deployments, and shape the future of operational runbooks.
This is a critical role requiring high accountability, technical depth, and a collaborative mindset.
Key Responsibilities:
- Serve as the primary operational lead for datacenter network performance in your assigned region, owning incident response, reliability, and repair execution.
- Act as Tier 2/3 escalation for network-related incidents, troubleshooting physical and logical issues and ensuring timely resolution.
- Partner with remote NOC teams and regional hardware specialists to drive incident closure and prevent recurrences.
- Manage field testing, diagnostics, and support dashboards to ensure system-wide observability.
- Work with cross-functional teams (Deployment, Hardware, Logistics) to ensure seamless network integration during datacenter expansions and new pod deployments.
- Drive the execution and evolution of runbooks for repair and non-repair tasks, identifying improvement opportunities and updating documentation accordingly.
- Build close working relationships with vendors, field techs, and internal teams, representing the voice of network engineering in operational scenarios.
Preferred Candidate Profile:
- 5-8 years of hands-on network engineering and data center operations experience, particularly in production environments where uptime is mission-critical.
- Deep familiarity with modern data center networking concepts including EVPN/VXLAN, BGP, Clos architectures, and high-throughput switching systems.
- Demonstrated strength in debugging Layer 2/3 networking issues, BGP pathing, hardware faults, and signal integrity.
- Proficient in basic scripting (Python, Jupyter) and data tools (SQL, Grafana, Tableau) to support operational metrics and observability.
- Strong incident leadership skills with the ability to remain calm, clear, and effective during time-sensitive outages.
- Prior experience in matrixed environments requiring coordination across teams, locations, and time zones.
- Able to adapt to hybrid environments with roughly 30-40% regional travel depending on operational demand.
Bonus Experience:
- Prior exposure to high-performance computing networks (HPC, AI/ML fabrics, RDMA) and lossless Ethernet.
- Familiarity with automation frameworks like Ansible or custom scripting for operational efficiencies.
- Experience managing or mentoring others in field roles or regional leadership.
What’s In It For You:
- Join a high-growth, mission-driven team at the forefront of AI infrastructure.
- Flexible work environment with remote-first policies balanced by essential onsite visits.
- Generous compensation package including competitive base, equity, and full health benefits.
- Opportunity to grow into a regional leadership role as the team scales.
- Direct impact on the performance and reliability of cutting-edge systems powering the future of intelligence.
About Blue Signal:
Blue Signal is an award-winning executive search firm covering a range of specialties. Our recruiters have a proven track record of placing top-tier talent across industry verticals, with deep expertise in numerous professional services.