Infrastructure Services Director - TALON
Listed on 2026-01-12
IT/Tech
Systems Engineer, Cybersecurity, Cloud Computing, IT Project Manager
Tyto Athene is searching for a high-caliber Infrastructure Services Director to spearhead the establishment and operation of our high-performance AI R&D Lab/Data Center, the Technology Acceleration Lab for Operational Needs (TALON). This strategic role is critical for delivering high-quality, self-service infrastructure that empowers our AI R&D teams to rapidly develop and test mission-oriented solutions, including advanced defensive and mission cyber AI technologies.
This leader must blend strategic planning, deep technical expertise (HPC/GPU), an unyielding commitment to CMMC compliance, and a strong focus on Site Reliability Engineering (SRE) and DevOps principles to ensure secure, efficient, and reliable service delivery. A core mandate is to manage the Service Catalog and implement processes that allow developers to “go fast” while adhering to strict security and operational guardrails.
- Lead Data Center Hardware and Software Acquisition: Finalize labor requirements, and coordinate with OEMs, VARs, software vendors, and partners to build compute and transport infrastructure in the TALON lab.
- Operationalize Data Center: Oversee the delivery, receipt, installation, racking/stacking, configuration, and integration of infrastructure, and make it available for service.
- Manage TALON Data Center in Dulles, VA: Apply DevOps principles to operations, configuration management, and the entire service management lifecycle for all assets within the TALON on-premises data center (including specialized GPU-based infrastructure), remote nodes, and cloud environments.
- Attain CMMC Accreditation for TALON environments: Establish and drive the plan to attain CMMC accreditation for the TALON environments, including data center, remote node, and cloud environments, while future-proofing the infrastructure strategy to embrace future needs such as NIPR/SIPR/JWICS interconnects. The plan will support needs for persisting and training models on customer-provided data sets.
- Cyber Network Strategy: Design, implement, and operate network segments and associated infrastructure to securely meet the unique needs of TALON AI cyber projects, covering both defensive and mission cyber considerations.
- Serve as Technical Lead and administrator for the TALON Data Center and TALON lab IT infrastructure.
- Maintain data center, audio-visual, Wi-Fi, software, and all lab IT infrastructure.
- Cloud platforms: Plan, provision, and optimize AWS/Azure/GCP (compute, networking, IAM, cost control); enforce guardrails and landing zones. Experience managing IL 2/4/5/6 environments.
- Networks & OT connectivity: Design and secure LAN/WAN/SD-WAN/Wi‑Fi and firewalls. Must have experience managing NIPR and SIPR, and high‑level knowledge of JWICS networks.
- Cybersecurity & compliance: Implement zero‑trust controls, patching, identity, logging/SIEM, and audit readiness (NIST/ISO). Implement CMMC standards.
- Service management: Own the service catalog, SLAs, capacity planning, vendor contracts, and budget.
- Facilities interface: Coordinate with facilities on power, cooling, UPS/generators, and physical security for server rooms.
- 10+ years of experience with:
- Windows/Linux, virtualization, storage, backups, and disaster recovery, standardized via infrastructure‑as‑code and live dashboards.
- HPC cluster system administration, preferably in rapid AI and cyber solution prototyping environments.
- State‑of‑the‑art GPU technologies and their integration into HPC environments (driver management, software stack tools, monitoring, workload scheduling).
- InfiniBand, NVLink, NVQLink, Spectrum‑X (driver management, software stack tools, monitoring).
- Container platforms (e.g., Apptainer, Docker, OpenShift, Kubernetes, EKS).
- Familiarity and prior work experience with technologies such as Ansible, Git, Slurm, Zabbix, Prometheus, Grafana, and Docker.
- Slurm or other cluster schedulers, and configuration and management solutions.
- NFS, SMB, and distributed object, file, and block storage management and configuration.
- High‑performance parallel file system management and…