IT Systems Administrator
Listed on 2026-01-12
IT/Tech
IT Support, Systems Engineer, Cybersecurity
Positron.ai specializes in developing custom hardware systems to accelerate AI inference. These inference systems offer significant performance and efficiency gains over traditional GPU-based systems, delivering advantages in both performance per dollar and performance per watt. Positron exists to create the world's best AI inference systems.
About the role
We’re hiring an IT Systems Administrator to own the on‑prem environment that powers AI inference systems; keep an on‑prem compute cluster reliable, secure, and observable; support remote access (via VPN) for distributed teammates; and be the hands‑on owner of server room operations, storage, networking, virtualization, provisioning, and monitoring. This is a high‑impact IC role with broad scope across hardware, software, and documentation.
What you'll do
- Server room operations: Rack/unrack servers and network gear; manage cabling; configure PDUs; maintain accurate inventories and diagrams
- Storage & backups: Operate and harden NAS; manage NFS exports/mounts; implement/test backup/restore; enforce access controls
- Networking: Configure/maintain switches, routers, APs, and firewalls; manage VLANs, VPNs (incl. IPsec), DNS/DHCP/IPAM; monitor performance and security; troubleshoot connectivity; manage primary/backup ISPs; support Tailscale access
- Provisioning & config management: Maintain PXE/kickstart/UEFI workflows; automate OS/app configuration with Ansible; keep golden images and templates current
- Cluster & job infrastructure: Monitor cluster utilization and job health; troubleshoot failures/performance issues; plan/execute software and hardware upgrades
- Virtualization: Administer Proxmox (or similar); create/manage VMs and templates; monitor host/guest performance; triage virtualization issues
- Observability & incident response: Operate Prometheus/Grafana (and related exporters/alerts); create actionable alerts; analyze trends; run incident comms and postmortems; schedule and report maintenance windows
- Documentation & process: Maintain runbooks, SOPs, topology maps, and asset records (make/model/SN/tags/location/usage); champion repeatable, auditable operations
- 5+ years administering Linux systems in a mixed on‑prem environment (servers, switches/firewalls, NAS, SAN)
- Strong in Ethernet/IP, VLANs, firewalls/VPNs, DNS/DHCP/NTP; confident with Ansible, PXE, Bash, and Git
- Hands‑on with NFS/NAS, snapshots/replication, and backup/restore drills
- Experience with virtualization (Proxmox/KVM/ESXi), VM templating, and host lifecycle management
- Monitoring/alerting with Prometheus/Grafana (or equivalent), plus log collection and dashboarding
- Clear documentation habits; steady incident responder with on‑call experience
- Tailscale administration; IPsec tunnels; Proxmox clustering and Ceph; L2/L3 switch config (e.g., VLAN trunks, LACP); Terraform; secrets management; hardware automation (Redfish/IPMI)
- Familiarity with SLURM or job schedulers; GPU server care and feeding; basic Python for ops tooling
- Must work in Spokane, WA facility for racking, wiring, and inventories
- Ability to lift/move ~50 lb servers; follow ESD and safety best practices
Your work keeps our engineers productive and our systems dependable—shortening time-to-result for ASIC/FPGA/DV and software teams while raising our security, reliability, and velocity.
Equal Opportunity Employer. If you’re excited about the role but don’t meet every bullet, we’d