Senior/Software Engineer, Super Compute Memory
Listed on 2026-03-01
IT/Tech
AI Engineer, Systems Engineer, Machine Learning/ ML Engineer, Cloud Computing
Overview
About Pryon: We’re a team of AI, technology, and language experts whose DNA lives in Alexa, Siri, Watson, and virtually every human language technology product on the market. Pryon is building an industry-leading knowledge management and Retrieval-Augmented Generation (RAG) platform. Our proprietary, cutting-edge natural language processing capabilities transform unstructured data into meaningful experiences that increase productivity with unmatched accuracy and speed.
Pryon is building one of the industry's most ambitious cloud-native AI infrastructure platforms: a petabyte-scale ingestion and inference system powering mission-critical government and enterprise deployments. We need an Engineering Manager who excels at designing distributed systems for large-scale AI memory workloads in modern cloud and on-prem environments. You'll lead the team building our ingestion, retrieval, and inference layers, ensuring scalability, reliability, and compliance while navigating the ambiguity inherent in a fast-growing startup.
You will be a founding member of our Super Compute Memory (SCM) team, reporting to the VP of Engineering. This team's charter is to build the high-performance computing infrastructure that enables Pryon's AI memory layer to scale to petabytes of knowledge while maintaining real-time retrieval performance.
This is a high-visibility role with significant ownership. You'll work closely with the Research, AI/ML Engineering, and Platform teams.
In This Role You Will
- Build and lead a team delivering cloud-native ingestion, retrieval, and inference layers that will power mission-critical deployments for commercial and federal entities with millions of public users.
- Architect and deliver scalable, fault-tolerant distributed systems capable of handling billions of documents and burst loads of 30K+ concurrent users on managed cloud infrastructure and on-premises deployments.
- Guide implementation of multimodal ingestion pipelines (PDF, HTML, DOCX, JSON, XML, PPTX, TIFF) optimized for cloud-scale AI memory workloads.
- Oversee design and optimization of LLM-driven data ingestion and retrieval workflows using modern orchestration frameworks.
- Own optimization and tuning of high-throughput, low-latency production environments via async orchestration and resource management.
- Establish performance benchmarking, compliance frameworks, and automated testing strategies for petabyte-scale systems.
- Balance technical leadership with people leadership—guiding architecture decisions at the application and service layer while scaling and mentoring a high-performing team.
- Collaborate cross-functionally with Product, Executive Leadership, and Customer Success in a dynamic startup environment.
- 10+ years in software engineering, 5+ years in management roles delivering large-scale AI/ML systems and cloud infrastructure.
- Expert-level proficiency in Python, with strong experience in at least one systems language (Go, Rust, C++, or Java).
- 5+ years building production-grade distributed systems on cloud platforms (AWS, GCP, or Azure).
- Hands-on experience with modern ML orchestration frameworks (Ray, Kubeflow, Airflow, or similar open-source tools).
- Production experience with vector databases (Pinecone, Weaviate, Qdrant, Milvus, or similar).
- Deep understanding of message queuing and streaming systems (Kafka, Pulsar, RabbitMQ, Kinesis).
- Proven track record designing and operating scalable, fault-tolerant distributed architectures in cloud environments.
- Direct experience building multimodal ingestion pipelines for knowledge management platforms.
- Experience optimizing LLM inference and retrieval workloads at the application/framework level (PyTorch, TensorFlow, vLLM, or similar).
- Previous success managing engineering teams delivering production-scale AI infrastructure in startup or high-growth environments.
- Deep understanding of cloud-native distributed systems architecture: compute orchestration (Kubernetes/EKS/GKE), storage systems, networking, observability, security, disaster recovery, and cost optimization.
- Strong knowledge of AI memory and knowledge management system design patterns,…