Senior/Software Engineer, Super Compute Memory Job, Chicago area, Illinois, USA, IT/Tech

Position: Senior/Staff Software Engineer, Super Compute Memory

Overview

About Pryon: We’re a team of AI, technology, and language experts whose DNA lives in Alexa, Siri, Watson, and virtually every human language technology product on the market. Pryon is building an industry-leading knowledge management and Retrieval-Augmented Generation (RAG) platform. Our proprietary, cutting-edge natural language processing capabilities transform unstructured data into meaningful experiences that increase productivity with unmatched accuracy and speed.

Pryon is building one of the industry's most ambitious cloud-native AI infrastructure platforms: a petabyte-scale ingestion and inference system powering mission-critical government and enterprise deployments. We need an Engineering Manager who excels at designing distributed systems for large-scale AI memory workloads in modern cloud and on-prem environments. You'll lead the team building our ingestion, retrieval, and inference layers, ensuring scalability, reliability, and compliance while navigating the ambiguity inherent in a fast-growing startup.

You will be a founding member of our Super Compute Memory (SCM) team, reporting to the VP of Engineering. This team's charter is to build the high-performance computing infrastructure that enables Pryon's AI memory layer to scale to petabytes of knowledge while maintaining real-time retrieval performance.

This is a high-visibility role with significant ownership. You'll work closely with the Research, AI/ML Engineering, and Platform teams.

In This Role You Will

Build and lead a team delivering cloud-native ingestion, retrieval, and inference layers that will power mission-critical deployments for commercial and federal entities with millions of public users.
Architect and deliver scalable, fault-tolerant distributed systems capable of handling billions of documents and burst loads of 30K+ concurrent users on managed cloud infrastructure and on-premises deployments.
Guide implementation of multimodal ingestion pipelines (PDF, HTML, DOCX, JSON, XML, PPTX, TIFF) optimized for cloud-scale AI memory workloads.
Oversee design and optimization of LLM-driven data ingestion and retrieval workflows using modern orchestration frameworks.
Own optimization and tuning of high-throughput, low-latency production environments via async orchestration and resource management.
Establish performance benchmarking, compliance frameworks, and automated testing strategies for petabyte-scale systems.
Balance technical leadership with people leadership—guiding architecture decisions at the application and service layer while scaling and mentoring a high-performing team.
Collaborate cross-functionally with Product, Executive Leadership, and Customer Success in a dynamic startup environment.

What You Need to Be Successful

10+ years in software engineering, 5+ years in management roles delivering large-scale AI/ML systems and cloud infrastructure.
Expert-level proficiency in Python, with strong experience in at least one systems language (Go, Rust, C++, or Java).
5+ years building production-grade distributed systems on cloud platforms (AWS, GCP, or Azure).
Hands-on experience with modern ML orchestration frameworks (Ray, Kubeflow, Airflow, or similar open-source tools).
Production experience with vector databases (Pinecone, Weaviate, Qdrant, Milvus, or similar).
Deep understanding of message queuing and streaming systems (Kafka, Pulsar, RabbitMQ, Kinesis).
Proven track record designing and operating scalable, fault-tolerant distributed architectures in cloud environments.
Direct experience building multimodal ingestion pipelines for knowledge management platforms.
Experience optimizing LLM inference and retrieval workloads at the application/framework level (PyTorch, TensorFlow, vLLM, or similar).
Previous success managing engineering teams delivering production-scale AI infrastructure in startup or high-growth environments.

Technical Depth

Deep understanding of cloud-native distributed systems architecture: compute orchestration (Kubernetes/EKS/GKE), storage systems, networking, observability, security, disaster recovery, and cost optimization.
Strong knowledge of AI memory and knowledge management system design patterns,…

