Engineering Manager - Aggregates Systems
Listed on 2026-01-16
IT/Tech
Systems Engineer, Cybersecurity
About the Role
Abnormal AI is seeking an experienced Engineering Manager to lead our Aggregates Systems team, which owns some of the most mission-critical infrastructure in the company. These systems sit at the core of our detection architecture, transforming raw behavioral signals into high-quality aggregate datasets that directly power security decisions across all Abnormal products.
The Aggregates Systems operate at extreme scale, processing tens of billions of events per day and approaching one million events per second at peak. Correctness, availability, and timeliness are non-negotiable. In this role, you will manage and grow a team of engineers responsible for building, operating, and evolving large-scale batch and real-time aggregation systems that Abnormal’s detection, messaging, and customer trust depend on. You will partner closely with engineering, data science, product, and infrastructure teams to ensure these systems remain reliable, performant, and scalable as the company grows.
You’ll Do
Technical & Systems Leadership
- Own the technical direction, reliability, and long-term evolution of aggregation systems spanning both batch and real-time processing.
- Guide architectural decisions for distributed data processing, storage, and retrieval systems with strict correctness, latency, and availability requirements.
- Ensure aggregation systems consistently meet SLAs for data freshness, accuracy, and uptime across detection and messaging use cases.
- Act as a senior technical steward for systems whose failure or inaccuracy would have direct customer and security impact.
- Manage, mentor, and develop a team of software engineers working across batch and streaming aggregation systems.
- Foster a culture of technical excellence, operational ownership, and continuous improvement.
- Support career growth through coaching, feedback, and clear performance expectations.
- Partner with Detection Engineering, Data Science, Product, and Infrastructure teams to translate detection requirements into scalable aggregation capabilities.
- Collaborate with stakeholders to define roadmaps that balance feature delivery with system stability and long-term maintainability.
- Serve as a technical and organizational point of accountability for aggregation systems.
- Drive strong operational practices, including monitoring, alerting, incident response, and post-incident analysis.
- Improve observability, data quality validation, and resiliency across batch and real-time systems.
- Ensure systems scale efficiently as data volume, customer footprint, and use cases expand.
Aggregates Systems
- In Online systems, the team operates low-latency aggregation systems designed to process high-throughput event data with strict timeliness guarantees.
- Composed of event producers, a Kafka-based event stream, and a signals stack including ingestion and storage update services.
- Enable fast, flexible retrieval of aggregate values for security and messaging applications.
- In Offline systems, the team operates batch processing systems that transform large volumes of raw behavioral signals (e.g., emails and user activity) into comprehensive aggregate datasets.
- Operate on scheduled cadences and support long lookback windows for historical analysis and trend detection.
- Play a foundational role in determining whether messages are safe or malicious by powering detection logic and models.
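For candidates less familiar with the term, the online path described above (events ingested from a stream, then aggregate values retrieved on demand) can be illustrated with a toy sliding-window counter. This is only a sketch for orientation and not a description of Abnormal's implementation: the class name SlidingWindowCounter, its methods, and the example key are hypothetical, and it keeps everything in one process's memory rather than using Kafka or a storage service.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Toy aggregate: events per key over a trailing time window (hypothetical)."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        # key -> event timestamps, oldest first
        self._events = defaultdict(deque)

    def ingest(self, key: str, timestamp: float) -> None:
        """Record one event. Assumes timestamps per key arrive in non-decreasing order."""
        self._events[key].append(timestamp)

    def count(self, key: str, now: float) -> int:
        """Count events for `key` newer than `now - window_seconds`.

        Assumes `now` is at or after the latest ingested timestamp for the key.
        """
        timestamps = self._events.get(key)
        if not timestamps:
            return 0
        cutoff = now - self.window_seconds
        # Evict events that have aged out of the window.
        while timestamps and timestamps[0] <= cutoff:
            timestamps.popleft()
        return len(timestamps)


counter = SlidingWindowCounter(window_seconds=60)
for ts in (0.0, 50.0, 90.0):
    counter.ingest("sender@example.com", ts)

# At t=90 the cutoff is 30, so the event at t=0 has expired.
print(counter.count("sender@example.com", now=90.0))  # 2
```

A production system at the scale described would shard this state across services and persist it, but the ingest-then-retrieve shape is the same idea.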
Required Qualifications
- 5+ years of software engineering experience, with 2+ years in an engineering management role.
- Strong experience designing and operating distributed, data-intensive systems.
- Hands-on familiarity with batch processing and/or real-time streaming systems (e.g., Kafka, Spark, Flink, Beam).
- Proven ability to lead complex, high-impact technical initiatives and make sound architectural tradeoffs.
- Excellent communication skills and experience working cross-functionally.
- Experience building or owning mission-critical aggregation or signals systems.
- Familiarity with data correctness, quality guarantees, and observability in…