Engineering Manager, ML/Data Engineering; Content Trust
Listed on 2026-01-13
IT/Tech
Data Engineer, AI Engineer, Machine Learning/ ML Engineer, Data Science Manager
About The Company
At Scribd Inc. (pronounced “scribbed”), our mission is to spark human curiosity. Join our team as we create a world of stories and knowledge, democratize the exchange of ideas and information, and empower collective expertise through our four products: Everand, Scribd, Slideshare, and Fable.
This posting reflects an approved, open position within the organization. We support a culture where employees can be real and bold, debating and committing to customer priorities. We believe in flexible work while prioritizing intentional in‑person moments. All Scribd employees are required to occasionally attend in‑person meetings regardless of location.
About The Team And Role
The ML Data Engineering team is the backbone of Scribd’s commitment to a safe and trustworthy library. We build high‑throughput, ML‑driven data pipelines that process hundreds of millions of documents to detect, classify, and mitigate untrustworthy content.
As the Manager of ML Data Engineering, you will lead a specialized team of engineers responsible for building scalable ML‑based foundations that detect and deal with harmful content. Your work ensures that safety classifiers and automated policy enforcement tools are performant, scalable, and resilient. You will sit at the intersection of Big Data, AI, MLOps, and Platform Integrity, directly impacting the safety of millions of our users.
You Will
- Lead and grow a high‑performing engineering team: manage, mentor, and recruit a world‑class team of data and ML engineers. Foster a culture of technical excellence, operational rigor, and deep empathy for the user safety mission.
- Architect scalable ML data pipelines: design and oversee the development of distributed data processing systems capable of handling hundreds of millions of documents. Ensure these pipelines support both batch and real‑time inference for content moderation and risk detection.
- Build the “Trust” scores: develop and maintain foundational data layers – including semantic embeddings, metadata extracts, and behavioral signals – that power our Content Trust ML models.
- Partner on AI/LLM integration: work closely with Search & Discovery and Applied Research teams to integrate ML/LLM‑based reasoning into our trust pipelines, enabling more nuanced understanding of complex policy violations.
- Drive operational excellence: establish SLAs for infrastructure, ensuring our automated enforcement systems are fast and explainable.
- Cross‑functional leadership: collaborate with Product Managers (Content Trust), Legal/Policy teams, and Data Science to translate evolving regulatory requirements (like the DSA) into robust technical architectures.
- Leadership Experience: 8+ years of total engineering experience, with 3+ years specifically in a people management or technical lead role within a Data or ML Engineering organization.
- Scale Expertise: proven track record of building and operating production‑grade data pipelines at massive scale (100M+ entities) using technologies like Spark, Flink, Kafka, or Airflow.
- ML Infrastructure Fluency: deep understanding of the ML lifecycle, including deployment (MLOps), and vector databases (e.g., Pinecone, Milvus, or Weaviate).
- Trust & Safety Context: prior experience building systems for content moderation, fraud detection, spam prevention, or digital rights management.
- Technical Breadth: strong proficiency in Python, Scala, or Go, and experience with cloud‑native infrastructure (AWS/GCP, Kubernetes, and Snowflake/BigQuery).
- Strategic Communication: ability to explain complex architectural trade‑offs to non‑technical stakeholders in Legal, Policy, and Product.
- LLM Pipelines: experience building RAG (Retrieval‑Augmented Generation) pipelines or managing the data infra for fine‑tuning Large Language Models.
- UGC Experience: background working with large‑scale User Generated Content (UGC) ecosystems and the unique challenges of unstructured document data.
- Regulatory Knowledge: familiarity with the technical requirements of global safety regulations such as the Digital Services Act (DSA) or the UK Online Safety Act.
- Adversarial Mindset: experience building systems…