Senior Researcher - Text to Speech | San Francisco
Listed on 2026-01-10
IT/Tech
AI Engineer, Data Scientist, Machine Learning/ ML Engineer, Artificial Intelligence
What you’ll do
Lead research on Text-to-Speech models focused on naturalness, expressiveness, latency, and robustness
Design and train TTS systems for real-world voices across accents, languages, and speaking styles
Improve streaming and low-latency speech synthesis pipelines
Experiment with architectures, loss functions, and data strategies (multi-speaker training, style modeling, distillation, data augmentation)
Translate research ideas into production-ready TTS systems
Collaborate closely with infra, product, and voice engineering teams
Strong background in Text-to-Speech / speech generation research
Hands-on experience with deep learning frameworks (PyTorch preferred)
Experience with real-time or low-latency TTS systems
Familiarity with modern TTS architectures (Tacotron-style, FastSpeech, VITS, diffusion-based, neural vocoders)
Ability to think end-to-end: data → model → inference → deployment
Prior work in multilingual, expressive, or accented speech synthesis is a strong plus
Publications in top speech / ML conferences
Experience deploying TTS models in real-time production
Exposure to conversational AI or voice agents
3–6 years of specialized experience in speech through academia or industry
Master’s or PhD in Speech, ML, or a related field
Note: We often make exceptions and hire brilliant candidates regardless of years of experience or education.
Proof of work is paramount.