Senior Research Scientist (LLMs)
Listed on 2026-01-24
IT/Tech
AI Engineer, Data Scientist, Artificial Intelligence
Aldea is a multi-modal foundational AI company reimagining the scaling laws of intelligence. We believe today's architectures create unnecessary bottlenecks for the evolution of software. Our mission is to build the next generation of foundational models that power a more expressive, contextual, and intelligent human–machine interface.
The Role
We are hiring a Foundational AI Research Scientist (LLMs) to pioneer next‑generation large‑language‑model architectures. Your work will focus on designing, prototyping, and empirically validating efficient transformer variants and attention mechanisms that can scale to production‑grade systems.
You'll explore cutting‑edge ideas in efficient sequence modeling, architecture design, and distributed training—building the foundations for Aldea's next‑generation language models. This role is ideal for researchers who combine deep theoretical grounding with hands‑on systems experience.
What You'll Do
- Research and prototype sub‑quadratic attention architectures to unlock efficient scaling of large language models.
- Design and evaluate efficient attention mechanisms including state‑space models (e.g., Mamba), linear attention variants, and sparse attention patterns (a minimal sketch of one such mechanism follows this list).
- Lead pre‑training initiatives across a range of model scales from 1B to 100B+ parameters.
- Conduct rigorous experiments measuring the efficiency, performance, and scaling characteristics of novel architectures.
- Collaborate closely with product and engineering teams to integrate models into production systems.
- Stay at the forefront of foundational research and help shape Aldea's long‑term model roadmap.
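For flavor, the sketch below shows one sub‑quadratic mechanism of the kind this role explores: kernelized linear attention in the style of Katharopoulos et al. (2020). It is a minimal, non‑causal illustration; the elu+1 feature map, tensor shapes, and function name are assumptions chosen for the example, not Aldea's implementation.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """Kernelized linear attention: O(n) in sequence length n.

    q, k, v: (batch, heads, seq_len, head_dim). Non-causal variant;
    a causal version would use a running prefix sum over `kv`.
    """
    # elu(x) + 1 keeps features positive, a common choice of feature map.
    q = F.elu(q) + 1.0
    k = F.elu(k) + 1.0
    # Aggregate keys against values once: (b, h, d, d) -- linear in seq_len,
    # instead of the (b, h, n, n) score matrix softmax attention builds.
    kv = torch.einsum("bhnd,bhne->bhde", k, v)
    # Per-query normalizer: phi(q) dotted with the sum of phi(k) over positions.
    z = torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + eps
    return torch.einsum("bhnd,bhde->bhne", q, kv) / z.unsqueeze(-1)

# A 4k-token sequence: memory and compute grow linearly with length.
q = k = v = torch.randn(1, 8, 4096, 64)
print(linear_attention(q, k, v).shape)  # torch.Size([1, 8, 4096, 64])
```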
What You'll Need
- Ph.D. in Computer Science, Engineering, or a related field.
- 3+ years of relevant industry experience.
- Deep understanding of modern sequence modeling architectures including State Space Models (SSMs), Sparse Attention mechanisms, Mixture of Experts (MoE), and Linear Attention variants.
- Hands‑on experience pre‑training large language models across a range of scales (1B+ parameters).
- Expertise in PyTorch, Transformers, and large‑scale deep‑learning frameworks.
- Proven ability to design and evaluate complex research experiments.
- Demonstrated research impact through patents, deployed systems, or core‑model contributions.
Nice to Have
- Experience with distributed training frameworks and multi‑node optimization.
- Knowledge of GPU acceleration, CUDA kernels, or Triton optimization.
- Publication record in top‑tier ML venues (NeurIPS, ICML, ICLR) focused on architecture research.
- Experience with model scaling laws and efficiency‑performance tradeoffs (see the toy calculation after this list).
- Background in hybrid architectures combining attention with alternative sequence modeling approaches.
- Familiarity with training stability techniques for large‑scale pre‑training runs.
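As a toy illustration of the scaling‑law reasoning referenced above, the sketch below assumes the parametric loss of Hoffmann et al. (2022) ("Chinchilla"), L(N, D) = E + A/N^alpha + B/D^beta with compute C ≈ 6·N·D, using their published Approach 3 constants. The numbers are illustrative only and say nothing about Aldea's models.

```python
# Published Chinchilla (Approach 3) fit constants -- illustrative only.
E, A, B, ALPHA, BETA = 1.69, 406.4, 410.7, 0.34, 0.28

def compute_optimal(C):
    """Loss-minimizing (params N, tokens D) for a FLOP budget C ~ 6*N*D."""
    # Setting dL/dN = 0 subject to D = (C/6)/N gives a power law in C.
    G = (ALPHA * A / (BETA * B)) ** (1 / (ALPHA + BETA))
    N = G * (C / 6) ** (BETA / (ALPHA + BETA))
    return N, (C / 6) / N

def loss(N, D):
    return E + A / N**ALPHA + B / D**BETA

# Example: how should a 1e23-FLOP budget be split between model size
# and training tokens under this fit?
N, D = compute_optimal(1e23)
print(f"N = {N:.2e} params, D = {D:.2e} tokens, loss = {loss(N, D):.3f}")
```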
Benefits
- Performance‑based bonus aligned with research and model milestones
- Equity participation
- Flexible Paid Time Off
- Comprehensive health, dental, and vision coverage
Aldea is proud to be an equal‑opportunity employer. We are committed to building a diverse and inclusive culture that celebrates authenticity to win as one. We do not discriminate on the basis of race, religion, color, national origin, gender, gender identity, sexual orientation, age, marital status, disability, protected veteran status, citizenship or immigration status, or any other legally protected characteristics.
Aldea uses E‑Verify to confirm employment eligibility in compliance with federal law. For more information please visit: https://www.e-verify.gov
Please note:
We do not accept unsolicited resumes from recruiters or employment agencies and will not be responsible for any fees related to unsolicited resumes.