Research Scientist - CAST Propensity
Listed on 2025-11-18
Research/Development
Research Scientist, Data Scientist
AI Security Institute is the world’s largest, best‑funded team dedicated to understanding advanced AI risks and translating that knowledge into action. We work with frontier developers and governments globally, and we have direct lines to the UK government, including the Prime Minister’s office.
Role Summary
As part of the Cyber & Autonomous Systems Team (CAST), you will study unprompted or unintended model behaviour – the propensity of a model to cause harm. The current project focuses on effect sizes of environmental factors on these propensities, for example whether models are more inclined to take harmful actions when their existence is threatened. You will help scale this work across different scenarios, design experiments, and provide empirical evidence with strong scientific credibility.
Research Questions We Address
- How can we define when a change in one scenario is the “same” as another change in a different scenario, so that we can determine consistent, context‑independent effects?
- How can we iterate on scenario specifications to avoid bugs and over‑fitting while maintaining statistical validity?
- What research questions transfer to future, more capable models that have not yet been developed? What are the propensity analogues of the clear capability trends seen in large language models?
The team currently has one research scientist and two research engineers, and we are looking to add a second research scientist to tackle the challenges described above. You will discuss and write plans and designs, and code or review implementations of those designs.
Ideal Candidate Skills
- Proven ability to identify and operationalise key uncertainties in a research area, and to propose and improve experimental approaches for collecting evidence.
- Knowledge of statistical inference methods to draw risk‑relevant and action‑guiding conclusions.
- Critical engagement with existing or proposed research methodology, assessing its impact on conclusions and adapting proposals where needed.
- Strong Python skills for developing and iterating on Inspect tasks (on‑the‑job learning of Inspect is acceptable).
- Solid understanding of transformer architecture and training dynamics to analyse and predict behaviour (hands‑on MLE tasks like fine‑tuning or RL not required).
- 3+ years in a quantitative research discipline (PhD student, data scientist, or researcher) involving experimental design and analysis.
- Experience writing production‑quality Python code, collaboratively or independently.
- Professional, educational, or serious hobbyist contact with large language models and transformer theory.
- Mission‑driven colleagues.
- Direct influence on AI governance and deployment worldwide.
- Work with the Prime Minister’s AI Advisor and leading AI companies.
- Opportunity to shape the first, best‑resourced public‑interest research team focused on AI security.
- Pre‑release access to frontier models and ample compute.
- Operational support to focus on research and rapid delivery.
- Collaboration with national security, policy, AI research, and adjacent science experts.
- Early ownership of important problems.
- 5 days of learning & development, annual stipends, and conference funding.
- Freedom to pursue research bets without product pressure.
- Opportunities to publish and collaborate externally.
- Modern central London office (or similar government offices in Birmingham, Cardiff, Darlington, Edinburgh, Salford or Bristol). Hybrid working available.
- At least 25 days annual leave, 8 public holidays, team breaks and 3 volunteering days.
- Generous paid parental leave (36 weeks statutory + 3 extra paid weeks + optional unpaid leave).
- Employer contributes 28.97% of base salary to pension.
- Additional benefits for cycling, donations, and retail/gyms.
Annual salary is benchmarked to role scope and experience, ranging from £65,000 to £145,000 (base salary plus technical allowance). An additional 28.97% employer pension contribution applies to base salary.
- Level 3: £65,000–£75,000 (Base £35,720 + Allowance £29,280–£39,280)
- Level 4: £85,000–£95,000…