Research Scientist, Machine Learning (NYC/Remote)
Location: Remote or New York City, US
Organization: Poseidon Research
Compensation: $100,000–$150,000 annually, or higher depending on experience
Type: One-year contract
Poseidon Research is an independent AI safety laboratory based in New York City. Our mission is to make advanced AI systems transparent, trustworthy, and governable through deep technical research in interpretability, control, and secure monitoring. We investigate how models think, hide, and reason: from understanding encoded reasoning and steganography in reasoning models to building open-source monitoring tools that preserve human oversight.
Our research spans mechanistic interpretability, reinforcement learning, control, information theory, and cryptography, bridging the theoretical and the practical. You could be a cog in a big lab and gamble with humanity’s future. Or you could own your entire research agenda at Poseidon Research, pioneering our understanding of AI’s inner workings to build a safe, secure, and prosperous future.
We are seeking a Research Scientist to help design, execute, and publish cutting-edge research on how advanced models represent, encode, and conceal information. This high-autonomy position is suited to those who want to pursue fundamental research with immediate practical implications, bridging theory, experiment, and deployment. You will collaborate closely with research engineers to turn conceptual ideas into reproducible systems by building pipelines, datasets, and model organisms that make opaque behaviors measurable and controllable.
Responsibilities:
- Design and conduct experiments on base LLMs and reasoning models (e.g., DeepSeek-R1 and V3, GPT-OSS, QwQ) to study phenomena like encoded reasoning, steganography, and reward hacking.
- Develop and analyze model organisms: controlled, interpretable LLMs that exhibit key properties such as hidden communication or deceptive reasoning.
- Contribute to interpretability tools and pipelines for whitebox monitoring using frameworks like TransformerLens.
- Formalize security and information‑theoretic bounds on steganographic or deceptive behaviors in LLMs.
- Collaborate across domains, from RL and interpretability to cryptography and complexity theory, to unify empirical and theoretical insights.
- Publish and communicate findings through open-source releases, benchmarks, and papers aimed at improving AI governance and safety evaluation.
Qualifications:
- Core ML / AI Safety: Reinforcement learning, interpretability, or model evaluations.
- Theoretical Foundations: Information theory, cryptography, or complexity theory.
- Applied Research: Developing reproducible ML experiments, model organisms, or interpretability pipelines.
- Systems & Tools: PyTorch, Hugging Face, and Transformer Lens.
- Reproducibility & Engineering: Strong Python proficiency, Git, and experiment tracking (W&B, MLflow, etc.).
- Prior publications or strong research engineering experience in interpretability or control.
- Familiarity with concepts like steganography, chain‑of‑thought faithfulness, or reward hacking.
- Background in formal methods, information security, or RL‑based training regimes.
- Excited by deep technical challenges with high safety implications.
- Values open science, clarity, and reproducibility.
- Comfortable working in a small, fast-moving research team with high autonomy.
- Conscientiousness, honesty, and an agentic disposition.
What we offer:
- Mission-Driven Research: Every project contributes directly to AI safety, transparency, and governance.
- Ownership: Lead your own research agenda with mentorship, not micromanagement.
- Interdisciplinary Collaboration: We regularly work with top researchers from DeepMind, Anthropic, other AI safety startups, and academic partners.
- Impact: Develop techniques, open-source tools, and benchmarks that shape global standards for safe AI deployment. Work from our staff has already been cited by Anthropic, DeepMind, Meta, Microsoft, and Mila.
- Lean, fast, and serious: We move quickly, publish openly, and care deeply about getting it right.
To apply, please include:
- A short research statement (what problems you’d be most excited to work on and why).
- A CV and, if applicable, a Google Scholar link.
- Links to code or papers.