Research Engineer Job: New York City area, New York, USA (IT/Tech)

At Datalab, we train state-of-the-art language models that read documents with human-level accuracy and power the next generation of AI products, workflows, and research.

Our models - Chandra, Surya, and Marker - have become the backbone of document intelligence, with more than 50,000 GitHub stars and adoption across tier-1 AI research labs, Fortune 500 enterprises, and government agencies.

We've grown to 7-figure ARR with ~7x growth in 2025, driven by a lean, senior team that operates with high autonomy and deep technical ownership.

We're backed by founding members of OpenAI, FAIR, and Hugging Face. We move fast, ship often, and we're hiring builders who do the same.

Role Description

We’re looking for a Research Engineer to work across our open-source repos, inference API, and model training stack. You’ll operate at the intersection of applied research and engineering — shaping the models that power real-world document intelligence systems used by enterprises and developers globally.

You’ll be training and evaluating new model architectures, integrating them into production, and shipping updates across our open-source ecosystem. You’ll also help close the loop with users — investigating issues, improving benchmarks, and turning real feedback into better model performance.

Our team focuses on training small, efficient models that outperform much larger LLMs on domain-specific tasks (like OCR, structured extraction, and math recognition). We move fast, prioritize practical results, and build tools that are open, reproducible, and built to last.

Day to day, you will:

Train and evaluate models: Train task-specific models (OCR, layout, text recognition, extraction). Explore architectures and training strategies to optimize task performance.
Optimize inference: Profile and accelerate model inference across different hardware setups (H100s, L40s, CPUs).
Contribute to open source: Ship features and improvements to our core open-source repos, including model APIs, data loaders, evaluation scripts, and benchmark tooling.
Build and maintain datasets: Source, design, and clean datasets for supervised and synthetic training; create reproducible pipelines for data versioning and evaluation.
Experiment and benchmark: Run ablations, track metrics, and publish findings that inform model design and internal research direction.
Engage with users and partners: Occasionally join calls or Slack threads to help customers evaluate, deploy, and extend models.

Ideal Candidate

You’ve shipped models that made it into production. You understand how to balance exploration with delivery, and how to turn research insights into products people actually use. You work autonomously and thrive in unstructured environments, but you’re also a strong collaborator — you communicate clearly, document your work, and elevate the people around you.

3+ years of experience training, fine-tuning, and evaluating LLMs
Trained at least one production‑grade model or system used in real‑world applications
Deep expertise in PyTorch and Python, with strong fundamentals in deep learning (optimization, evaluation, architecture design)
Comfortable with data engineering, benchmarking, and performance profiling across hardware setups
Have experience with OCR, document AI, or structured extraction
Have published work — whether that’s a paper, a benchmark report, or a deep technical blog post
Have been a major contributor to open‑source projects, especially in ML, vision, or NLP
Enjoy writing about your work and sharing learnings with the community

Seniority level

Mid‑Senior level

Employment type

Full‑time

Job function

Engineering and Information Technology

