Data Scientist; Junior to Mid-Level – Applied LLM/RAG Intelligence
Listed on 2026-02-28
IT/Tech
Data Analyst, Data Scientist, Machine Learning/ML Engineer, AI Engineer
Location: On-site / hybrid in the El Paso, TX – Las Cruces, NM corridor, supporting work at White Sands Missile Range and nearby government sites. Open to relocation candidates.
Travel: Local travel to WSMR/Fort Bliss; occasional CONUS travel for test events.
May include weekends and nights.
Employment: Full-time
About Torch
Torch is an AI-powered interview intelligence platform used by government customers to generate structured reporting, detect leads, recommend follow-up questions, and analyze interviews. Your job is to make those outputs more accurate, more defensible, and more useful in real-world conditions that include offline and edge-constrained environments.
You'll report to the on-site Project Team leader and work closely with a small exercise support team.
You'll own problems end-to-end: designing evaluation frameworks, improving prompts and retrieval pipelines, analyzing interview data, and shipping measurable improvements into production.
- Improve prompting strategies and structured outputs across reporting formats including law enforcement, intelligence, after-action, interview summaries, and survey analysis
- Design evaluation sets, scoring rubrics, and automated evaluation pipelines (including LLM-as-judge approaches) for relevance, coherence, completeness, and error modes
- Reduce hallucinations and improve traceability and attribution
- Build and iterate on RAG pipelines, curated knowledge packs, and question-tree triggers
- Create and maintain base datasets (follow-up triggers, Essential Elements of Information/Critical Information Requirements, glossaries, watchlist cues) with versioning and documentation
- Tune retrieval and reranking to perform reliably under edge constraints (limited compute, memory, and connectivity)
- Analyze transcripts to surface evasiveness, inconsistencies, and actionable leads
- Develop labeling strategies, analytic rubrics, and ground-truth datasets
- Conduct quantitative and qualitative analysis of interview data to identify patterns and support operational decisions
- Build lightweight dashboards and metrics for model performance and field reliability
- Document methods and maintain audit trails so outputs remain defensible for government end users
- Partner with engineering to validate and ship improvements into production
- 2–5 years of applied data science or NLP experience
- Strong Python skills (pandas, NumPy, scikit-learn) with comfort standing up experiments and pipelines
- Hands-on experience with LLMs: prompt engineering, output evaluation, safety, and quality controls
- Experience with unstructured text data — cleaning, labeling, building evaluation metrics
- Proficiency in data analysis, reporting, and visualization for technical and non-technical audiences
- Ability to work on-site in the El Paso / Las Cruces / WSMR area
- RAG implementation experience (vector databases, embeddings, reranking)
- Experience with structured evaluation frameworks (RAGAS, custom LLM-as-judge, or equivalent)
- Familiarity with edge or offline deployment constraints
- Exposure to interview analytics, structured debriefing, structured reporting, or HUMINT-adjacent workflows
- Experience delivering in classified or regulated environments
- Opportunity to work on a meaningful mission with amazing teammates
- 12.5% flexible benefits allowance on top of base pay: use it for health coverage, retirement, additional time off, or other personal priorities
- 6% employer 401(k) contribution (non-elective) with a 4-year rolling vesting schedule
- 20 days PTO plus 11 paid federal holidays, with the option to purchase up to 10 additional days