AI Engineering Intern; SLM
Paramus, Bergen County, New Jersey, 07652, USA
Listed on 2026-02-26
Listing for: Veolia
Apprenticeship/Internship position
Job specializations:
- Software Development: Machine Learning/ML Engineer, AI Engineer, Data Scientist
Company Description
Veolia North America has been the top-ranked environmental company in the United States for three consecutive years, and is the country's largest private water operator and technology provider as well as a leader in hazardous waste and pollution treatment. It offers a full spectrum of water, waste, and energy management services, including water and wastewater treatment, commercial and hazardous waste collection and disposal, energy consulting, and resource recovery.
Veolia serves commercial, industrial, healthcare, higher education, and municipal customers throughout North America. Headquartered in Boston, Veolia has more than 10,000 employees working at more than 350 locations across North America.
Job Description
Student Exploration and Experience Development (SEED) is a 12-week internship program at Veolia in which students gain hands-on experience in sustainability and ecological transformation. Interns work on real-world projects, receive mentorship from industry professionals, and participate in workshops and networking events. The program aims to nurture talent, promote innovation, and foster meaningful connections between students and industry professionals, equipping interns with the skills, knowledge, and relationships needed to make a positive impact in the industry.
Program Dates:
June 1, 2026 to August 21, 2026.
Position Purpose:
We are seeking a motivated AI Engineering intern to support the development and implementation of an AI-powered agent.
This role offers hands-on experience with cutting-edge small language models, cloud infrastructure, and enterprise software development.
Primary Duties/Responsibilities:
* Small Language Models (SLMs):
* Understanding and working with lightweight models such as Phi-3 (Microsoft), Llama (Meta), Mistral (Mistral AI), Gemma (Google), and TinyLlama (for resource-constrained environments).
* Design Phase:
* Requirements Gathering:
Using Figma for UI/UX design and workflow planning.
* Data Analysis:
Utilizing Jupyter Notebooks and pandas for data exploration, cleaning, and analysis.
* Model Selection:
Leveraging Hugging Face Model Hub to compare, select, and download pre-trained models.
* Development Framework:
* Core Framework:
Building SLM-powered applications using LangChain or LangGraph for orchestration and workflow management.
* Model Serving:
Deploying models locally with Ollama or at scale with vLLM for efficient inference.
* Vector Database:
Implementing semantic search and retrieval-augmented generation (RAG) using ChromaDB.
* API Framework:
Developing RESTful APIs with FastAPI (Python) or Express.js (Node.js) to expose model functionality.
* Development Tools:
* IDE:
Coding in VS Code, ideally with Python and Copilot extensions for productivity.
* Model Fine-tuning:
Customizing models using Hugging Face Transformers and parameter-efficient fine-tuning (PEFT) methods like LoRA/QLoRA.
* Prompt Engineering:
Managing and optimizing prompts with LangSmith or PromptLayer.
* Version Control:
Tracking code and data changes with Git and DVC (Data Version Control).
* Experiment Tracking:
Logging experiments, metrics, and results with MLflow or Weights & Biases.
* Testing Stack:
* Unit Testing:
Writing and running tests using pytest or unittest to ensure code correctness.
* Model Evaluation:
Assessing model performance with RAGAS (for RAG pipelines) and DeepEval.
* Load Testing:
Simulating high-traffic scenarios using Locust or Apache JMeter.
* Quality Assurance:
Using LangChain evaluators and custom metrics to ensure output quality and reliability.
* Deployment Stack:
* Containerization:
Packaging applications with Docker or Docker Compose for portability and reproducibility.
* Orchestration:
Managing and scaling containers with Kubernetes (K8s) or Docker Swarm.
* Model Optimization:
Accelerating inference and reducing resource usage with ONNX Runtime, OpenVINO, or llama.cpp.
* API Gateway:
Managing and securing APIs with Apigee.
* Observability:
Monitoring application and model performance with LangSmith.
* CI/CD:
Automating build, test, and deployment pipelines with GitHub Actions or GitLab CI.
* Infrastructure Options:
* On-Premise:
Deploying solutions on local servers with GPU or CPU resources.
* Cloud:
Utilizing GCP (Vertex AI) for scalable, managed AI infrastructure.
* Hybrid:
Integrating edge devices for local inference with cloud backup for resilience and scalability.
* An intern should be able to demonstrate familiarity with these models, tools, and workflows, as they are essential for designing, developing, testing, deploying, and monitoring efficient, production-ready small language model applications.
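To make the RAG workflow above concrete: retrieval pulls the documents most similar to a query, and a small language model then generates an answer conditioned on them. The sketch below is a toy, self-contained stand-in, using a bag-of-words similarity in place of ChromaDB's model embeddings and a stub in place of an Ollama-served SLM; the document texts, function names, and scoring are illustrative assumptions, not part of the role description.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real pipeline would use model
    # embeddings stored in a vector database such as ChromaDB.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical in-memory corpus standing in for an indexed document store.
DOCS = [
    "Veolia operates water and wastewater treatment plants.",
    "Small language models run efficiently on edge devices.",
    "Docker packages applications for reproducible deployment.",
]

def retrieve(query, docs=DOCS, k=1):
    # Rank documents by similarity to the query and return the top k.
    scored = sorted(docs, key=lambda d: cosine(embed(query), embed(d)), reverse=True)
    return scored[:k]

def answer(query):
    # A served SLM would generate a response conditioned on the retrieved
    # context; stubbed here to show where generation plugs in.
    context = retrieve(query)[0]
    return f"Context: {context}"
```

In a production version, `embed` and `retrieve` would be replaced by ChromaDB collection queries, and `answer` by a call to a locally served model (e.g. via Ollama), but the retrieve-then-generate shape stays the same.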
Work Environment:
* Environments vary by internship function, from office to field to plant.
* Our aim is to provide tangible industry job experience to each intern.
Qualifications
Education/Experience/Background:
* Working toward a PhD in AI, ML, or Computer Science.
* 3.8 cumulative GPA required.
Knowledge/Skills/Abilities:
* Strong…