LLM Inference Performance Engineer

Job in Germany, Pike County, Ohio, USA
Listing for: EnCharge AI
Full Time position
Listed on 2026-03-01
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning/ ML Engineer, Data Scientist
Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 USD yearly
Job Description & How to Apply Below
Location: Germany

EnCharge AI is a leader in advanced AI hardware and software systems for edge-to-cloud computing. EnCharge’s robust and scalable next-generation in-memory computing technology provides orders-of-magnitude higher compute efficiency and density than today’s best-in-class solutions. The high-performance architecture is coupled with seamless software integration and will make the immense potential of AI accessible in power-, energy-, and space-constrained applications.

EnCharge AI launched in 2022 and is led by veteran technologists with backgrounds in semiconductor design and AI systems.

About the Role

EnCharge AI is seeking an LLM Inference Deployment Engineer to optimize, deploy, and scale large language models (LLMs) for high-performance inference on its energy-efficient AI accelerators. You will work at the intersection of AI frameworks, model optimization, and runtime execution to ensure efficient, low-latency inference.

Responsibilities

  • Deploy and optimize LLMs (GPT, LLaMA, Mistral, Falcon, etc.) post-training, sourced from libraries such as Hugging Face.
  • Use inference runtimes such as ONNX Runtime and vLLM for efficient execution.
  • Optimize batching, caching, and tensor parallelism to improve LLM scalability in real-time applications.
  • Develop and maintain high-performance inference pipelines using Docker, Kubernetes, and inference servers.

Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or related field.
  • Experience in LLM inference deployment, model optimization, and runtime engineering.
  • Strong expertise in LLM inference frameworks (PyTorch, ONNX Runtime, vLLM, TensorRT-LLM, DeepSpeed).
  • In-depth knowledge of the Python programming language for model integration and performance tuning.
  • Strong understanding of high-level model representations and experience implementing framework-level optimizations for generative AI use cases.
  • Experience with containerized AI deployments (Docker, Kubernetes, Triton Inference Server, TensorFlow Serving, TorchServe).
  • Strong knowledge of LLM memory optimization strategies for long-context applications.
  • Experience with real-time LLM applications (chatbots, code generation, retrieval-augmented generation).

EnCharge AI is an equal employment opportunity employer in the United States.
