HPC Sr. Scientific Software Engineer; IT@JH Research Computing

Job in Baltimore, Anne Arundel County, Maryland, 21276, USA
Listing for: Johns Hopkins University
Full Time position
Listed on 2026-01-06
Job specializations:
  • IT/Tech
    AI Engineer, Systems Engineer
Job Description & How to Apply Below
Position: HPC Sr. Scientific Software Engineer (IT@JH Research Computing) - #Staff

IT@JH Research Computing is seeking an HPC Sr. Scientific Software Engineer who will design, build, and support Johns Hopkins University’s high-performance computing and AI research infrastructure. This role integrates elements of both systems and software engineering, ensuring scalable, secure, and reproducible environments for scientific and data-intensive research. The Engineer develops and automates system and application workflows across CPU/GPU clusters, parallel storage, and hybrid cloud platforms.

Responsibilities include configuring and optimizing large-scale Linux environments, implementing job scheduling and orchestration frameworks, containerizing applications, and supporting researchers in optimizing performance and reproducibility. Work combines project-based engineering with operational support, requiring both independent problem-solving and close collaboration with the Research Computing team and faculty stakeholders.

Specific Duties & Responsibilities

Software Deployment and Design
  • Develop and refine deployment strategies for scientific software on HPC and AI systems.

  • Design computational workflows, select optimal software configurations, and use tools like Ansible for automation.

  • Assist teams in implementing, tuning, and optimizing AI models and gateway applications (e.g., XDMoD, ColdFront, Open OnDemand, CryoSPARC Live, SBGrid, AI agents).

Performance Optimization
  • Analyze and optimize the performance of AI models and HPC applications, focusing on GPU-enabled computing.

  • Implement parallel processing, distributed computing, and resource management techniques for efficient job execution.

Integration and Optimization
  • Develop, debug, and maintain software tools, libraries, and frameworks supporting HPC and AI workloads.

  • Collaborate with the system team and software vendors (e.g., NVIDIA, Intel, MATLAB) to optimize systems for maximum performance.

  • Utilize CUDA, DNN, TensorRT, and Intel Compilers to enhance system performance.

HPC Scientific Software Support
  • Manage and support scientific software deployment across HPC, cloud-based, and colocation facilities.

  • Oversee installation, configuration, and maintenance of HPC packages with tools like CMake, Make, EasyBuild, Spack, and Lua module files.

Collaboration and Mentorship
  • Work closely with cross-functional teams, including researchers, data scientists, and software developers, to address complex HPC/AI challenges.

  • Mentor junior engineers and foster a culture of continuous learning.

Technical Support, Training Workshops, and Troubleshooting
  • Resolve complex technical issues and perform root cause analysis for HPC/AI software challenges.

  • Implement effective solutions to prevent recurrence and improve system reliability.

  • Provide training workshops for researchers and students, focusing on troubleshooting, optimizing workflows, and effectively using HPC systems.

Learning and Development
  • Stay current with advances in HPC and AI technologies and methodologies.

  • Incorporate new research findings into existing systems to improve performance and capabilities.

Container Orchestration
  • Develop and manage container orchestration strategies to ensure scalability, reliability, and security of applications.

  • Oversee the container lifecycle from creation and deployment to scaling and removal.

Documentation and Compliance
  • Create comprehensive documentation for system designs, performance metrics, and project status.

  • Ensure compliance with security and regulatory standards for all HPC and AI systems.

In Addition to the Duties Described Above
  • Design, deploy, and maintain large-scale Linux HPC clusters with CPU/GPU resources, high-speed networks, and distributed storage.

  • Develop and maintain automation frameworks for provisioning, monitoring, and software lifecycle management.

  • Implement and optimize job scheduling, container orchestration, and workflow automation tools to support diverse research workloads.

  • Collaborate with faculty and research teams to parallelize, containerize, and scale computational workflows for multi-GPU and distributed environments.

  • Benchmark and tune application performance across architectures, documenting findings and sharing best practices.

  • Integrate and support AI/ML frameworks, scientific libraries, and workflow engines (Snakemake, Nextflow, Dask, Ray).

  • Ensure system and application reliability through proactive monitoring (Prometheus, Grafana, ELK) and incident response participation.

  • Support reproducibility and FAIR data principles through version-controlled, containerized environments.

  • Contribute to documentation, training materials, and technical guidance to enhance user experience and self-service capabilities.

  • Participate in evaluation and adoption of new technologies to advance performance, efficiency, and sustainability in research computing.

Minimum Qualifications
  • PhD in a quantitative discipline.

  • Five years of experience in HPC user support, software deployment, and performance optimization within an academic or research environment.

  • Additional education may…
