HPC Storage Systems Engineer
Listed on 2026-03-01
IT/Tech
Systems Engineer, Data Engineer
The National Energy Research Scientific Computing Center (NERSC) is inviting applications for the position of HPC Storage System Administrator. NERSC's mission is to accelerate scientific discovery through high performance computing and data analysis for the DOE Office of Science programs. NERSC provides critical HPC and data systems and support for its 11,000+ users researching high energy physics, materials sciences, chemistry, fusion, and other DOE mission areas.
The HPC Storage System Administrator position is focused on extreme scale high performance storage. The NERSC Storage Systems Group provides petabytes of capacity and terabytes per second of bandwidth to the NERSC user community. In this role, the incumbent will collaborate with 7-10 systems engineers and programmers in the Storage Systems Group to help architect, deploy, and manage NERSC's storage hierarchy, which comprises Lustre and Storage Scale (formerly GPFS) file systems, VAST, and two HPSS tape archive systems.
We seek an experienced, motivated high performance storage administrator who has broad knowledge of storage hardware and software technologies, in particular, hierarchical storage management systems and object stores. At the CSE-3 level, the incumbent will be responsible for operating and maintaining the NERSC mass storage service (HPSS) as part of a team, as well as contributing to the NERSC Storage Strategy.
At the CSE-4 level, the incumbent will lead the HPSS effort with a few key Storage System engineers under their direction. In this role, you will be responsible for the day-to-day administration of NERSC's HPSS service and contribute significantly to the development and execution of NERSC's Storage Strategy, particularly with respect to mass storage.
The selected candidate(s) will be hired at the Computer Systems Engineer 3 or 4 (CSE-3 or CSE-4) level, depending on their skills and experience.
At Level 3, you will:
Participate in projects to architect, deploy and manage NERSC's mass storage hierarchy
Contribute to the effort to manage and maintain the HPSS systems
Perform day-to-day administration of complex tape-based storage systems
Analyze storage usage and system monitoring data
Administer storage servers and block storage arrays
Participate in the management of the storage area network
Troubleshoot and debug problems in our production storage systems
Help define storage requirements for NERSC, ensuring that NERSC users' needs are represented
Engage with NERSC users to identify projects which will improve data management and movement at the center
Identify and evaluate new storage hardware and software technologies and features
Participate in a 24x7 on-call rotation
Work on and resolve complex issues where analysis of situations or data requires an in-depth evaluation of variable factors
Exercise judgment in selecting methods, techniques and evaluation criteria for obtaining results
Determine methods and procedures on new assignments and, where needed, coordinate the activities of other personnel
Network with key contacts outside your own area of expertise
In addition to the above, at Level 4, you will:
Lead the mass storage system administration team within the Storage Systems Group, directing the effort to manage and maintain the HPSS systems
Lead projects to architect, deploy and manage NERSC's mass storage hierarchy
Work on and resolve significant and unique issues where analysis of situations or data requires an evaluation of intangibles
Exercise independent judgment in methods, techniques and evaluation criteria for obtaining results
Additional Desired Responsibilities:
Present technical information at conferences and meetings
We are looking for:
Bachelor's degree or equivalent experience and a minimum of 8 years of computing or storage experience; or 6 years and a Master's degree; or equivalent experience
Wide-ranging expertise in the areas of mass storage solutions (such as HPSS) and storage networking technologies (such as RDMA, RoCE, InfiniBand and Fibre Channel)
Experience managing storage systems
Excellent technical troubleshooting skills with the ability to resolve complex issues in creative and effective ways
Knowledge of trends in storage system hardware and software
Strong communication skills, and the ability to work independently and collaboratively as part of a creative and diverse team
Ability to script in Python, Perl, Shell or another interpreted language
Knowledge of block storage arrays, storage networks, parallel file systems, hierarchical storage systems and object stores
Ability to network and collaborate with key contacts outside your own area of expertise
Excellent oral and written communication skills
Demonstrated ability to work effectively as part of a cross-disciplinary team
In addition to the above, Level 4 requirements:
Bachelor's degree or equivalent experience and a minimum of 12 years of computing or storage experience; or 8 years and a Master's degree; or equivalent experience
Broad expertise…