HPC System Administrator

Job in Santa Clara, Santa Clara County, California, 95053, USA
Listing for: Santa Clara University
Full Time position
Listed on 2026-03-01
Job specializations:
  • IT/Tech
    Cybersecurity, Cloud Computing, Systems Engineer, Systems Administrator
Salary/Wage Range or Industry Benchmark: 129,000 - 161,265 USD yearly
Job Description & How to Apply Below
$129,000 - $161,265 annually
Compensation will be based on education, experience, skills relevant to the role, and internal equity.

A. POSITION PURPOSE
* Knowledge Transfer:
Develops and implements a formal cross-training program for existing system administrators by creating documentation and delivering hands-on instruction to enhance the team's collective expertise in HPC-specific technologies (Slurm, InfiniBand, parallel file systems).
* Operational Resilience:
Ensures robust, shared support capabilities across the IT team by strategically transferring HPC knowledge, actively preventing single points of failure, and improving the overall efficiency and responsiveness of the operational support model.
* Strategic Enhancement:
Contributes to the strategic planning and roadmap development for future HPC infrastructure and software enhancements by researching emerging technologies, evaluating vendor solutions, and providing expert recommendations to ensure the environment remains cutting-edge and meets long-term organizational goals.
* Use broad expertise and unique skills to play an active role as a technical expert during the planning and implementation phases of new technologies, and participate in architecture brainstorming and design discussions with technical team members.
* Provide technical guidance on complex infrastructure architecture challenges to IS team members and other solution partners.
* Act as a role model for developing and trying different problem-solving approaches and supporting team members to do the same.
* Coach and develop new team members on how to provide the best customer service.
* Model openness and honesty, and support other team members in doing the same, to build positive relationships based on trust, predictability, and communication.
* Provide input on setting Enterprise Systems and CIT goals, objectives, and strategies based on the University's mission, goals, and strategic plan.
* Provide input in technology planning processes to develop cost-effective customer-focused solutions.
* Use strong technical and organizational knowledge to plan and lead projects and working groups.
* Work closely with the ES Manager in the creation, planning, maintenance, and secure expansion of SCU's computing infrastructure. This includes, but is not limited to, local and hosted servers, virtual appliances and devices, and storage.
* Work closely with ES Manager to ensure that architecture principles and standards are consistently applied across the data center compute and storage services.
* Collaborate with the Information Security Office (ISO) to ensure a secure and compliant enterprise environment.
* Work with the ISO to ensure that systems are secure and to plan for future security needs and threats.
* Ensure the appropriate distribution of infrastructure services to faculty, staff, and students.
* Create and document standards and practices regarding data center, compute and storage services for use across the University.
* Oversee the creation and performance of infrastructure production and test environments.
* Create scalable, interoperable, and flexible infrastructure solutions.
* Support assigned systems with on-call availability and respond within agreed-upon time frames.
* Analyze and evaluate processes to document and implement a standard routine and process for applying patches and updates to operating systems, applications, hardware, and firmware, so that all physical, virtual, and hosted systems are patched to the appropriate security and version level.
* Participate as necessary in backup operations, ensuring all required file systems and system data are successfully backed up to the appropriate media and are available off site.
* Participate in disaster recovery and business continuity planning.
* Perform daily system monitoring, verifying integrity and availability of all hardware, server resources, systems, and key processes. Check for potential problems, resource availability, capacity, performance and load characteristics, network integrity, and security threats. Monitor systems activity and usage to maintain a secure environment.…