Principal Software Engineer - AI Infrastructure
Listed on 2026-01-12
Software Development
Software Engineer, Cloud Engineer - Software
Austin, TX, United States
- Job Identification: 309518
- Job Category: Product Development
- Posting Date: 12/02/2025, 04:59 PM
- Job Type: Regular Employee
- Does this position require a security clearance? No
- Years: 3 to 5+ years
- Additional Info: N/A
- Applicants are required to read, write, and speak the following languages: English
As a Principal Member of Technical Staff, you will own the software design and development for major components of Oracle's Cloud Infrastructure. You should be a rock-solid lead developer, a curious problem solver, and a distributed systems generalist and/or a skilled Linux engineer with systems triage experience, able to dive deep into any part of the stack and low-level systems and to design broad distributed system interactions.
You should value simplicity and scale, work comfortably in a collaborative, agile environment, and be excited to learn.
This role resides within the Compute AI Infrastructure Bare Metal Provisioning team, which owns the critical infrastructure responsible for automating the full server lifecycle, from creation of new platform shapes (AMD/Intel/Arm/Nvidia) and hardware bring-up through customer-ready instance provisioning and firmware management. The services operate at the intersection of bare metal hardware and full-stack orchestration frameworks, a unique combination in which both distributed systems engineers and engineers with a background in Linux and firmware are highly valued.
The team interfaces directly with components like BMCs, NICs, SmartNICs, ILOMs, GPUs, and custom firmware stacks. The team builds high-performance, scalable microservices and tooling that provision, configure, secure, and validate server platforms across OCI’s massive fleet of Compute and GPU Infrastructure. You will partner closely with other teams across Compute, Networking, Security, Data Center Engineering, and Hardware Development to ensure OCI can launch, scale, and maintain new server platforms with minimal operational overhead and high reliability.
You will work directly with cutting-edge GPU hardware and see the direct impact of your work on the business.
We strive for equity, inclusion, and respect for all. We are committed to the greater good in our products and our actions. We are constantly learning and taking opportunities to grow our careers and ourselves. We challenge each other to stretch beyond our past to build our future. You are the builder here. You will be part of a team of really smart, motivated, and diverse people and given the autonomy and support to do your best work.
It is a dynamic and flexible workplace where you’ll belong and be encouraged. If you are interested in building large-scale distributed infrastructure for the cloud, want to work on cutting-edge GPU infrastructure and the latest Compute systems, and have a knack for distributed systems and/or Linux development with systems experience, then this is your team! Oracle is aggressively investing in the Oracle Cloud to provide the broadest, most comprehensive cloud in the industry.
Job Responsibilities:
- Own the software design and development for major components of Oracle’s Cloud Infrastructure.
- Be a rock-solid developer, a driven problem solver, and a distributed systems generalist and/or Linux developer with systems experience, able to dive deep into, design, develop, operate, and debug any part of the stack and low-level systems such as Linux, Docker, Java web services, and Terraform, as well as design broad distributed system interactions.
- Have a tenacious attitude toward improving the status quo, independently seek out problems to solve, and take action to deliver results wherever needed.
- Value simplicity and scale, work comfortably in a collaborative, agile environment, and be excited to learn.
Qualifications:
- 7-10+ years' experience delivering and operating large-scale, highly available distributed systems, plus Linux development and systems debugging.
- Strong knowledge of object-oriented programming languages such as C++ or Java, and experience with scripting languages such as Python.
- Strong knowledge of data structures, algorithms, operating systems, and distributed…