Software Engineer - Reliability GPU Infrastructure
Luma AI
This job is no longer accepting applications
Software Engineering, Other Engineering
Palo Alto, CA, USA
USD 170k-360k / year
- Hybrid Cloud Strategy: Architect a seamless infrastructure mesh that spans multiple cloud providers and bare-metal environments, optimizing for cost, performance, and reliability.
- Intelligent Scheduling: Design the logic that allocates massive compute resources across competing priorities, ensuring optimal throughput for both research training and production inference.
- Infrastructure as Software: Lead the effort to define our entire stack as code, building the rigorous CI/CD and GitOps workflows that allow us to move with speed and safety.
- Architectural Vision: You have a history of designing complex distributed systems, demonstrating the judgment to navigate trade-offs between immediate velocity and long-term scalability.
- Cloud Polyglot: You have deep expertise across multiple infrastructure providers and understand the underlying primitives well enough to build outside them.
- Technical Leadership: You can mentor the team and drive consensus on technical decisions, setting the standard for engineering excellence in operations.