Senior Staff Engineer - Site Reliability Engineer
Verta
Software Engineering
Bengaluru, Karnataka, India
Business Area: Engineering
Seniority Level: Mid-Senior level
Job Description:
At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.
Cloudera seeks a Senior Staff Site Reliability Engineer (SRE) to drive innovation across our platforms. You will take part in setting the technical vision and defining cross-organizational patterns across many areas, including Infrastructure as Code (IaC), distributed systems, GitOps, CI/CD/CT, observability, security, and more.
As a Senior Staff Site Reliability Engineer, you will:
Collaborate across the organization to improve reliability and maintainability
Drive organization-wide architectural decisions and platform engineering practices
Serve as a role model and mentor to upskill other engineers
Practice platform engineering across diverse infrastructure, cloud providers, and hybrid use cases
Eliminate toil through simplification and automation
Provide technical leadership to SRE peers and other engineers
Architect products for both SaaS and self-hosted deployments while keeping the same code base
Influence the product roadmap for delivering reliable services to customers
Monitor availability, latency and overall service health
Practice sustainable incident response and blameless postmortems
Participate in an on-call rotation
We’re excited about you if you have:
10+ years of industry experience in SRE, DevOps, or related practices
Generalist mindset with mastery of many domains, and pride in quickly learning new areas
A love of collaborating and mentoring, and strong communication skills
Proven ability to handle multiple complex technical projects with distributed teams
Experience architecting PaaS/SaaS products using container-based microservices patterns
Experience with performance analysis, troubleshooting, tuning, and capacity planning
Strong troubleshooting skills that span contexts: Linux, network and application code
Deep understanding of Kubernetes and the wider ecosystem
Experience with observability, logging and monitoring tools
Experience with Terraform and related technologies
Experience with CI/CD tools, such as Jenkins, GitHub Actions, Flux CD, and Argo CD
Programming experience in Python, Go or similar languages
Experience with GitOps and Git-based automation
Expert-level experience with Amazon Web Services (AWS)
You may also have:
Experience with systems hardening such as CIS, STIG, SELinux
Experience with compliance programs such as SOC, FedRAMP, HITRUST CSF
Experience with database systems, including Postgres, MariaDB, and MySQL
Experience with Microsoft Azure, Google Cloud Platform, or OpenStack
What you can expect from us:
Generous PTO Policy
Support for work-life balance with Unplugged Days
Flexible WFH Policy
Mental & Physical Wellness programs
Phone and Internet Reimbursement program
Access to Continued Career Development
Comprehensive Benefits and Competitive Packages
Employee Resource Groups
EEO/VEVRAA
#LI-AB1
#LI-Hybrid