SRE Specialist 3
Lacework
Job Info
- Job Identification: 20211
- Job Category: Site Reliability Engineering
- Posting Date: 07/15/2025, 05:34 AM
- Locations: Burnaby, BC, Canada
- Job Schedule: Full time
Job Description
We are a service-focused team managing high-traffic, consumer-facing systems deployed globally. Our responsibility spans the full lifecycle of services running on top of OpenStack, Kubernetes, and physical/virtual infrastructure. We own both the operational stability of the services and the automation systems that support them.
We are looking for a hands-on SRE Specialist with a DevSecOps mindset who can get things done: someone who combines deep technical expertise with strong ownership and a bias for action. You will help us build, run, and improve the infrastructure and automation that power our production services, while contributing to the design of a scalable, maintainable DevOps system.
What You’ll Do
Automation & CI/CD
- Build and maintain automation workflows for service and infrastructure operations using Ansible, Bash, or Python.
- Create and optimize CI/CD pipelines with GitLab, enabling safe, reliable, and fast deployments.
- Contribute to our evolving DevOps architecture, identifying gaps and continuously improving efficiency and resilience.
Service Deployment & Operations
- Deploy, manage, and support services running on OpenStack and Kubernetes platforms, with some running on VMware and physical hardware.
- Troubleshoot service issues across application layers, OS, network, and infrastructure.
- Handle service lifecycle tasks including provisioning, monitoring, patching, and scaling.
- Participate in on-call rotation to ensure 24/7 uptime of critical systems.
Monitoring & Troubleshooting
- Monitor service and system health using tools like Zabbix, Grafana, and the ELK stack.
- Investigate and resolve performance bottlenecks and production incidents.
- Write and maintain documentation for operational procedures, troubleshooting guides, and system workflows.
System & Network Administration
- Administer and troubleshoot Linux servers (Red Hat/CentOS/Ubuntu) and assist with MySQL database support in production environments.
- Manage network-level configurations and troubleshoot related problems (iptables, routing, LDAP, SMTP, DNS, firewall rules, etc.).
- Handle infrastructure maintenance on OpenStack, Kubernetes, VMware, and physical servers as needed.
- Work with security and compliance teams to prepare for audits, implement required controls, and ensure visibility into operational activities.
- Maintain secure configurations and enforce access control, logging, and change management processes.
- Assist in integrating security practices into CI/CD pipelines (e.g., image/OS hardening, patching, and compliance).
- Ensure system changes are documented and traceable to meet compliance needs (e.g., SOC 2, ISO 27001).
What You Bring
- 5+ years of experience in Linux system administration and production environment support.
- Proven ability to manage services in virtualized and containerized environments (especially OpenStack and Kubernetes).
- Strong experience with infrastructure automation tools like Ansible and scripting in Bash or Python.
- Familiarity with building and operating GitLab CI/CD pipelines or similar.
- Solid knowledge of networking fundamentals (TCP/IP, firewalls, DNS, etc.).
- Experience working with monitoring/logging tools (Zabbix, Grafana, ELK, etc.).
- Familiarity with information security principles and experience supporting compliance-driven environments (e.g., SOC 2, ISO 27001).
- Excellent debugging and root cause analysis skills across complex systems.
- A proactive attitude, strong sense of ownership, and ability to work both independently and within a team.
Nice to Have
- Experience designing or evolving DevOps systems for service lifecycle management.
- Knowledge of Docker, Git, and software-defined infrastructure tools.
- Experience integrating security tooling (e.g., vulnerability scanners, secret managers, audit logs) into DevOps pipelines.
- Prior experience operating in a 24/7 production support environment.
- Certifications such as:
  - RHCE (Red Hat Certified Engineer)
  - CKA/CKAD (Kubernetes certifications)
  - OpenStack Administrator Certification
Education
- Degree or diploma in Computer Science, Computer Technology, or a related field.
Why Join Us?
- Join a stable, technically strong team with real impact on customer-facing services.
- Work with modern infrastructure and automation technologies.
- Solve meaningful operational challenges at scale.
- Grow with a team that values action, clarity, and continual improvement.