Observability Engineer
inDrive
Cyprus, Nicosia · Kazakhstan · Georgia
- Hybrid
- Full-time
- Middle
We are looking for an Observability Engineer.
The Observability team is part of the SRE department and focuses on developing, adopting, and scaling observability tools across the company. Our current observability stack is deployed in multiple regions and includes Prometheus/Thanos/VictoriaMetrics, EFK, Jaeger, and more. We are not tied to these tools, though, and are always looking for more suitable solutions.
In addition, the team is responsible for automating the entire Incident Management process. The reliability and availability of the observability tools themselves are very important. Our task is to keep MTTD, MTTA, and MTTR low, since these metrics directly affect the SLA of the company's products.
Responsibilities
- Improving and supporting observability tools
- Improving the incident management process
- Maintaining a 99.99% SLA for the product
- Introducing SRE practices to development teams
Qualifications
Must have:
- Experience with observability tools: Prometheus-like TSDBs, EFK/Loki, Jaeger
- Experience driving the adoption of observability tools within a company
- Experience in troubleshooting problems in production
- Solid experience with Kubernetes (including various operators)
- Experience with any Incident Management tool (PagerDuty, Opsgenie, etc.)
Nice to have:
- Experience working with AWS
- Experience building an SRE function within a company
- Development experience: Python/Go
Conditions & Benefits
- Stable salary, official employment
- Health insurance
- Hybrid work mode and flexible schedule
- Relocation package offered for candidates from other regions
- Access to professional counseling services including psychological, financial, and legal support
- Discount club membership
- Diverse internal training programs
- Partially or fully paid additional training courses
- All necessary work equipment