Observability Engineer

inDrive

Other Engineering
Nicosia, Cyprus · Kazakhstan
Posted on Nov 11, 2024

Cyprus, Nicosia · Kazakhstan · Georgia

  • Hybrid
  • Full-time
  • Middle

We are looking for an Observability Engineer.

The Observability team is part of the SRE department. The team focuses on developing, adopting, and scaling observability tools across the company. Our current observability stack is deployed in multiple regions and includes Prometheus/Thanos/VictoriaMetrics, EFK, Jaeger, and more. We are not tied to these tools, however, and are always looking for more suitable solutions.

In addition, the team is responsible for automating the entire Incident Management process. The reliability and availability of the observability tools are very important. Our task is to keep MTTD, MTTA, and MTTR (mean time to detect, acknowledge, and resolve) low, which in turn affects the SLA of the company's products.

Responsibilities

  • Improving and supporting observability tools
  • Improving the incident management process
  • Maintaining a 99.99% SLA for the product
  • Introducing SRE practices to development teams

Qualifications

Must have:

  • Experience with observability tools: Prometheus-like TSDBs, EFK/Loki, Jaeger
  • Experience adopting observability tools across a company
  • Experience in troubleshooting problems in production
  • Solid experience with Kubernetes, including various operators
  • Experience with any incident management tool (PagerDuty, Opsgenie, etc.)

Nice to have:

  • Experience working with AWS
  • Experience building an SRE practice within a company
  • Development experience: Python/Go

Conditions & Benefits

  • Stable salary, official employment
  • Health insurance
  • Hybrid work mode and flexible schedule
  • Relocation package offered for candidates from other regions
  • Access to professional counseling services including psychological, financial, and legal support
  • Discount club membership
  • Diverse internal training programs
  • Partially or fully paid additional training courses
  • All necessary work equipment