Senior Site Reliability Engineer — Davai

Responsibilities

Design and maintain cloud infrastructure across AWS, Scaleway and Hetzner
Build and improve CI/CD pipelines and deployment workflows
Define and implement observability and alerting with Grafana and Prometheus
Lead incident response and foster a post-mortem culture
Champion reliability best practices across the engineering organisation

Requirements

5+ years in SRE, DevOps or infrastructure engineering
Strong experience with Kubernetes and infrastructure-as-code (Terraform, Pulumi or equivalent)
Solid Linux, networking and cloud platform fundamentals
Experience operating production monitoring and alerting stacks
Comfort with on-call rotations and incident management

Nice to have

Experience with Cloudflare Workers or edge computing platforms
Open-source contributions to infrastructure tooling
Background supporting AI or data-intensive workloads
