Engineering · Remote (Europe) · Full-time

Senior Site Reliability Engineer

Own the reliability, performance and scalability of our production systems. You will shape our infrastructure practices and help the engineering team ship with confidence.

Responsibilities

  • Design and maintain cloud infrastructure across AWS, Scaleway and Hetzner
  • Build and improve CI/CD pipelines and deployment workflows
  • Define and implement observability and alerting with Grafana and Prometheus
  • Lead incident response and build a post-mortem culture
  • Champion reliability best practices across the engineering organisation

Requirements

  • 5+ years in SRE, DevOps or infrastructure engineering
  • Strong experience with Kubernetes and infrastructure-as-code (Terraform, Pulumi or equivalent)
  • Solid Linux, networking and cloud platform fundamentals
  • Experience operating production monitoring and alerting stacks
  • Comfort with on-call rotations and incident management

Nice to have

  • Experience with Cloudflare Workers or edge computing platforms
  • Open-source contributions to infrastructure tooling
  • Background supporting AI or data-intensive workloads
