SRE (Terminal)

mLabs · London, ENG, United Kingdom
London, ENG, United Kingdom · Full time · Onsite
Remuneration
Not specified
Location
London, ENG, United Kingdom
Visa sponsorship
Not specified

Job summary

Our client, a high-growth software development organization in the decentralized crypto social networks space, is seeking a battle-tested Site Reliability Engineering (SRE) Expert. This role involves navigating ambiguous, critical infrastructure challenges, scoping solutions, making architectural trade-offs, and executing with precision to ensure continuous uptime of a high-stakes, high-throughput environment.

Benefits

  • Competitive base salary
  • Equity
  • Token allocation

Qualifications

  • Deep expertise in infrastructure-as-code (Terraform/OpenTofu), network topology, high-availability architecture, and system internals.
  • Experience building foundational infrastructure and running high-availability environments where reliability is treated with financial-system levels of seriousness.
  • Advanced proficiency with modern cloud providers (AWS, GCP) and container orchestration platforms (Kubernetes).
  • Strong capacity to operate independently in high-stakes environments, deciding when to gather consensus versus when to execute autonomously.
  • Experience with infrastructure security hardening, IAM architecture, or compliance mapping (e.g., SOC2, ISO).
  • Hands-on experience managing and scaling high-throughput, low-latency data backbones and event streaming systems (Kafka, Redpanda, PostgreSQL).
  • Working understanding of Web3/crypto infrastructure patterns and comfort operating within them.

Responsibilities

  • Design, scale, and maintain highly available cloud infrastructure using multi-region or active-active patterns.
  • Lead critical incident response efforts, participate in on-call rotations, and drive comprehensive, blameless post-mortems to continuously harden the system.
  • Write clean, production-grade automation code for infrastructure tooling, operators, and seamless systems integration.
  • Exercise sharp judgment regarding system risks, balancing rapid deployment velocity with robust infrastructure safety and stability.
  • Raise the engineering and operational bar across the organization through the implementation of rigorous standards, modern tooling, and technical mentorship.

Skills

AWS, GCP, Go, IAM, Kafka, Kubernetes, OpenTofu, PostgreSQL, Python, Terraform

Industry

Software development, Decentralized crypto social networks

Relocation

No