Site Reliability Engineer

Signal AI · London, ENG, United Kingdom

London, ENG, United Kingdom · 70,000-85,000 GBP/yearly · Remote

Remuneration

70,000-85,000 GBP/yearly

Location

London, ENG, United Kingdom

Visa sponsorship

Not specified

Job summary

Signal AI is seeking a Site Reliability Engineer to join its Infrastructure team. The role involves evolving and scaling the infrastructure behind Signal AI's decision intelligence platform, with a focus on AI-augmented operations, security in the age of AI, and integrating infrastructure from acquisitions. The ideal candidate will be curious, collaborative, and eager to shape the team's direction.

Qualifications

Solid AWS and Terraform experience
Proficiency in Python or Go for operational problem-solving
Understanding of distributed systems, failure modes, observability, and blast radius
Ability to take problems end-to-end
Pragmatic approach to AI tooling, with clear reasoning for its use or non-use
Open communication skills
Comfortable providing constructive feedback for improvement

Responsibilities

Run and evolve infrastructure for Signal AI's decision intelligence platform
Scale existing infrastructure work
Integrate infrastructure from recent acquisitions
Thoughtfully apply AI in operational work
Define SRE best practices for incident triage, runbook generation, capacity planning, and cost analysis
Address security concerns in the age of AI
Bring acquired product infrastructure to Signal AI's reliability, security, and operational standards
Consolidate batch jobs onto EKS for unified scheduling, cost visibility, and operational tooling
Own workstreams end-to-end
Lead SRE response to production incidents
Host post-mortems
Identify and implement measurable improvements
Drive multi-quarter workstreams with clear direction
Contribute insights to the AI-in-operations playbook

Skills

AWS
EKS
Elasticsearch
Go
Linux
Python
Terraform

Relocation
