Principal Cloud SRE / Cloud SME – LSEG Workspace

LSEG (London Stock Exchange Group) · London, ENG, United Kingdom · Remote
Remuneration: Not specified
Location: London, ENG, United Kingdom
Visa sponsorship: Not specified

Job summary

LSEG is seeking a Principal Cloud Site Reliability Engineer (SRE) to improve the reliability, scalability, and operational health of its LSEG Workspace platform. The role combines hands-on engineering, technical guidance, and shaping the long-term platform approach for cloud-native services across AWS and Azure. The ideal candidate will solve complex problems and influence how reliable systems are built and run.

Benefits

  • Healthcare
  • Retirement planning
  • Paid volunteering days
  • Wellbeing initiatives

Qualifications

  • Substantial experience in cloud engineering, site reliability engineering, or DevOps at a senior or principal level
  • Hands-on approach to building and supporting production systems
  • Strong experience with AWS and/or Azure
  • Comfortable operating Kubernetes-based platforms such as EKS
  • Solid Linux foundation
  • Familiar with SRE concepts and observability practices
  • Experience with CI/CD tooling such as GitLab
  • Experience with infrastructure-as-code solutions like Terraform
  • Understanding of cloud-native networking and high-availability design
  • Comfortable working in Agile or Scrum-based teams
  • Enjoy working collaboratively, communicating clearly, and learning in an evolving technical environment

Responsibilities

  • Contribute hands-on engineering skills
  • Share technical guidance
  • Shape long-term platform approaches for cloud-native services
  • Design and operate Workspace platform services for resilience, scalability, and ease of understanding
  • Contribute to cloud infrastructure and application reliability across AWS and Azure
  • Focus on availability, performance, and sustainable cost practices
  • Develop and refine SLIs, SLOs, and error budgets
  • Use incidents for shared learning and improvement
  • Provide experience-based input into distributed systems design
  • Provide input on container platforms such as EKS, cloud networking, storage, and security
  • Support exploration of new technologies aligned with platform needs and long-term maintainability
  • Investigate and resolve complex service issues
  • Improve how service issues are observed, understood, and prevented
  • Build observability practices for early detection, clear communication, and effective root-cause analysis
  • Contribute to automating infrastructure, delivery pipelines, and operational processes
  • Support CI/CD workflows using GitLab
  • Use infrastructure as code tools such as Terraform for consistency and confidence
  • Ensure cloud services align with governance expectations, vulnerability management, and regulatory requirements
  • Support teams to deliver safely and efficiently
  • Share cloud and SRE knowledge

Skills

  • AWS
  • Azure
  • EKS
  • GitLab
  • Kubernetes
  • Linux
  • Terraform

Industry

Financial markets

Relocation

No