Vice President, Site Reliability Engineer

BlackRock · Edinburgh, SCT, United Kingdom

Edinburgh, SCT, United Kingdom · Exp: 5-8 yrs · Remote

Remuneration

Not specified

Location

Edinburgh, SCT, United Kingdom

Visa sponsorship

Not specified

Job summary

BlackRock is seeking a Site Reliability Engineer (SRE) for a new Client Services-focused role. This position combines deep reliability engineering with strong client partnership, working closely with the Technology Client Experience team. The SRE will provide focused reliability engagement for priority clients, managing escalations, improving onboarding readiness, surfacing systemic risks, and translating client pain points into durable engineering improvements.

Benefits

Retirement investment
Education reimbursement
Physical health resources
Emotional well-being resources
Family support programs
Flexible Time Off (FTO)

Qualifications

B.S. or M.S. degree in Computer Science, Engineering, or a related discipline with 5-8 years of experience.
Strong experience in Site Reliability Engineering, production engineering, or a related reliability-focused role supporting critical systems.
Demonstrated ability to manage complex incident escalations and coordinate across engineering, product, operations, and stakeholder groups.
Strong communication skills, including translating technical issues into clear narratives for senior stakeholders and client-facing partners.
Experience driving operational readiness, onboarding readiness, or production supportability reviews for high-scale systems or strategic initiatives.
Strong troubleshooting and problem-solving skills, identifying immediate remediation paths and underlying systemic issues.
Passion for improving the reliability, resilience, and supportability of highly available systems.
Experience with observability, monitoring, and telemetry tools for incident detection, diagnosis, and prevention.
Ability to build strong cross-functional relationships and influence outcomes without direct authority.
Self-motivated, highly accountable, and comfortable operating in ambiguous, fast-moving environments.
Knowledge of software development methodologies, release processes, and operational support models.
Strong analytical thinking and a bias toward proactive risk identification and prevention.
Experience working closely with client-facing engineering, support, or relationship teams (Good to Have).
Familiarity with onboarding processes, change governance, and operational readiness frameworks (Good to Have).
Exposure to cloud ecosystems such as AWS or Azure (Good to Have).
Experience with relational databases and distributed systems (Good to Have).
Familiarity with automation, scripting, and modern DevOps / CI/CD practices (Good to Have).
Experience defining or supporting critical user journey monitoring, SLOs, SLIs, or service health reporting (Good to Have).
Exposure to large-scale enterprise platforms where reliability, stakeholder coordination, and operational rigor are critical (Good to Have).

Responsibilities

Act as a client-facing reliability partner, coordinating during incidents, escalations, onboarding, and major operational events.
Assist with incident management, including technical coordination, issue narrative, stakeholder communication, and resolution.
Partner with Technology Client Experience, engineering, and platform teams to address reliability issues.
Proactively support onboarding and operational readiness for top-tier clients by identifying systemic risks and validating supportability.
Translate recurring client pain points and onboarding learnings into actionable systemic reliability improvements.
Engage early in new client onboarding, change planning, and design discussions to proactively surface risks.
Navigate the organization to unblock remediation actions and accelerate resolution of high-priority client reliability issues.
Improve engineering culture by reinforcing a deliberate, consistent, and non-reactive approach to client reliability partnership.
Contribute to architectural, operational readiness, and observability discussions with a focus on client impact, resilience, and supportability.
Design and improve monitoring, telemetry, and operational visibility for client-critical workflows and journeys.
Drive detailed root cause investigations for significant client-impacting incidents, focusing on prevention.
Create and coordinate retrospectives for key incidents and onboarding events, capturing learnings and follow-up actions.
Anticipate opportunities to strengthen the resiliency profile of systems and workflows important to priority clients.
Act as a culture carrier for SRE principles, connecting engineering decisions to client experience and trust.

Skills

AWS
Azure

Degrees

B.S. in Computer Science
M.S. in Computer Science
B.S. in Engineering
M.S. in Engineering

Industry

Investment management
Risk management
Advisory services

Relocation
