Senior Platform Reliability Engineer (Operations)

Ricoh · London, ENG, United Kingdom

London, ENG, United Kingdom · Hybrid

Remuneration

Not specified

Location

London, ENG, United Kingdom

Visa sponsorship

Not specified

Job summary

Ricoh is seeking a Senior Platform Reliability Engineer (Operations) in London. This senior, hands-on role focuses on the reliability, resilience, and operational integrity of Ricoh’s hybrid and cloud platforms. The position involves operating, stabilising, and continuously improving live platforms to meet availability, performance, security, and compliance standards.

Benefits

Competitive salary package
Industry-leading benefits

Qualifications

Strong business awareness alongside technical capability, with a clear understanding of how infrastructure services support business operations, customer experience, and strategic objectives within a Cloud-First strategy.
Ability to make decisions that prioritise cloud and managed services, balancing technical quality, cost efficiency, risk, security, and service reliability.
Previous experience in a similar role with strong practical knowledge of infrastructure, cloud, and operations.
Strong background in Azure (IaaS, PaaS, networking, identity, storage) and on-prem data centre operations.
Hands-on skills with infrastructure-as-code (Terraform, ARM/Bicep) and configuration management (Ansible, PowerShell DSC).
Experience with CI/CD pipelines (Azure DevOps, GitHub Actions).
Experience with monitoring/observability tools, alert design, and dashboarding.
Commercial awareness, including experience working with vendors and partners, understanding cloud consumption models, licensing and support contracts.
Ability to contribute to cost optimisation, business cases, and risk assessments across cloud and hybrid environments.
Service-focused mindset with knowledge of IT service management, clear communication, and documentation.
Experience with effective incident and problem support, and continuous improvement of platforms and services in line with Cloud-First and automation-led objectives.
Knowledge of networking, security, and OS fundamentals (Windows/Linux).
Experience operating in ISO 27001 or similar regulated environments.
Experience integrating with ITSM platforms (e.g., ServiceNow) and aligning with ITIL processes.
Understanding of business and application impact of infrastructure decisions.
Working knowledge of security, architecture, and vendor/commercial considerations to support informed decision-making.

Responsibilities

Deliver standards for availability, latency, performance, capacity, and scalability.
Participate in root-cause analysis and problem management for major incidents.
Champion blameless post-mortem culture and ensure actions are tracked and closed.
Drive infrastructure-as-code and automation across Azure and colocation (co-lo) environments.
Evolve the image bakery pipeline to produce secure, repeatable server images.
Embed observability using metrics, logs, traces, and alerting tools.
Partner with SRE and helpdesk teams to deliver service.
Oversee automated patching, vulnerability remediation, and configuration compliance.
Introduce KPIs and dashboards for reliability, incident trends, MTTR, change failure rate, and capacity.

Skills

Ansible, Azure, Azure DevOps, Bicep, GitHub, GitHub Actions, Linux, PowerShell, ServiceNow, Terraform, Windows

Travel

Occasional travel to data centres and offices

Relocation
