Remuneration
Not specified
Location
Reigate, ENG, United Kingdom
Visa sponsorship
Not specified
Job summary
Seeking a DevOps/SRE Engineer to develop and support operationally resilient cloud infrastructure. The role involves working with product and engineering teams to deliver highly scalable and reliable infrastructure, pipelines, and support tools. The ideal candidate will have experience with Microsoft Azure and observability platforms in complex SaaS environments.
Qualifications
- Solid experience in DevOps and interest in Site Reliability Engineering
- Experience building and running 24x7 services in Microsoft Azure
- Strong scripting and Infrastructure as Code skills (PowerShell, Terraform, ARM, Pulumi, Bicep)
- Experience with Microsoft Azure in areas such as networking, storage, integration, compute, and analytics
- Experience with cloud observability concerns (logging, tracing, metrics, monitoring, alerting)
- Experience with Windows & Linux containers and orchestration platforms (Docker, Kubernetes)
- Strong interpersonal skills to work effectively with stakeholders
- Solid verbal and written communication skills to present technical information clearly and concisely
- Confidence in making decisions and taking ownership of projects
Responsibilities
- Collaborate with product and engineering teams on design, build, and operational management of client-facing services
- Champion and implement best practice solutions for reliable, performant, and observable SaaS products
- Build and improve CI/CD pipelines for product teams, with a focus on high release cadence and cost effectiveness
- Implement infrastructure as code
- Support the team with infrastructure- and networking-related issues
- Maintain and configure observability platforms such as Datadog
- Proactively monitor production and other environments to ensure stability, availability, security, and integrity
- Participate in incident response, troubleshooting, and root cause analysis to mitigate and prevent future issues
- Work closely with engineering, support, and operations teams to upskill and promote knowledge transfer, producing training materials and articles
- Participate in the on-call rotation to provide support and ensure system uptime
Skills
Azure, Azure DevOps, Bicep, C#, Datadog, Docker, Harness, Kubernetes, Linux, OpenShift, PowerShell, Pulumi, RHEL, Terraform, Windows
Certifications
Azure Administrator, Azure Developer, Azure DevOps Engineer
Work schedule
On-call rotation
Industry
InsurTech
Relocation
No