DevOps Reliability Engineer
Advanced Solutions International · United Kingdom · Remote
United Kingdom · Exp: 8+ yrs · Remote
Remuneration
Competitive base salary and bonuses for eligible positions
Location
United Kingdom · Remote
Visa sponsorship
Not specified
Job summary
Advanced Solutions International is seeking a DevOps Reliability Engineer to enhance the performance, scalability, reliability, and cost efficiency of their Azure-based SaaS platform. This role involves proactively identifying improvement opportunities, tuning Azure services, and collaborating with engineering teams to ensure system efficiency and resilience. The ideal candidate will possess strong Azure skills, the ability to troubleshoot code, and a continuous improvement mindset.
Benefits
- Wellness Benefits
- Opportunities for Professional Growth and Development
- Flexible Remote Work
- Volunteer Time Off
- Study Leave
- Employee Assistance Program
Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field, or relevant experience.
- 8+ years of experience in DevOps, Site Reliability Engineering, Cloud Engineering, or similar roles.
- Strong hands-on experience with Microsoft Azure, especially: Azure SQL, Azure Functions, Azure App Services, and Azure Containers (AKS, Container Apps, or similar).
- Ability to read and interpret telemetry, logs, metrics, and resource usage data and explain what’s wrong and how to fix it.
- Experience working with production systems that require high availability and reliability.
- Comfort owning work end-to-end, from identifying issues to executing improvements.
- Experience adjusting pipelines, hosting configurations, and deployment processes.
- Solid understanding of cloud cost drivers and usage optimization.
- Strong problem-solving skills and the ability to work collaboratively across engineering and support teams.
- Ability to read and interpret application code to support troubleshooting, root cause analysis, and identification of performance improvement opportunities.
Responsibilities
- Monitor and improve the health, availability, performance, and cost efficiency of Azure-based production systems.
- Use application, database, and infrastructure telemetry to identify performance issues, bottlenecks, and reliability risks.
- Tune Azure services and platform configurations to maximize performance, resilience, and resource efficiency.
- Partner with engineering teams to recommend and implement practical, data-driven improvements to reliability, scalability, and operational effectiveness.
- Create and maintain operational documentation, runbooks, and troubleshooting guides to support consistent incident response and ongoing operations.
- Support Tech Support and Sustained Engineering by executing approved SQL queries and completing database backups and restores for troubleshooting purposes.
- Analyze how partner integrations and customer usage patterns impact system performance and cloud spend.
- Investigate complex production issues, perform root cause analysis, and drive resolution of reliability and performance problems.
- Contribute to continuous improvement across deployment processes, system stability, and operational readiness.
- Perform other job-related duties and responsibilities as assigned.
Skills
- AKS
- Azure
Degrees
- Bachelor's degree in Computer Science
- Bachelor's degree in Information Technology
Relocation
No