Dev-Ops Lead Engineer
Spectrum Life · Manchester, ENG, United Kingdom
Manchester, ENG, United Kingdom · Full time · Remote
Remuneration
Not specified
Location
Manchester, ENG, United Kingdom
Visa sponsorship
Not specified
Job summary
Spectrum.Life is seeking an experienced and forward-thinking DevOps Lead to own the infrastructure, security, and continuous delivery pipelines for their AI-driven projects. This hands-on role involves building and maintaining a resilient, secure, and highly automated environment across cloud platforms, with a focus on AWS. The ideal candidate will be an expert in cloud architecture, security principles, and CI/CD, passionate about automation and leveraging AI to create intelligent, self-healing systems.
Benefits
- Competitive salary
- Employee benefits
- Continuous professional development and training opportunities
- 25 days of annual leave
- 24/7 EAP
- Health and wellbeing supports
- Employee perks and benefits
Qualifications
- Proven experience in a DevOps/Infrastructure Engineering role with a focus on automation.
- Proficiency in managing cloud infrastructure on AWS.
- Experience supporting infrastructure for ML/AI projects (MLOps) is highly desirable.
- Deep, hands-on experience with Infrastructure-as-Code using Terraform.
- Hands-on experience with containerization technologies (Docker, Kubernetes) and networking (VPCs, Load Balancers).
- Expert-level knowledge of building and managing complex CI/CD pipelines, with a strong preference for GitHub Actions.
- Strong understanding of system architecture, security best practices, and compliance standards.
- Comfortable with scripting languages (e.g., Python, Bash) to build automation and tooling.
- Experience with the operational lifecycle of APIs and databases from an infrastructure perspective.
- Hands-on experience with modern observability and error tracking tools such as Sentry, Datadog, Prometheus, or Grafana.
- Deep technical curiosity and a passion for automation.
- Ability to act as a force multiplier, empowering the engineering team with the tools and processes needed to succeed.
- Ability to take complete ownership of your domain and be a reliable partner to engineering teams.
- Strategic thinker who can balance speed and safety, enabling developers to move fast without compromising on security or stability.
- Strong communicator who can explain complex technical concepts to a variety of audiences.
- Proactive in identifying potential issues, reducing technical debt, and improving the overall development lifecycle.
Responsibilities
- Architect, build, and manage scalable and secure infrastructure on AWS using Infrastructure-as-Code principles, primarily with Terraform.
- Develop and maintain bespoke automation scripts to accelerate project setup, on-demand environment creation, and other operational tasks.
- Champion and implement solutions like LocalStack to streamline local development and testing workflows for engineers.
- Provide expert guidance on systems architecture, ensuring infrastructure is designed for performance, scalability, and resilience.
- Collaborate with engineering teams to manage and automate the infrastructure for services, including APIs and databases, ensuring their performance and reliability.
- Develop and improve CI/CD pipelines using GitHub Actions, from code commit to production deployment.
- Integrate and manage automated testing, dependency updates, and security scans within the pipelines to ensure code quality and security.
- Empower engineers with tools and automation to reduce friction, manage technical debt, and focus on building products.
- Define and continuously improve release processes, ensuring smooth and predictable deployments.
- Act as the subject matter expert for security, compliance, and data flows within cloud infrastructure.
- Implement and manage security best practices and automated tooling (SAST/DAST, dependency scanning) to protect applications and data.
- Oversee the security and compliance of AI-related data flows, ensuring data sent to third-party services is minimized, anonymized, and not used for external training purposes.
- Ensure all infrastructure and processes adhere to legal and regulatory requirements, maintaining customer trust and data privacy.
- Implement and manage a robust observability strategy using tools like Sentry, Datadog, and native cloud services.
- Configure critical alerting and monitoring (e.g., AWS CloudWatch Alarms) and integrate them with notification services to ensure rapid response.
- Lead the incident management process for infrastructure-related issues, with a focus on root cause analysis and proactive prevention to minimize hotfixes.
- Champion the use of AI in operations, exploring and implementing tools for anomaly detection, predictive analysis, and automated remediation.
- Collaborate with the existing Core infrastructure engineer on business-as-usual projects to ensure strategic alignment across the company, while maintaining a primary focus on AI project initiatives.
Skills
AWS, Bash, CloudWatch, Datadog, Docker, GitHub, GitHub Actions, Grafana, Kubernetes, Prometheus, Python, Sentry, Terraform
Languages
Python, Bash
Relocation
No