AI Platform Engineer
ClearRoute · Edinburgh, SCT, United Kingdom · Hybrid
Remuneration
Not specified
Location
Edinburgh, SCT, United Kingdom
Visa sponsorship
Not specified
Job summary
ClearRoute is seeking an AI Platform Engineer to design, build, and operate modern developer platforms for clients across a range of industries. The role involves leading technical workstreams, managing CI/CD pipelines, and integrating AI/ML workloads into existing systems.
Qualifications
- Solid hands-on experience with at least one major cloud provider (AWS preferred; GCP or Azure accepted).
- Production experience with Kubernetes and familiarity with the surrounding ecosystem (Helm, ArgoCD / Flux, Karpenter, Cilium, etc.).
- Strong infrastructure-as-code skills: Terraform is the baseline; other tools are a bonus.
- Practical experience designing or operating CI/CD systems (GitHub Actions, GitLab CI, Tekton, or similar).
- Comfort working in Linux environments and writing automation in Python, Bash, or Go.
- The ability to explain complex technical concepts clearly to a mixed audience.
- A consulting mindset: you care about solving the client's actual problem, not just handing over a deliverable.
Responsibilities
- Design and deliver internal developer platforms (IDPs) that improve developer experience and accelerate software delivery.
- Build and maintain infrastructure-as-code using Terraform, Pulumi, or CDK, and enforce code review and testing standards.
- Manage and optimise Kubernetes clusters (EKS, GKE, AKS), including multi-tenancy, networking, RBAC, and cost controls.
- Own CI/CD pipelines end-to-end: from source control policies through build, test, security scanning, artefact management, and deployment.
- Implement secrets management and certificate lifecycle automation using HashiCorp Vault or equivalent.
- Embed SRE practices: SLOs, error budgets, runbooks, on-call design, and blameless post-mortems.
- Integrate security tooling (SAST, DAST, dependency scanning, policy-as-code) into delivery pipelines.
- Design and test disaster-recovery strategies; automate them where possible.
- Ensure compliance with client security standards and relevant regulatory frameworks.
- Design and operate infrastructure for AI/ML workloads: GPU node pools, model-serving runtimes (Triton, vLLM, BentoML), and vector database deployments (pgvector, Weaviate, Qdrant).
- Build and maintain MLOps pipelines (model training, versioning, evaluation, and promotion to production) using platforms such as Kubeflow, MLflow, or cloud-native equivalents.
- Integrate LLM APIs and AI agent frameworks into existing developer platforms, including prompt management, observability, cost controls, and rate-limit guardrails.
- Advise clients on AI readiness: data infrastructure, governance, security controls (model access policies, output filtering), and the organisational changes that sit alongside the technical work.
- Stay current with the fast-moving AI tooling landscape and bring relevant ideas back to the team and to clients.
- Lead technical discovery sessions and platform assessments with client engineering and architecture teams.
- Translate client requirements into clear technical plans and communicate trade-offs to both technical and non-technical stakeholders.
- Coach and upskill client platform and application engineers through pairing, workshops, and code review.
- Produce high-quality documentation, architecture decision records (ADRs), and runbooks that clients can own after the engagement.
Skills
AKS, Ansible, Argo CD, AWS, Azure, Bash, AWS CDK, Chef, Cilium, EKS, Flux, GCP, GitHub, GitHub Actions, GitLab, GitLab CI, GKE, Go, Helm, Kafka, Kubernetes, Linux, Pulumi, Python, Tekton, Terraform, Vault
Relocation
No