
AI Platform Engineer

ClearRoute · London, ENG, United Kingdom
Hybrid
Remuneration
Not specified
Location
London, ENG, United Kingdom
Visa sponsorship
Not specified

Job summary

ClearRoute is seeking an AI Platform Engineer to design, build, and operate modern developer platforms across various industries. The role involves leading technical workstreams, managing CI/CD pipelines, and integrating AI workloads into existing developer workflows.

Qualifications

  • Solid hands-on experience with at least one major cloud provider (AWS preferred; GCP or Azure accepted).
  • Production experience with Kubernetes and familiarity with the surrounding ecosystem (Helm, Argo CD / Flux, Karpenter, Cilium, etc.).
  • Strong infrastructure-as-code skills: Terraform is the baseline; other tools are a bonus.
  • Practical experience designing or operating CI/CD systems (GitHub Actions, GitLab CI, Tekton, or similar).
  • Comfort working in Linux environments and writing automation in Python, Bash, or Go.
  • The ability to explain complex technical concepts clearly to a mixed audience.
  • A consulting mindset: you care about solving the client's actual problem, not just handing over a deliverable.

Responsibilities

  • Design and deliver internal developer platforms (IDPs) that improve developer experience and accelerate software delivery.
  • Build and maintain infrastructure-as-code using Terraform, Pulumi, or CDK, and enforce code review and testing standards.
  • Manage and optimise Kubernetes clusters (EKS, GKE, AKS) including multi-tenancy, networking, RBAC, and cost controls.
  • Own CI/CD pipelines end-to-end: from source control policies through build, test, security scanning, artefact management, and deployment.
  • Implement secrets management and certificate lifecycle automation using HashiCorp Vault or equivalent.
  • Embed SRE practices: SLOs, error budgets, runbooks, on-call design, and blameless post-mortems.
  • Integrate security tooling (SAST, DAST, dependency scanning, policy-as-code) into delivery pipelines.
  • Design and test disaster-recovery strategies; automate them where possible.
  • Ensure compliance with client security standards and relevant regulatory frameworks.
  • Design and operate infrastructure for AI/ML workloads: GPU node pools, model-serving runtimes (Triton, vLLM, BentoML), and vector database deployments (pgvector, Weaviate, Qdrant).
  • Build and maintain MLOps pipelines covering model training, versioning, evaluation, and promotion to production, using platforms such as Kubeflow, MLflow, or cloud-native equivalents.
  • Integrate LLM APIs and AI agent frameworks into existing developer platforms, including prompt management, observability, cost controls, and rate-limit guardrails.
  • Advise clients on AI readiness: data infrastructure, governance, security controls (model access policies, output filtering), and the organisational changes that sit alongside the technical work.
  • Stay current with the fast-moving AI tooling landscape and bring relevant ideas back to the team and to clients.
  • Lead technical discovery sessions and platform assessments with client engineering and architecture teams.
  • Translate client requirements into clear technical plans and communicate trade-offs to both technical and non-technical stakeholders.
  • Coach and upskill client platform and application engineers through pairing, workshops, and code review.
  • Produce high-quality documentation, architecture decision records (ADRs), and runbooks that clients can own after the engagement.

Skills

AKS, Ansible, Argo CD, AWS, Azure, Bash, AWS CDK, Chef, Cilium, EKS, Flux, GCP, GitHub, GitHub Actions, GitLab, GitLab CI, GKE, Go, Helm, Kafka, Kubernetes, Linux, Pulumi, Python, Tekton, Terraform, Vault

Relocation

No