Jobs / vCluster Labs

AI Infrastructure Engineer

Apply Now

vCluster Labs · Deutschland

DeutschlandExp: 5+ yrs150,000-200,000 EUR/yearlyRemote

Apply Now

Remuneration

150,000-200,000 EUR/yearly

Location

Deutschland

Visa sponsorship

Not specified

Job summary

As vCluster’s AI Infrastructure Specialist, you will work directly with customers at the earliest and most critical stage of their journey: from bare metal GPU nodes through to a production-ready deployment. This is not a traditional professional services role; you operate pre-sale as part of a proof of value engagement scoped to reach production. You will be one of the first team members a neocloud or AI Factory engages with at a technical depth, and the playbooks you develop will scale the motion for the next hire and customer.

Qualifications

5+ years of experience deploying and operating Kubernetes in production, ideally on bare metal or in high-complexity environments.
Practical knowledge of NVIDIA GPU Operators, CUDA tooling, and systems-level configuration for GPU nodes.
Deep understanding of CNI plugins, overlay networks, load balancing, and connectivity diagnosis in layered environments.
Experience with persistent volume configuration, CSI drivers, and distributed systems like Ceph, Rook, Weka, or Longhorn.
Comfort operating in ambiguous, fast-moving environments.
Thrive in environments that reject legacy tech and prefer a modern stack.
Experience writing automation scripts with Bash, Python, or Go (bonus).
Relevant certifications such as CKA or experience writing Kubernetes Operators (bonus).
Experience with inference serving, GPU scheduling, and tooling around LLM deployment (bonus).
Experience building AI Automation in documentation to contribute to a shared knowledge base (bonus).

Responsibilities

Lead technical deployments for GPU neocloud and AI Factory customers, from bare metal configuration to a validated vCluster environment.
Configure and troubleshoot bare metal GPU node infrastructure, including CNI configuration, GPU Operator setup, distributed storage backends, and RDMA/InfiniBand.
Deploy and validate Kubernetes and vCluster to provide GPU-powered managed Kubernetes.
Work alongside customer teams to build self-sufficiency, ensuring independent operation and growth of the platform.
Document reusable playbooks and deployment architectures.
Collaborate with Engineering and Product to surface recurring infrastructure challenges, providing direct feedback into the roadmap.
Join Sales in the pre-sales process where deep infrastructure work is required for proof of value.

Skills

BashCephGitHubGitLabGoKubernetesMakePython

Certifications

CKA (Certified Kubernetes Administrator)

Relocation

Apply Now