AI Supercomputing Infrastructure Engineer (x3 FTE)
University of Bristol · Bristol, ENG, United Kingdom
Bristol, ENG, United Kingdom · Full time · 50,253-58,225 GBP/yearly · Hybrid
Remuneration
50,253-58,225 GBP/yearly
Location
Bristol, ENG, United Kingdom
Visa sponsorship
Not specified
Job summary
The Bristol Centre for Supercomputing (BriCS) is seeking AI Supercomputing Infrastructure Engineers (3 FTE) to join its team. The role involves developing and operating compute and software infrastructure for the Isambard-AI National Artificial Intelligence Research Resource and the Isambard3 Tier-2 Supercomputer. Successful candidates will build and operate infrastructure and compute platforms for researchers, focusing on large, highly available supercomputing services.
Qualifications
- Ability to quickly obtain deep technical understanding of new domains.
- Self-directed in identifying important problems.
- Experience building, maintaining, and securing modern software-defined supercomputing systems.
- Enjoy working with world-class domain and AI researchers.
- Experience building small to large clusters or physical/software-defined systems.
- Motivation to scale systems to national impact.
- Experience building large distributed, highly available systems.
- Desire to support open national-scale research in a cybersecurity-compliant manner.
- Domain expertise in NetOps or DevOps.
- Degree or equivalent practical experience in computer science, computational or ML/AI research, or natural science with strong computer science/computational research competence.
- Good organizational skills to manage own workload and mentor less experienced team members.
Responsibilities
- Source hardware and design systems.
- Deploy software-defined infrastructure using Kubernetes and Terraform/OpenTofu.
- Build and operate platforms for researchers to conduct leading-edge research.
- Optimize and refine software for environmental and economic efficiency.
- Build and operate infrastructure and compute platforms for researchers.
- Utilize Python, Rust, Terraform/OpenTofu, Kubernetes, Git, and Bash.
- Design and operate large, highly available supercomputing services as software-defined infrastructures.
- Integrate computational experiments.
- Design and operate massive-scale GPU and combined CPU/GPU workloads.
- Design and debug platforms.
- Collaborate with researchers to co-design solutions for new algorithms and software.
Skills
Bash, Git, Kubernetes, OpenTofu, Python, Rust, Terraform
Degrees
Computer Science, Computational Research, ML/AI Research, Natural Science
Work schedule
Monday - Friday, 35 hours per week
Contract length
Open-ended, with fixed funding until August 2030
Relocation
No