Site Reliability Engineer (m/f/x)
OVHcloud · Köln, NW, Deutschland
Köln, NW, Deutschland · Contract · Hybrid
Remuneration
Not specified
Location
Köln, NW, Deutschland
Visa sponsorship
Not specified
Job summary
As a Site Reliability Engineer, you will be part of the team responsible for building the framework used to package and deploy software solutions on top of OPCP. You will work on the design, automation, and operation of our platform and drive continuous improvements. You will also help new teams onboard onto the framework and OPCP. We are looking for someone who is comfortable in a security-oriented environment with a high degree of automation (GitOps). You are used to keeping a clear overview in ambiguous situations, analyzing systems, and making well-founded decisions based on that analysis.
Qualifications
- Deep knowledge of Linux/Unix system administration and internals
- Hands-on experience with cloud platforms, especially OpenStack, including infrastructure provisioning and management
- Proficiency with Infrastructure as Code tools such as Terraform and Ansible
- Experience with containerization and orchestration technologies, especially Kubernetes
- Skilled in building and maintaining CI/CD pipelines
- Hands-on experience with modern observability stacks, including metrics, logs and traces
- Proficiency in scripting and automation using Python, Bash, and Go
- Understanding of security fundamentals including IAM, secrets management, hardening and compliance
- Understanding of distributed systems concepts such as the CAP theorem, consensus and fault tolerance
- Strong problem-solving skills, adaptability and strict documentation discipline
- Ability to thrive in collaborative product-team environments with a strong ownership mentality and a blameless culture mindset
- Strong communication skills and a passion for continuous improvement
Responsibilities
- Develop the framework for building and deploying Cloud Store packages
- Drive the ongoing development of our Kubernetes stack and implement GitOps workflows (e.g., with FluxCD)
- Support other teams and help them onboard with OPCP
- Actively participate in system analysis and derive improvements, even when initial requirements are unclear
- Develop and maintain our Infrastructure as Code using tools like Ansible and Terraform
- Participate in an on-call rotation
Skills
Ansible, Bash, Flux, Go, IAM, Kubernetes, Linux, OpenStack, Python, Terraform
Languages
Python, Bash, Go
Work schedule
On-call rotation
Industry
IT, Technology, Product
Relocation
No