Site Reliability Engineer (m/f/d)
gridscale GmbH · Köln, NW, Deutschland
Köln, NW, Deutschland · Remote
Remuneration
Not specified
Location
Köln, NW, Deutschland
Visa sponsorship
Not specified
Job summary
As a Site Reliability Engineer, you will join the team that builds the framework for packaging and deploying software solutions on top of OPCP. You will work on the design, automation, and operation of the platform and drive continuous improvement. The environment is security-oriented and highly automated, and you are expected to keep an overview in ambiguous situations and make well-founded decisions.
Qualifications
- Deep knowledge of Linux/Unix system administration and internals.
- Hands-on experience with cloud platforms, especially OpenStack, including infrastructure provisioning and management.
- Proficiency with Infrastructure as Code tools such as Terraform and Ansible.
- Experience with containerization and orchestration technologies, especially Kubernetes.
- Skilled in building and maintaining CI/CD pipelines.
- Hands-on experience with modern observability stacks, including metrics, logs, and traces.
- Proficiency in scripting and automation using Python, Bash, and Go.
- Understanding of security fundamentals including IAM, secrets management, hardening, and compliance.
- Understanding of distributed systems concepts such as the CAP theorem, consensus, and fault tolerance.
- Strong problem-solving skills, adaptability, and strict documentation discipline.
- Ability to thrive in collaborative product-team environments with a strong ownership mentality and a blameless culture mindset.
- Strong communication skills and a passion for continuous improvement.
Responsibilities
- Develop the framework for building and deploying Cloud Store packages.
- Drive the ongoing development of the Kubernetes stack and implement GitOps workflows.
- Support other teams and help them onboard with OPCP.
- Actively participate in system analysis and derive improvements.
- Develop and maintain Infrastructure as Code.
- Participate in an on-call rotation.
Skills
Ansible, Bash, Flux, Go, Grafana, IAM, Kubernetes, Linux, OpenStack, OVH, Prometheus, Python, Terraform
Relocation
No