IT System Specialist – Platform Engineering & Operations (m/f/d)
thyssenkrupp · Kiel, SH, Germany · Remote
Remuneration
Not specified
Location
Kiel, SH, Germany
Visa sponsorship
Not specified
Job summary
thyssenkrupp Marine Systems (TKMS) is seeking an IT System Specialist for Platform Engineering & Operations. This role involves designing, installing, and automating a highly available Kubernetes platform, providing reliable PaaS and SaaS services, and implementing secure build pipelines. The specialist will also be responsible for lifecycle management of Linux hosts and ensuring security compliance.
Qualifications
- Completed vocational training in IT or a degree in computer science.
- CKA/CKAD or LFCS certification preferred.
- Several years of professional experience with Kubernetes environments.
- Ability to plan, install, and operate production on-premises Kubernetes clusters (bare metal or virtualized).
- Experience with smooth upgrades and reliable backup and disaster recovery strategies.
- Ability to take project or product responsibility for building and operating internal PaaS/SaaS services.
- Ability to define SLAs and monitor ongoing operations for end-user teams.
- Consistent use of Infrastructure as Code tools and GitOps principles.
- Experience building and operating Observability Stacks.
- Ability to perform incident management (alert handling, root cause analysis, runbook creation).
- Ability to use metrics for proactive service optimization.
- In-depth knowledge of Linux system administration (RHEL).
- Knowledge of container security, IT service management, and customer-oriented communication.
- Analytical and structured work approach.
- Ability to identify and solve complex infrastructure problems.
- Flexible, resilient, and proactive work style.
- Commitment to continuously optimizing the platform.
Responsibilities
- Design, install, and automate a highly available Kubernetes platform.
- Plan upgrades and implement backup and disaster recovery strategies.
- Ensure system readiness at all times.
- Provide reliable PaaS and SaaS services to internal end-user teams.
- Advise on architecture and deployment strategies.
- Proactively respond to end-user requirements.
- Continuously monitor compliance with Service Level Agreements (SLAs).
- Develop declarative Infrastructure as Code (IaC) playbooks.
- Utilize modern GitOps pipelines (Argo CD/Flux) for traceable and automated infrastructure changes.
- Establish secure build pipelines.
- Operate a private container registry.
- Implement runtime security according to best practices.
- Implement and operate a complete Observability Stack.
- Set up meaningful metrics and Service Level Agreements (SLAs).
- Process alerts in real-time.
- Conduct in-depth root cause analyses.
- Document results in comprehensive reports and runbooks.
- Manage the lifecycle of Linux hosts.
- Automate patch and hardening processes according to CIS benchmarks.
- Ensure continuous security compliance through proactive monitoring.
Skills
Argo CD, Flux, Kubernetes, Linux, RHEL
Relocation
No