Senior Site Reliability Engineer (m/f/d)

Impower GmbH · München, BY, Deutschland

München, BY, Deutschland · Exp: 5+ yrs · Hybrid

Remuneration

Not specified

Location

München, BY, Deutschland

Visa sponsorship

Not specified

Job summary

Impower is seeking a Senior Site Reliability Engineer to own the reliability and operational foundations of its AI-driven ERP platform. This role involves working with Kubernetes, AWS, CI/CD, observability, and security to ensure scalable, resilient, and secure systems. The ideal candidate will have 5+ years of experience building and operating production systems in cloud environments.

Benefits

Hybrid setup · Flexible hours · Ownership · Meaningful impact · Modern tech stack · Growth opportunities · Supportive culture · Autonomy · Trust · Collaboration · Onboarding guidance

Qualifications

5+ years building and operating production systems in cloud environments
Real ownership of non-trivial systems at scale
Deep, hands-on production Kubernetes experience, including operators, networking, autoscaling, and debugging
Strong working knowledge of EKS, RDS, ALB, IAM, VPC, S3, and operational realities of running services on AWS
Solid Terraform experience with disciplined IaC practices
Hands-on experience with ArgoCD, Helm, or equivalent declarative deployment tooling
Security expertise in cloud-native environments: IAM best practices, secrets management, secure network architecture, container and dependency vulnerability scanning, secure SDLC principles
Familiarity with compliance frameworks (e.g., ISO 27001, SOC 2)
Ability to proactively identify risks and contribute to incident response and audit readiness
Experience building dashboards, defining SLOs, running incidents, and applying what was learned to improve systems
Comfortable scripting and building tooling in Python, Go, Bash, or similar
Excellent written and verbal English (C1+)
Ability to document decisions, write effective runbooks, and clearly explain tradeoffs

Responsibilities

Own platform reliability end-to-end: co-own the Kubernetes-based platform on AWS, including ingress, autoscaling, service mesh, config, and secrets, and ensure the platform scales with growth
Drive CI/CD excellence: evolve the GitLab, Terraform, and ArgoCD/Helm pipelines for faster, safer delivery of Java/Spring Boot and React applications, and provide self-service capabilities for product teams
Manage cloud infrastructure: design and operate scalable AWS infrastructure (EKS, RDS, ALB, IAM, VPC, S3) using Infrastructure as Code, and maintain strong IaC discipline and clear change management
Strengthen observability: improve the Sentry, Grafana, Prometheus, and Loki setup for SLO definition, fast debugging, and confident service operation
Lead on security: own the security posture across infrastructure and application layers (IAM, secrets management, network segmentation, container and dependency scanning, vulnerability management, supply chain security, audit readiness), and embed security as a design constraint
Improve incident response: strengthen on-call practices, runbooks, and post-incident learning
Enable product teams: provide tooling, guidance, and self-service capabilities for better operational and deployment practices
Support the broader platform surface, including Temporal workflows, PostgreSQL operations, S3, the Estuary CDC pipeline, and AI service infrastructure on GCP/Azure

Skills

Argo CD · AWS · Azure · Bash · EKS · GCP · GitLab · Go · Grafana · Helm · IAM · Java · Kubernetes · Loki · PostgreSQL · Prometheus · Python · S3 · Sentry · Terraform · TypeScript · GitLab CI

Languages

English

Industry

Property management · SaaS

Relocation
