Sr Manager, Platform Engineering

Flexential · Centennial, CO, United States
Experience
12+ yrs
Work arrangement
Onsite
Remuneration
170,000-200,000 USD per year
Location
Centennial, CO, United States
Visa sponsorship
Not specified

Job summary

Flexential is seeking a Platform Engineering leader to plan roadmaps, establish requirements, and develop and operate platform technologies including Observability, DevOps, ITSM, and Integrations. The role combines engineering management with hands-on technical work, and focuses on building foundational platforms for Flexential's IT services and enabling AIOps and AI infrastructure.

Benefits

  • Medical insurance
  • Telehealth
  • Dental insurance
  • Vision insurance
  • 401(k)
  • Health Savings Accounts (HSA)
  • Flexible Spending Accounts (FSA)
  • Life and AD&D insurance
  • Short-Term Disability
  • Long-Term Disability
  • Flexible Paid Time Off (PTO)
  • Leave of Absence
  • Employee Assistance Program
  • Wellness Program
  • Rewards and Recognition Program

Qualifications

  • 12+ years of relevant technical experience.
  • 4+ years in a management or Principal-level role leading an engineering team.
  • 8+ years of DevOps or Platform Engineering experience with end-to-end ownership of developer/infrastructure platforms, Kubernetes, Helm, ArgoCD, service-mesh, and containerized workloads.
  • 5+ years of GitOps or CI/CD experience, including GitLab CI/CD, pipeline authoring, and infrastructure-as-code delivery.
  • 8+ years of expert-level experience with automation frameworks and tooling, including Python, Terraform, and Ansible.
  • 8+ years of Infrastructure (Linux/VM) experience, including Linux systems administration, VM lifecycle (VMware vCenter/VCF), NetApp storage, and compute provisioning.
  • 3+ years of working knowledge of Networking, including TCP/IP, BGP/OSPF, and SNMP.
  • Strong understanding of, or 1+ years of experience with, AI tooling, MCP, agentic workflows, and SRE workflows (e.g., AIOps for anomaly detection, event correlation, and alert noise reduction on a Prometheus and Grafana stack).
  • 4+ years of experience with Secrets & Security, including CyberArk, Conjur, Vault, RBAC design, and compliance boundary architecture.
  • 4+ years of Engineering Management experience, including hiring, team building, performance management, and roadmap ownership for teams of 5+ engineers.
  • Hands-on experience with, or working knowledge of, Boomi integration platform as a service (iPaaS) technologies.
  • Experience designing and developing DR test applications/automation and process workflows for corporate BCP execution.
  • Hands-on experience with AWS products under the AWS Well-Architected Framework and a multi-account model, developing compute, storage, and network IaaS and PaaS services for IT applications.
  • Hands-on experience working with BAS/BMS systems in a data center/OT environment.

Responsibilities

  • Lead the design, development, deployment, and operational management of automated, resilient, highly available, self-healing, secure platforms with AI-native capabilities for IT needs.
  • Lead, build, and manage the Platform Engineering team, including hiring, mentoring, performance management, and technical roadmap ownership.
  • Plan, build, and operate an OpenTelemetry Observability platform using Grafana, Mimir, Loki, Tempo, and Alertmanager on Kubernetes/RKE2 with Helm and ArgoCD.
  • Build an automated federated Observability Edge Stack, including Prometheus, OTel collector nodes, Zabbix auto-discovery, and Prometheus scrape profile library.
  • Design, develop, and manage engineering lifecycle platforms for a high-velocity, secure SDLC using GitLab.
  • Build and operate Infrastructure as Code (IaC) and CI/CD platforms, including GitLab CI/CD, Terraform, Ansible AWX, Helm, and ArgoCD.
  • Own, enhance, and operate critical IT platform technologies such as Boomi for integrations and AWS for Cloud environments.
  • Establish and enforce platform security posture, including secrets management via CyberArk/Conjur, RBAC, mTLS, compliance boundary design, and zero-inbound telemetry architecture.
  • Build and integrate ITSM capabilities for various platforms, such as automated incident creation, CI enrichment, and CMDB correlation.
  • Define and implement extensibility patterns, including AIOps hooks for anomaly detection, event correlation pipeline design, and integration with ML/AI tooling.
  • Partner with IT and business teams for application development, requirements capture, delivery validation, and integration needs.
  • Represent platform engineering in cross-functional architecture reviews and executive-level program updates.
  • Perform management and technical duties for team and operational resilience, including team building and on-call rotation.

Skills

Ansible, Argo CD, AWS, GitLab, GitLab CI, Grafana, Helm, Kubernetes, Linux, Loki, Mimir, OpenTelemetry, Prometheus, Python, Tempo, Terraform, Vault, VMware

Travel

Travel may be required for team or project events.

Relocation

No