URBN Senior DevOps Engineer
URBN · Philadelphia, PA, United States
Philadelphia, PA, United States · Exp: 5+ yrs · Remote
Remuneration
Not specified
Location
Philadelphia, PA, United States
Visa sponsorship
Not specified
Job summary
URBN is hiring a Senior DevOps Engineer to own and evolve the platform layer for ecommerce sites, order management systems, and integration pipelines. This role involves ensuring high availability of customer-facing systems and building infrastructure for enterprise AI and agentic automation. The engineer will partner with engineering teams to translate ambiguous problems into stable, scalable systems.
Benefits
Medical, Dental, Vision, Paid time off, Employee discounts, Retirement savings
Qualifications
- 5+ years in DevOps, Platform Engineering, or SRE roles.
- Deep expertise with Kubernetes (EKS, GKE, or AKS) in production.
- Experience with Infrastructure-as-Code using Terraform.
- Experience with cloud-native architecture on AWS, GCP, or Azure.
- Experience with CI/CD tooling such as GitHub Actions, Argo CD, or Tekton.
- Strong scripting and automation skills in Python or Go.
- Experience with observability stacks like OpenTelemetry, Prometheus, Grafana, or New Relic.
- Demonstrated experience shipping and operating production ML or AI systems.
- Hands-on experience with LLM APIs (Anthropic, OpenAI, or AWS Bedrock) is preferred.
- MCP (Model Context Protocol) server development or deployment experience is preferred.
- Experience with vector database operations (Pinecone, Weaviate, Qdrant, or pgvector) is preferred.
- Experience building platforms at enterprise scale (1,000+ internal users) is preferred.
- Developer experience (DX) engineering background is preferred.
Responsibilities
- Support and evolve infrastructure for ecommerce sites, ensuring high availability, performance, and resilience during peak traffic events.
- Operate and improve infrastructure for the Order Management System (OMS) and the integration layers connecting commerce, fulfillment, ERP, and third-party services.
- Design and maintain deployment pipelines for Java, Node, and Python with standards for testing, rollback, and release safety.
- Build tooling and abstractions for engineering teams to self-serve infrastructure, standardize deployments, and ship faster.
- Architect and operate the platform layer for enterprise AI initiatives, including LLM orchestration, MCP server deployments, and agent execution runtimes, with a focus on security and cost efficiency.
- Partner with product and data teams to build infrastructure that makes AI capabilities accessible, including RAG pipelines, vector search, embedding services, and API gateways.
- Own the observability stack across commerce and AI systems, including distributed tracing, metrics, alerting, and on-call support.
- Implement and maintain controls for access governance, secrets management, PII handling, and audit logging across commerce-critical and AI workloads.
- Monitor and optimize cloud spend across ecommerce infrastructure, integration services, and AI inference using autoscaling, rightsizing, and intelligent traffic routing.
Skills
AKS, Argo CD, AWS, Azure, EKS, GCP, GitHub, GitHub Actions, GKE, Go, Grafana, Java, Kubernetes, New Relic, OpenTelemetry, Prometheus, Python, Tekton, Terraform
Certifications
CKS, CISSP, CCSP
Languages
Python, Go, Java, Node
Relocation
No