Site Reliability Engineer II

American Express · London, ENG, United Kingdom
London, ENG, United Kingdom · Hybrid
Remuneration
Not specified
Location
London, ENG, United Kingdom
Visa sponsorship
No visa sponsorship
Employment eligibility to work with American Express in the UK is required as the company will not pursue visa sponsorship for these positions.

Job summary

The Site Reliability Engineer II collaborates with engineering teams to enhance system resilience, scalability, and performance through feature development, automation, architectural design, resiliency testing, and disaster recovery planning, while promoting best practices for continuous improvement. This role is part of the Enterprise Technology Services organization, which supports technology, digital, and data capabilities globally for American Express.

Benefits

  • Competitive base salaries
  • Bonus incentives
  • Support for financial well-being and retirement
  • Comprehensive medical, dental, vision, life insurance, and disability benefits
  • Flexible working model with hybrid, onsite, or virtual arrangements
  • Generous paid parental leave policies
  • Free access to global on-site wellness centers staffed with nurses and doctors
  • Free and confidential counseling support through the Healthy Minds program
  • Career development and training opportunities

Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, Engineering, or comparable experience
  • Advanced degree preferred
  • Knowledge of a modern observability stack, including Splunk, Elasticsearch, Prometheus, and Grafana
  • Knowledge of containerization technologies such as Kubernetes and Docker
  • Knowledge of microservices architecture
  • Knowledge of observability tools and methodologies, including logging, monitoring, tracing, and performance analysis platforms
  • Knowledge of cloud-based Site Reliability Engineering (SRE) practices
  • Experience with public cloud platforms such as AWS, Azure, or Google Cloud
  • Experience in software development or technology operations with a focus on Site Reliability Engineering
  • Experience with Linux/Unix systems
  • Experience with object-oriented programming languages such as Java
  • Experience with scripting languages such as Python and Bash

Responsibilities

  • Collaborate with engineering teams to enhance system resilience, scalability, and performance
  • Perform feature development, automation, architectural design, resiliency testing, and disaster recovery planning
  • Promote best practices for continuous improvement

Skills

AWS, Azure, Bash, Docker, Elasticsearch, GCP, Grafana, Java, Kubernetes, Linux, Prometheus, Python, Splunk

Degrees

  • Bachelor’s degree in Computer Science
  • Bachelor’s degree in Information Technology
  • Bachelor’s degree in Engineering
  • Advanced degree

Relocation

No