Lead Site Reliability Engineer Market Risk

JPMorganChase · Houston, TX, United States
Houston, TX, United States · Exp: 5+ yrs · Onsite
Remuneration
Not specified
Location
Houston, TX, United States
Visa sponsorship
Not specified

Job summary

As a Lead Site Reliability Engineer at JPMorganChase within Market Risk Technology, you will hold a leadership role, demonstrating strong knowledge across multiple technical domains and advising others on technical and business issues. You will lead resiliency design reviews, break down complex problems, act as a technical lead for products, and mentor other engineers.

Qualifications

  • Formal training or certification in software engineering concepts with 5+ years of applied experience.
  • Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and site reliability best practices.
  • Ability to implement site reliability practices within an application or platform.
  • Fluency in at least one programming language or framework, such as Python, Java (Spring Boot), or .NET.
  • Deep knowledge of software applications and technical processes with emerging depth in one or more technical disciplines.
  • Proficiency and experience in observability, including white-box and black-box monitoring, SLO-based alerting, and telemetry collection.
  • Proficiency in continuous integration and continuous delivery tools.
  • Experience with containers and container orchestration.
  • Experience with troubleshooting common technologies and issues.
  • Ability to identify and solve problems related to complex data structures and algorithms.
  • Drive to self-educate and evaluate new technology.
  • Ability to teach new programming languages to team members (preferred).
  • Ability to broaden working relationships and collaborate across different levels and stakeholder groups (preferred).

Responsibilities

  • Demonstrate and champion site reliability culture and practices, exerting technical influence throughout the team.
  • Lead initiatives to improve reliability and stability of applications and platforms using data-driven analytics.
  • Collaborate with team members to identify service level indicators and establish service level objectives and error budgets.
  • Demonstrate expertise across technical domains and proactively identify and resolve technology-related bottlenecks.
  • Act as the main point of contact during major incidents, identifying and solving issues quickly to avoid financial losses.
  • Document and share knowledge within the organization via internal forums and communities of practice.

Skills

Datadog · Docker · .NET · Dynatrace · ECS · GitLab · Grafana · Java · Jenkins · Kubernetes · Prometheus · Python · Splunk · Terraform

Languages

Python · Java Spring Boot · .NET

Relocation

No