Lead Site Reliability Engineer Market Risk

JPMorganChase · Houston, TX, United States

Houston, TX, United States · Exp: 5+ yrs · Onsite

Remuneration

Not specified

Location

Houston, TX, United States

Visa sponsorship

Not specified

Job summary

As a Lead Site Reliability Engineer at JPMorganChase within Market Risk Technology, you will hold a leadership role, demonstrating strong knowledge across multiple technical domains and advising others on technical and business issues. You will lead resiliency design reviews, break down complex problems, act as a technical lead for products, and mentor other engineers.

Qualifications

Formal training or certification in software engineering concepts with 5+ years of applied experience.
Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and site reliability best practices.
Ability to implement site reliability practices within an application or platform.
Fluency in at least one programming language or framework, such as Python, Java (Spring Boot), or .NET.
Deep knowledge of software applications and technical processes with emerging depth in one or more technical disciplines.
Proficiency and experience in observability, including white-box and black-box monitoring, SLO-based alerting, and telemetry collection.
Proficiency in continuous integration and continuous delivery tools.
Experience with containers and container orchestration.
Experience with troubleshooting common technologies and issues.
Ability to identify and solve problems related to complex data structures and algorithms.
Drive to self-educate and evaluate new technology.
Ability to teach new programming languages to team members (preferred).
Ability to collaborate across different levels and stakeholder groups (preferred).

Responsibilities

Demonstrate and champion site reliability culture and practices, exerting technical influence throughout the team.
Lead initiatives to improve reliability and stability of applications and platforms using data-driven analytics.
Collaborate with team members to identify service level indicators and establish service level objectives and error budgets.
Demonstrate technical expertise within technical domains and proactively identify and solve technology-related bottlenecks.
Act as the main point of contact during major incidents, identifying and solving issues quickly to avoid financial losses.
Document and share knowledge within the organization via internal forums and communities of practice.

Skills

Datadog, Docker, .NET, Dynatrace, ECS, GitLab, Grafana, Java, Jenkins, Kubernetes, Prometheus, Python, Splunk, Terraform

Languages

Python, Java (Spring Boot), .NET
