Lead Site Reliability Engineer Market Risk
JPMorganChase · Houston, TX, United States
Houston, TX, United States · Exp: 5+ yrs · Onsite
Remuneration
Not specified
Location
Houston, TX, United States
Visa sponsorship
Not specified
Job summary
As a Lead Site Reliability Engineer at JPMorgan Chase within Market Risk Technology, you will hold a leadership role, demonstrating strong knowledge across multiple technical domains and advising others on technical and business issues. You will lead resiliency design reviews, break down complex problems, act as a technical lead for products, and mentor other engineers.
Qualifications
- Formal training or certification in software engineering concepts with 5+ years of applied experience.
- Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and site reliability best practices.
- Ability to implement site reliability practices within an application or platform.
- Fluency in at least one programming language or framework, such as Python, Java (Spring Boot), or .NET.
- Deep knowledge of software applications and technical processes with emerging depth in one or more technical disciplines.
- Proficiency and experience in observability including white and black box monitoring, SLO alerting, and telemetry collection.
- Proficiency in continuous integration and continuous delivery tools.
- Experience with containers and container orchestration.
- Experience with troubleshooting common technologies and issues.
- Ability to identify and solve problems related to complex data structures and algorithms.
- Drive to self-educate and evaluate new technology.
- Ability to teach new programming languages to team members (preferred).
- Ability to collaborate across different levels and stakeholder groups (preferred).
Responsibilities
- Demonstrate and champion site reliability culture and practices, exerting technical influence throughout the team.
- Lead initiatives to improve reliability and stability of applications and platforms using data-driven analytics.
- Collaborate with team members to identify service level indicators and establish service level objectives and error budgets.
- Demonstrate expertise across technical domains and proactively identify and resolve technology-related bottlenecks.
- Act as the main point of contact during major incidents, identifying and solving issues quickly to avoid financial losses.
- Document and share knowledge within the organization via internal forums and communities of practice.
Skills
Datadog, Docker, .NET, Dynatrace, ECS, GitLab, Grafana, Java, Jenkins, Kubernetes, Prometheus, Python, Splunk, Terraform
Languages
Python, Java Spring Boot, .NET
Relocation
No