Site Reliability Engineer III

JPMorganChase · Jersey City, NJ, United States
Jersey City, NJ, United States · Exp: 7+ yrs · 133,000-185,000 USD/yearly · Remote
Remuneration
133,000-185,000 USD/yearly
Location
Jersey City, NJ, United States
Visa sponsorship
Not specified

Job summary

Seeking a Site Reliability Engineer III to design, implement, and maintain CI/CD pipelines, manage AWS cloud infrastructure, and implement containerized workloads. Responsibilities include establishing monitoring, alerting, and SLOs, leading incident response, and applying security best practices. The role involves partnering with development teams, documenting processes, and fostering continuous learning.

Qualifications

  • 7+ years of experience in DevOps, SRE, or Cloud Automation.
  • Hands-on experience with AWS services (IAM, VPC, EC2, ALB/NLB, S3, RDS/Aurora, CloudWatch, EKS/ECS, Lambda, Route 53).
  • Experience building CI/CD with GitHub Actions, GitLab CI, Jenkins, or Azure DevOps.
  • Proficiency in at least one scripting or programming language (Python, Bash, Java, .NET).
  • Solid understanding of Linux/Unix systems and networking fundamentals.
  • Experience with secrets and configuration management tools (AWS Secrets Manager/SSM, Vault).
  • Experience with observability and monitoring tools (Grafana, Dynatrace, Prometheus, Datadog, Splunk).
  • Familiarity with containers and container orchestration (Docker, Kubernetes, ECS).
  • Strong communication skills.
  • Ability to work independently or in teams.
  • Proactive, innovative, and passionate about learning.
  • Familiarity with modern front-end technologies (Preferred).
  • Experience with large-scale distributed systems (Preferred).
  • Knowledge of networking and security best practices (Preferred).
  • Strong collaboration and communication skills (Preferred).

Responsibilities

  • Design, implement, and maintain end-to-end CI/CD pipelines for application and infrastructure delivery.
  • Support release management and change control processes.
  • Build, manage, and govern AWS cloud infrastructure using Infrastructure as Code tools.
  • Ensure consistency across environments.
  • Implement and manage containerized workloads and deployment workflows using Docker, Kubernetes/EKS, and ECS.
  • Establish monitoring and alerting, and define SLOs based on service level indicators (SLIs).
  • Lead incident response, root cause analysis, and postmortem processes to minimize customer impact.
  • Apply security best practices including IAM least privilege, secrets management, and policy-as-code.
  • Enforce governance and reduce risk across all environments.
  • Drive system reliability and cost efficiency through autoscaling strategies, right-sizing, and performance tuning.
  • Proactively resolve issues.
  • Standardize and automate environment management across dev, test, and production.
  • Enforce governance controls and ensure parity across stages.
  • Design and develop robust internal tooling and software solutions.
  • Enhance system performance, scalability, and operational efficiency.
  • Partner with development teams and stakeholders to identify reliability and scalability improvements.
  • Participate in on-call rotation and support cross-functional delivery.
  • Document processes and contribute to communities of practice.
  • Foster a team culture grounded in diversity, inclusion, respect, and continuous learning.
  • Automate provisioning, configuration management, patching, backups, and operational procedures.

Skills

AWS, Azure, Azure DevOps, Bash, AWS CDK, CloudFormation, CloudWatch, Datadog, Docker, .NET, Dynatrace, ECS, EKS, GitHub, GitHub Actions, GitLab, GitLab CI, Grafana, IAM, Java, Jenkins, Kubernetes, AWS Lambda, Linux, Prometheus, Python, Route 53, S3, Secrets Manager, Splunk, Terraform, Vault

Languages

Python, Bash, Java, .NET

Work schedule

On-call rotation

Relocation

No