Site Reliability Engineer III (Tue - Sat)

CME Group · Belfast, NIR, United Kingdom
Exp: 3-5 yrs · Hybrid
Remuneration: Not specified
Location: Belfast, NIR, United Kingdom
Visa sponsorship: Not specified

Job summary

CME Group is seeking a Site Reliability Engineer III to build, operate, and scale systems in its Markets portfolio, with a specific focus on the CME Globex trading platform. The role involves owning reliability initiatives, leading technical discussions, automating tasks, managing incidents, and mentoring junior SREs. The engineer will also contribute to the architectural design for the migration to Google Cloud Platform (GCP).

Benefits

  • Bonus Programme
  • Generous shift allowance
  • Equity Programme
  • Employee Stock Purchase Plan (ESPP)
  • Private Medical and Dental coverage
  • Mental Health Benefit Programme
  • Group Pension Plan
  • Income Protection
  • Life Assurance
  • Cycle To Work
  • EV Car Benefit Scheme
  • Gym Membership
  • Family Leave
  • Education Assistance
  • Ongoing Employee Development Training/Certification

Qualifications

  • 3-5+ years of professional experience in Site Reliability, DevOps, Software, or Systems Engineering.
  • Strong, hands-on experience administering and troubleshooting Linux-based production systems.
  • Proficiency in programming with Python or Go.
  • Track record of automating complex operational tasks.
  • Proven ability to lead technical initiatives.
  • Ability to solve complex problems with a high degree of autonomy.
  • Excellent communication skills.
  • Ability to articulate complex technical concepts to diverse audiences.
  • Proactive and ownership-oriented mindset.

Responsibilities

  • Design, build, and refine monitoring, alerting, and observability solutions.
  • Drive continuous improvement of SLIs and SLOs for faster issue detection and resolution.
  • Take ownership of reliability-focused projects from design to implementation.
  • Collaborate with product teams to ensure new features are scalable, resilient, and safe.
  • Lead technical discussions, presenting solution options and proposals with clear trade-offs.
  • Proactively identify and eliminate toil through robust automation.
  • Improve system reliability and team velocity through automation.
  • Lead incident response, owning resolution of significant incidents.
  • Ensure rapid system recovery and drive meaningful action from blameless post-mortems.
  • Act as a technical mentor and point of escalation for L1 and L2 SREs.
  • Foster growth of junior SREs through code reviews and paired work.
  • Contribute ideas to the product backlog.
  • Play an active role in architectural design for migration to Google Cloud Platform (GCP).

Skills

Docker, GCP, GKE, Go, Grafana, Kubernetes, Linux, OpenTelemetry, Prometheus, Python

Languages

Python, Go

Work schedule

Tuesday - Saturday

Industry

Financial services technology

Relocation

No