Site Reliability Engineer (SRE) - Montreal, QC - DESS

NavitasPartners · Saint-Jérôme, QC, Canada
Onsite
Remuneration
Not specified
Location
Saint-Jérôme, QC, Canada
Visa sponsorship
Not specified

Job summary

Seeking an experienced Site Reliability Engineer (SRE) with a strong background in BFSI (banking, financial services, and insurance), Public Sector, or Telecom environments. The ideal candidate will support highly available, scalable, and resilient platforms while driving automation, observability, and operational excellence initiatives.

Qualifications

  • Experience as a Site Reliability Engineer (SRE) within BFSI, Public Sector, or Telecom environments.
  • Experience supporting banking systems, government platforms, OSS/BSS, billing, charging, or telecom customer platforms.
  • Strong experience with monitoring, observability, and incident management tools.
  • Experience supporting cloud platforms such as Azure, AWS, or Google Cloud Platform (GCP).
  • Understanding of governance, compliance, security, and regulatory requirements.
  • Experience with automation, scripting, and Infrastructure as Code.
  • Experience supporting enterprise-scale and mission-critical environments.
  • Strong troubleshooting and analytical skills.
  • Experience with Linux, containers, and cloud-native technologies.

Responsibilities

  • Support reliability, availability, and performance of enterprise applications and platforms across BFSI, Public Sector, and Telecom environments.
  • Support modernization initiatives for core banking, payments, citizen services, OSS/BSS platforms, billing systems, charging platforms, and customer-facing applications.
  • Build and maintain monitoring, observability, alerting, and incident management solutions.
  • Support telecom network operations, 5G platforms, edge computing environments, and enterprise digital services.
  • Implement automation and self-healing capabilities to improve platform reliability.
  • Participate in incident response, root cause analysis, and problem management activities.
  • Monitor application performance, capacity, and service health metrics.
  • Ensure compliance with security, governance, and regulatory requirements.
  • Support disaster recovery, business continuity, and resiliency initiatives.
  • Drive continuous improvement and operational excellence programs.

Skills

AWS, Azure, GCP, Linux

Certifications

SRE, Cloud, and DevOps certifications

Industry

BFSI, Public Sector, Telecom

Relocation

No