Site Reliability Engineer (SRE) - Montreal, QC - DESS
NavitasPartners · Saint-Jérôme, QC, Canada
Saint-Jérôme, QC, Canada · Onsite
Remuneration
Not specified
Location
Saint-Jérôme, QC, Canada
Visa sponsorship
Not specified
Job summary
Seeking an experienced Site Reliability Engineer (SRE) with a strong background in BFSI (banking, financial services, and insurance), Public Sector, or Telecom environments. The ideal candidate will support highly available, scalable, and resilient platforms while driving automation, observability, and operational excellence initiatives.
Qualifications
- Experience as a Site Reliability Engineer (SRE) within BFSI, Public Sector, or Telecom environments.
- Experience supporting banking systems, government platforms, OSS/BSS, billing, charging, or telecom customer platforms.
- Strong experience with monitoring, observability, and incident management tools.
- Experience supporting cloud platforms such as Azure, AWS, or Google Cloud Platform (GCP).
- Understanding of governance, compliance, security, and regulatory requirements.
- Experience with automation, scripting, and Infrastructure as Code.
- Experience supporting enterprise-scale and mission-critical environments.
- Strong troubleshooting and analytical skills.
- Experience with Linux, containers, and cloud-native technologies.
Responsibilities
- Support reliability, availability, and performance of enterprise applications and platforms across BFSI, Public Sector, and Telecom environments.
- Support modernization initiatives for core banking, payments, citizen services, OSS/BSS platforms, billing systems, charging platforms, and customer-facing applications.
- Build and maintain monitoring, observability, alerting, and incident management solutions.
- Support telecom network operations, 5G platforms, edge computing environments, and enterprise digital services.
- Implement automation and self-healing capabilities to improve platform reliability.
- Participate in incident response, root cause analysis, and problem management activities.
- Monitor application performance, capacity, and service health metrics.
- Ensure compliance with security, governance, and regulatory requirements.
- Support disaster recovery, business continuity, and resiliency initiatives.
- Drive continuous improvement and operational excellence programs.
Skills
AWS, Azure, GCP, Linux
Certifications
SRE, Cloud, DevOps certifications
Industry
BFSI, Public Sector, Telecom
Relocation
No