Site Reliability Engineer

Broadridge · Toronto, ON, Canada
Toronto, ON, Canada · Exp: 7+ yrs · 100,000-125,000 CAD/yearly · Hybrid
Remuneration
100,000-125,000 CAD/yearly
Location
Toronto, ON, Canada
Visa sponsorship
Not specified

Job summary

As a Site Reliability Engineer, you will be responsible for the availability, performance, security, and scalability of our infrastructure and applications. You will work closely with development and operations teams to streamline the software development lifecycle, automate processes, and keep systems reliable and scalable. This role offers the opportunity to work on a wide range of projects and to help drive SRE practices across the organization.

Qualifications

  • Experience with monitoring, logging, and alerting tools.
  • Proficiency in automation tools for infrastructure and pipelines.
  • Strong hands-on experience with Linux system administration, including performance tuning, troubleshooting, and security hardening.
  • 7+ years of experience in a DevOps, SRE, or similar role.
  • Proficient with Docker and Kubernetes, including managing clusters in production environments.
  • Expertise in implementing Infrastructure as Code best practices.
  • Strong scripting skills in languages such as Bash, Python, or Go.
  • Expertise in Git for version control and code collaboration.
  • Experience in setting up and managing CI/CD pipelines.
  • Solid understanding of Agile methodologies and DevOps practices.
  • A cloud certification such as AWS Certified DevOps Engineer or DevOps Expert is preferred.
  • Understanding of security best practices in cloud environments, including vulnerability management, firewall management, and encryption.
  • Familiarity with networking concepts such as VPC, VPN, and load balancing.
  • Understanding of microservices architecture and deploying microservices at scale.

Responsibilities

  • Develop and enhance monitoring systems and lead incident response for production outages, including root cause analysis and prevention.
  • Design and maintain scalable infrastructure using Infrastructure as Code tools.
  • Ensure the stability, performance, and scalability of Linux-based infrastructure and services, leveraging SRE practices to achieve reliability targets.
  • Build, manage, and maintain CI/CD pipelines to automate code deployment and testing.
  • Develop and implement scripts and tooling to automate repetitive operational tasks.
  • Collaborate with security teams to ensure DevOps practices adhere to security and compliance standards.
  • Work closely with development, operations, and QA teams to foster a DevOps culture and contribute to a robust engineering process.
  • Design and manage infrastructure across multiple cloud environments.
  • Identify and resolve performance bottlenecks in systems and applications.

Skills

Ansible, AWS, Azure, Bash, Chef, CircleCI, CloudFormation, Datadog, Docker, GCP, Git, GitLab, GitLab CI, Go, Grafana, Jenkins, Kubernetes, Linux, Prometheus, Python, Terraform

Certifications

AWS Certified DevOps Engineer, DevOps Expert

Languages

Bash, Python, Go

Relocation

No