Development and Operations (DevOps) Process Engineering - Senior Consultant

Deloitte · Toronto, ON, Canada
Toronto, ON, Canada · Exp: 5+ yrs · 80,000-138,000 CAD/yearly · Hybrid
Remuneration
80,000-138,000 CAD/yearly
Location
Toronto, ON, Canada
Visa sponsorship
Not specified

Job summary

As a Data and AI Platform Engineer, you will design, deploy, and operate adaptable, secure, and reliable platforms for data, analytics, and agentic AI tasks. You will focus on platform engineering, automation, and reliability, ensuring production-quality, governed, and performance-optimized platforms. This role involves working with modern data platforms in cloud and hybrid environments, implementing IaC tools, and managing CI/CD pipelines.

Benefits

  • Flexible, proactive, and practical benefits
  • Expert mentorship
  • On-the-job coaching
  • Paid time off
  • Mental health support benefits ($4,000 annually)
  • Flexible spending account ($1,300)
  • Company-wide closures (Deloitte Days)
  • Dedicated learning days (Development and Innovation Days)
  • Flexible work arrangements
  • Hybrid work structure

Qualifications

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
  • Minimum of five years of experience in platform engineering, cloud engineering, DevOps, or Site Reliability Engineering.
  • Strong experience in deploying and operating data platforms, including Snowflake, Databricks, and/or SAS Viya.
  • In-depth expertise in Infrastructure as Code (IaC) and automated provisioning; experience with Terraform is an asset.
  • Experience with containerization and orchestration technologies (Docker, Kubernetes).
  • Hands-on experience with CI/CD tools (Azure DevOps, GitHub Actions, Jenkins) and DevOps practices.
  • Excellent understanding of cloud platforms (Azure, AWS), including networking, identity and access management, storage, and security services.
  • Experience implementing Policy as Code (PaC) frameworks, governance frameworks, and compliance standards.
  • Experience designing high availability, resilience, disaster recovery, and fault tolerance strategies.
  • Excellent knowledge of observability practices, including logging, tracing, and alerting frameworks.
  • Experience with performance improvement, scalability optimization, and cost management in cloud environments.
  • Excellent system engineering and troubleshooting skills in distributed systems.
  • Experience working in agile and cross-functional teams to build production-quality platforms.
  • Proficiency in French and English (written and spoken).
  • Relevant certifications (Azure, AWS, Kubernetes, Terraform, Snowflake, Databricks) are an asset.

Responsibilities

  • Design and deploy modern data platforms in cloud and hybrid environments, including Snowflake, Databricks, and SAS Viya.
  • Engineer platform infrastructure using Infrastructure as Code (IaC) tools like Terraform for reproducible and adaptable deployments.
  • Create and manage Continuous Integration/Continuous Deployment (CI/CD) pipelines to automate platform provisioning, testing, releases, and upgrades using tools like Azure DevOps, GitHub Actions, or Jenkins.
  • Deploy, manage, and scale Kubernetes environments to support distributed data and AI workloads.
  • Define and implement Policy as Code (PaC) frameworks for governance, compliance, and security standards across environments.
  • Design and implement secure cloud architectures, including identity and access management, networking, encryption, secret management, and data access controls.
  • Develop high availability and disaster recovery strategies, including multi-region deployments, failover mechanisms, and backup strategies.
  • Optimize platform performance, scalability, and cost-effectiveness through monitoring, tuning, and scaling strategies.
  • Implement observability frameworks (logs, traces, alerts) using tools like Prometheus, Grafana, Azure Monitor, or Datadog.
  • Operate and maintain production platforms, including upgrades, patching, incident response, and root cause analysis.
  • Implement Site Reliability Engineering (SRE) principles, including Service Level Objectives (SLOs), Service Level Indicators (SLIs), error budgets, automation, and reliability improvements.
  • Collaborate with data and AI engineers, architects, and business stakeholders to ensure platforms meet organizational needs.
  • Support agentic AI and advanced analytics workloads by ensuring scalable, secure, and performant infrastructure foundations.
  • Promote adoption of engineering best practices, including automation, standardization, and reproducible deployment patterns.

Skills

AWS, Azure, Azure DevOps, Azure Monitor, Databricks, Datadog, Docker, GitHub, GitHub Actions, Grafana, Jenkins, Kubernetes, Prometheus, Snowflake, Terraform

Certifications

Azure, AWS, Kubernetes, Terraform, Snowflake, Databricks

Degrees

  • Bachelor's degree in Computer Science
  • Bachelor's degree in Engineering
  • Master's degree in Computer Science
  • Master's degree in Engineering

Relocation

No