Staff Data Platform Engineer

Wayve · London, ENG, United Kingdom
London, ENG, United Kingdom · Full time · Hybrid
Remuneration
Not specified
Location
London, ENG, United Kingdom
Visa sponsorship
Not specified

Job summary

Wayve is seeking a Staff Data Platform Engineer to lead the Data Transfer Hub project, a critical infrastructure initiative. This role involves designing and building a globally distributed hub-and-spoke ingestion system for petabytes of sensor data from partners. The engineer will drive technical direction and delivery, enabling Wayve's AI model training at scale.

Qualifications

  • Proven experience building and operating large-scale data transfer pipelines at multi-TB or PB scale.
  • Experience orchestrating parallelized transfer jobs, preferably with Flyte, or with other workflow orchestration tools such as Airflow or Prefect.
  • Experience building reliable, observable data systems, including retry strategies, backfills, data integrity checks, lifecycle/state tracking, and operational alerting.
  • Strong cloud platform skills, with Azure preferred; experience with multiple cloud providers (AWS/GCP) is an advantage.
  • Proficiency in Python and Terraform.
  • Cloud-to-cloud and networking experience, including blob/object transfer and storage at scale, and cross-cloud data movement.
  • A product mindset, focusing on customer needs and solving real problems.
  • Comfort working in a fast-changing environment with evolving requirements.
  • High degree of autonomy and ownership, taking responsibility for outcomes.

Responsibilities

  • Set technical direction for the Data Transfer Hub, considering reliability, throughput, cost, partner constraints, observability, and operational support.
  • Lead technical design and delivery of the Data Transfer Hub project, ingesting large volumes of video, LiDAR, and sensor data from global partners at petabyte scale.
  • Design and build parallelized, distributed data transfer pipelines using Flyte for workflow orchestration, Kafka for event-driven lifecycle tracing, and Azure for storage and transfer infrastructure.
  • Build and operate a globally distributed hub-and-spoke data transfer model to retrieve and share sensor data at scale with partners.
  • Write infrastructure-as-code in Terraform and build pipeline logic primarily in Python.
  • Drive cloud-to-cloud and cross-cloud networking solutions, with Azure as the primary platform.
  • Work comfortably with ambiguity in a fast-changing environment with evolving requirements.

Skills

Airflow, AWS, Azure, GCP, Kafka, Make, Python, Terraform

Languages

Python

Work schedule

Core working hours

Industry

Autonomous vehicles, Robotics

Relocation

No