Data Platform SRE, AI & Data Platforms (AiDP)
Apple · Austin, TX, United States
Austin, TX, United States · Exp: 3+ yrs · 147,400-220,900 USD/yearly · Remote
Remuneration
147,400-220,900 USD/yearly
Location
Austin, TX, United States
Visa sponsorship
Not specified
Job summary
The AI & Data Platforms (AiDP) team at Apple is seeking a Data Platform SRE to develop and operate large-scale big data platforms. This role involves optimizing performance and cost, automating operations, and resolving production errors to ensure a robust data platform experience for critical applications like analytics, reporting, and AI/ML apps.
Benefits
- Comprehensive medical and dental coverage
- Retirement benefits
- Discounted products and free services
- Reimbursement for certain educational expenses including tuition
Qualifications
- Expertise in designing, building, and operating critical, large-scale distributed systems with a focus on low latency, fault-tolerance, and high availability.
- Experience contributing to open source projects.
- Experience with infrastructure on multiple public clouds.
- Experience managing multi-tenant Kubernetes clusters at scale and debugging Kubernetes/Spark issues.
- Experience with workflow and data pipeline orchestration tools (e.g., Airflow, dbt).
- Understanding of data modeling and data warehousing concepts.
- Familiarity with the AI/ML stack, including GPUs, MLflow, or Large Language Models (LLMs).
- A learning mindset and a drive to continuously improve oneself, the team, and the organization.
- Solid understanding of software engineering best practices, including the full development lifecycle, secure coding, and experience building reusable frameworks or libraries.
- 3+ years of professional software engineering experience with large-scale big data platforms.
- Strong programming skills in Java, Scala, Python, or Go.
- Proven expertise in designing, building, and operating large-scale distributed data processing systems with a strong focus on Apache Spark.
- Hands-on experience with table formats and data lake technologies such as Apache Iceberg, ensuring scalability, reliability, and optimized query performance.
- Skilled at coding for distributed systems and developing resilient data pipelines.
- Strong background in incident management, including troubleshooting, root cause analysis, and performance optimization in complex production environments.
- Proficient with Unix/Linux systems and command-line tools for debugging and operational support.
Responsibilities
- Develop and operate large-scale big data platforms using open source and other solutions.
- Support critical applications including analytics, reporting, and AI/ML apps.
- Optimize platform performance and cost efficiency.
- Automate operational tasks for big data systems.
- Identify and resolve production errors and issues to ensure platform reliability and user experience.
Skills
Airflow, dbt, Go, Java, Kubernetes, Linux, Python, Scala, Spark
Languages
Java, Scala, Python, Go
Relocation
Yes