Site Reliability Engineer

PayPal · Scottsdale, AZ, United States

Scottsdale, AZ, United States · Exp: 5+ yrs · 125,029-151,000 USD/yearly · Hybrid

Remuneration

125,029-151,000 USD/yearly

Location

Scottsdale, AZ, United States

Visa sponsorship

No visa sponsorship

Must be legally authorized to work in the U.S. without sponsorship

Job summary

PayPal is seeking a Site Reliability Engineer in Scottsdale, AZ to monitor and analyze system metrics, diagnose and resolve complex system issues, and create automation scripts to enhance system reliability. The role involves designing and implementing highly reliable, fault-tolerant systems and partnering with various teams to improve system architecture.

Benefits

Annual performance bonus
Equity
Incentive compensation
Generous paid time off
Healthcare coverage
Financial security resources
Mental health support

Qualifications

Bachelor’s degree, or foreign equivalent, in Computer Science, Engineering, or a closely related field
Five years of experience in site reliability engineering, systems administration, or a related technical field
Five years of experience in Production Support
Five years of experience in Incident Management
Five years of experience in System and Application Monitoring (Datadog, Splunk)
Five years of experience in Batch Systems and Tools (Control-M)
Five years of experience in System Administration
Five years of experience in Configuration Management
Five years of experience in Infrastructure Management
Five years of experience in UNIX/Linux Administration
Three years of experience in Technical Documentation (Confluence)
Three years of experience in Programming/Scripting (Shell, Perl, Python)
Three years of experience in Database Technologies (Oracle, PostgreSQL)
Two years of experience in Cloud Technologies (AWS, Google Cloud)
Two years of experience in Containerization Tools and Technologies (Kubernetes, Docker)

Responsibilities

Monitor and analyze system metrics to ensure optimal availability, performance, and reliability of digital platforms and applications
Continuously assess system health indicators, performance benchmarks, and service-level objectives
Diagnose and resolve complex system issues through systematic troubleshooting methodologies
Perform comprehensive root cause analysis of system failures and implement sustainable long-term solutions
Create and maintain automation scripts, tools, and operational processes to streamline operations and enhance system reliability
Configure, maintain, and continuously improve monitoring and alerting systems to provide actionable insights
Analyze system usage patterns and trends to forecast future resource requirements and ensure system scalability
Implement preventive measures to avoid capacity-related performance degradation or service interruptions
Design and implement highly reliable, fault-tolerant systems incorporating industry best practices for high availability and disaster recovery
Ensure reliable software releases and perform systematic failure simulations to identify system weaknesses
Develop and maintain comprehensive documentation for system configurations, operational procedures, and incident handling workflows
Partner with development, operations, and product teams to enhance system architecture for improved performance, reliability, and scalability

Skills

AWS
Bash
Confluence
Datadog
Docker
Kubernetes
Linux
Perl
PostgreSQL
Python
Splunk
GCP

Degrees

Bachelor’s degree in Computer Science, Engineering, or closely related field

Work schedule

40 hours per week
Monday to Friday, 9:00 a.m. to 5:00 p.m.
