Site Reliability Engineer / Performance Engineer
Visa · Cambridge, ENG, United Kingdom
Cambridge, ENG, United Kingdom · Remote
Remuneration
Not specified
Location
Cambridge, ENG, United Kingdom
Visa sponsorship
Not specified
Job summary
Visa is seeking a Systems Performance Engineer to optimize its real-time Financial Fraud Detection platform. The role involves tuning JVM, distributed systems, and AWS infrastructure to achieve ambitious performance and scalability goals. The engineer will participate in daily stand-ups, lead analysis of Engineering Triage tickets, and provide data-driven recommendations to development teams.
Qualifications
- Strong understanding of Linux internals, shell scripting, and system-level diagnostics.
- Foundational JVM knowledge (memory, threads, GC basics, configuration impacts).
- Experience interpreting logs, metrics, dashboards, and system graphs.
- Ability to understand and interpret test results, workload outputs, and benchmark data.
- Proven ability to troubleshoot complex technical problems by researching, analyzing, and synthesizing information.
- Demonstrated use of a home lab or personal systems projects (VMs, clusters, network setups, custom builds).
- Strong analytical and critical-thinking skills with high attention to detail.
- Ability to manage workload independently and drive investigations to completion.
- Familiarity with common tooling for monitoring, metrics, or system insights.
- Database knowledge (relational or NoSQL).
- Deeper JDK/JVM tuning or diagnostic experience.
- Basic Kubernetes setup or cluster experimentation experience.
- Understanding of networking fundamentals.
- Experience with open-source tools for diagnostics, observability, or system performance.
- Basic programming ability (any language) to support automation or small tooling improvements.
- Exposure to distributed systems concepts or cloud environments.
Responsibilities
- Participate in daily stand-ups and team meetings to synchronize activities and align investigations with priorities.
- Lead analysis of Engineering Triage tickets, diagnose customer issues, and identify root causes.
- Interpret logs, metrics, and customer data to understand system behavior and identify misconfigurations or bottlenecks.
- Set up, run, and evaluate Proofs of Concept (POCs) to validate configuration changes, new tools, or approaches.
- Analyze graphs, resource-usage patterns, and system-level data to draw evidence-based conclusions.
- Tune and enhance system configurations to improve reliability, scalability, and efficiency.
- Provide structured, data-driven recommendations to development teams.
- Investigate system-level issues in collaboration with engineering, support, and operations stakeholders.
- Conduct exploratory research, test hypotheses, and leverage resources to resolve complex problems.
- Contribute to continuous improvement of internal diagnostics, tooling, analysis processes, and investigation playbooks.
- Investigate and analyze issues using system-level data, metrics, logs, and customer information.
- Interpret graphs, measurements, and test results to identify patterns, anomalies, or bottlenecks.
- Set up, run, and compare POCs in controlled environments to validate hypotheses and guide technical decisions.
- Apply Linux and shell scripting knowledge to gather data, run tests, and automate analysis.
- Use JVM knowledge to understand system behavior, resource utilization, and configuration impacts.
- Tune and enhance current system setup by recommending configuration, architectural, or operational adjustments.
- Research complex technical problems independently and propose practical, evidence-based solutions.
- Document findings clearly and communicate technical explanations to developers, support engineers, and stakeholders.
- Drive clarity on ambiguous or customer-reported system issues through high-quality investigations.
- Provide insights that directly influence system stability, performance, and customer satisfaction.
Skills
AWS, Bash, Kubernetes, Linux
Industry
Payments technology, Financial Fraud Detection
Relocation
No