Senior Cloud Platform & Site Reliability Engineering Lead
National Life Group · Addison, TX, United States
Addison, TX, United States · Exp: 8+ yrs · 136,875-200,750 USD/year · Onsite
Remuneration
136,875-200,750 USD/year
Location
Addison, TX, United States
Visa sponsorship
No visa sponsorship
Please note that we do not offer visa sponsorship for this position.
Job summary
The Senior Cloud Platform & Site Reliability Engineering Lead partners with business and technical stakeholders to lead cloud platform design, engineering, and integration efforts. This individual drives continuous improvement across infrastructure, platform services, and application reliability. This role is a senior technical leadership position within the National Life Group IT Shared Services team, representing cloud and platform engineering in project initiatives, architecture design, and operational excellence.
Benefits
Flexible and customizable benefits starting day one
Qualifications
- Bachelor's degree in Computer Science or relevant work experience
- 8+ years of hands-on experience with CI/CD pipelines, APM tools, source code management, infrastructure, cloud, or platform engineering
- 4+ years in a technical leadership role
- Open-minded with proven ability to work collaboratively
- Consistent positive attitude
- Demonstrated ability to lead in a highly matrixed, heavily outsourced organization through influence and partnership
- Ability and interest in mentoring technical staff and raising collective technical competencies
- Ability to manage change and operate effectively in complex environments
- Strong expertise in Microsoft Azure (compute, storage, networking, identity, monitoring)
- Strong expertise in Infrastructure as Code (Bicep, ARM)
- Strong expertise in Pipeline Technologies (Azure DevOps, GitHub Actions, CI/CD)
- Strong expertise in Code Repository Management (Git)
- Strong expertise in Monitoring & Instrumentation Tools (Azure Monitor, Application Insights, New Relic or similar)
- Strong expertise in Containers & Serverless (AKS, Azure Functions)
- Proficiency in scripting and automation (Python, PowerShell)
- Strong expertise in DevSecOps tools (SonarQube, JFrog, SwaggerHub or similar)
- Experience with AWS or other cloud platforms is a plus
- Exposure to AI technologies in development and application performance (AIOps)
Responsibilities
- Lead design and delivery of cloud-native solutions within Microsoft Azure
- Lead development and enhancement of CI/CD, SRE, and observability capabilities
- Drive infrastructure automation using Infrastructure as Code (Bicep, ARM)
- Serve as a subject matter expert in pipeline configuration, code repository management, and related controls
- Lead incident resolution and business issue remediation
- Lead the identification and management of capacity, performance, and risk
- Adhere to and promote the firm's Change Management and Problem Management policies (ITIL)
- Lead adoption of SRE practices including SLIs/SLOs and reliability engineering principles
- Drive improvements in system reliability, scalability, and performance
- Leverage Azure Monitor, Application Insights, and APM tools such as New Relic for observability
- Promote automation and AI-assisted operations to reduce manual effort and improve outcomes
- Drive innovation and remain current on Azure and emerging cloud technologies
- Define and promote Azure architecture standards (networking, identity, security)
- Lead adoption of cloud-native patterns (AKS, serverless, PaaS services)
- Collaborate with development, QA, and operations teams to integrate CI/CD and observability practices
- Communicate effectively with stakeholders across engineering and business teams
Skills
AKS, AWS, Azure, Azure DevOps, Azure Monitor, Bicep, Git, GitHub, GitHub Actions, Make, New Relic, PowerShell, Python, SonarQube, ARM Templates
Degrees
Bachelor's degree in Computer Science
Languages
Python, PowerShell
Work schedule
Onsite four days per week: Monday, Tuesday, Wednesday, Thursday
Industry
Insurance
Relocation
No