Sr SRE/DevOps Engineer
Mad*** · United States · Remote
United States · Exp: 5+ yrs · 170,000-175,000 USD/yearly · Remote
Remuneration
170,000-175,000 USD/yearly
Location
United States · Remote
Eastern Daylight Time (UTC-4)
Visa sponsorship
Sponsors visa
Job summary
Mad*** is seeking a Senior SRE / AI Platform DevOps Engineer to build and scale infrastructure for AI-powered services, ensuring reliability, security, and performance while automating deployment workflows and monitoring frameworks.
Qualifications
- 5+ years in DevOps, Site Reliability Engineering, or related roles.
- Hands-on experience with cloud infrastructure, preferably AWS.
- Experience with CI/CD pipelines and automated workflows.
- Proficiency in infrastructure-as-code tools like Terraform.
- Experience with monitoring, alerting, and incident response.
- Scripting or programming skills in Python, Bash, or Go.
- Experience designing reliable and scalable infrastructure.
Responsibilities
- Design and manage cloud infrastructure for AI services.
- Automate environment setup across development and production.
- Build reusable infrastructure-as-code patterns.
- Ensure production systems are resilient and cost-efficient.
- Participate in on-call support and incident response.
- Maintain and optimize CI/CD pipelines.
- Implement automated testing and validation in workflows.
- Design safe deployment patterns and rollback mechanisms.
- Monitor AI service performance and reliability signals.
- Implement operational controls for AI systems.
- Design scalable telemetry pipelines for operational signals.
- Enable observability for AI services and orchestration.
- Implement intelligent monitoring and automated incident response.
- Define reliability standards for production systems.
Skills
AWS, Bash, CloudFormation, Datadog, Go, Grafana, New Relic, OpenTelemetry, Prometheus, Python, Splunk, Terraform
Relocation
No