Sr. SRE/DevOps Engineer

Mad*** · United States · Remote
United States · Exp: 5+ yrs · 170,000-175,000 USD/yearly · Remote
Remuneration
170,000-175,000 USD/yearly
Location
United States · Remote
Eastern Daylight Time (UTC-4)
Visa sponsorship
Sponsors visa

Job summary

Mad*** is seeking a Senior SRE / AI Platform DevOps Engineer to build and scale infrastructure for AI-powered services, ensuring reliability, security, and performance while automating deployment workflows and monitoring frameworks.

Qualifications

  • 5+ years in DevOps, Site Reliability Engineering, or related roles.
  • Hands-on experience with cloud infrastructure, preferably AWS.
  • Experience with CI/CD pipelines and automated workflows.
  • Proficiency in infrastructure-as-code tools like Terraform.
  • Experience with monitoring, alerting, and incident response.
  • Scripting or programming skills in Python, Bash, or Go.
  • Experience designing reliable and scalable infrastructure.

Responsibilities

  • Design and manage cloud infrastructure for AI services.
  • Automate environment setup across development and production.
  • Build reusable infrastructure-as-code patterns.
  • Ensure production systems are resilient and cost-efficient.
  • Participate in on-call support and incident response.
  • Maintain and optimize CI/CD pipelines.
  • Implement automated testing and validation in workflows.
  • Design safe deployment patterns and rollback mechanisms.
  • Monitor AI service performance and reliability signals.
  • Implement operational controls for AI systems.
  • Design scalable telemetry pipelines for operational signals.
  • Enable observability for AI services and orchestration.
  • Implement intelligent monitoring and automated incident response.
  • Define reliability standards for production systems.

Skills

AWS, Bash, CloudFormation, Datadog, Go, Grafana, New Relic, OpenTelemetry, Prometheus, Python, Splunk, Terraform

Relocation

No