
Senior Reliability Engineer

RS21: A Data Science and Visualization Company · United States
Experience: 5+ years · Hybrid
Remuneration
145,000-175,000 USD per year
Location
United States
Visa sponsorship
Not specified

Job summary

The Senior Reliability Engineer at RS21 supports space systems programs by ensuring the reliability, deployment, and operational continuity of cloud and hybrid infrastructure for real-time satellite data processing, telemetry pipelines, and ML-driven anomaly detection systems. This role leads SRE and DevOps practices for assigned space programs: designing monitoring and alerting architecture, owning deployment pipelines, defining SLOs and error budgets, and partnering with software and data engineering teams to ensure systems are built to operate reliably from day one. The engineer understands the constraints of classified and ops-floor environments and applies them in architectural and operational decisions.

Qualifications

  • Bachelor's degree or equivalent experience in computer science, systems engineering, or a related technical field.
  • 5+ years of experience in site reliability engineering, DevOps, or cloud infrastructure roles.
  • At least 2 years of experience supporting operationally sensitive or regulated environments.
  • Deep experience with AWS services relevant to reliability, security, and operations: CloudWatch, CloudTrail, IAM, Lambda, ECS, EKS, Kinesis, MSK, and related services.
  • Strong proficiency with Docker and Kubernetes, including Helm chart development and cluster management.
  • Experience designing and maintaining CI/CD pipelines using GitHub Actions, GitLab CI, or Azure DevOps.
  • Solid understanding of infrastructure-as-code using Terraform, CDK, or equivalent.
  • Demonstrated experience with SLO definition, error budget management, and blameless post-mortem culture.
  • Familiarity with zero-trust architecture, STIG compliance, and FedRAMP requirements in cloud deployments.
  • Active security clearance or the ability to obtain one; Top Secret preferred.
  • AWS certifications: DevOps Engineer Professional, Solutions Architect Associate or Professional, Security Specialty, or Advanced Networking Specialty.
  • Experience supporting ATO processes, RMF documentation, or deployment into classified operational environments.
  • Background in DoD, Space Force, AFRL, or satellite operations environments.
  • Experience with real-time telemetry ingestion and streaming pipeline operations supporting ML inference.
  • Familiarity with MLOps practices and operational requirements of deployed ML anomaly detection systems.
  • CompTIA Security+ or CISSP certification, particularly for DoD 8570 compliance contexts.

Responsibilities

  • Define and maintain SLOs, SLAs, and error budgets for space systems platforms in collaboration with engineering and government stakeholders.
  • Lead incident response for operational platform failures, including triage, root cause analysis, blameless post-mortems, and corrective actions.
  • Architect and implement monitoring, alerting, and observability solutions using CloudWatch, CloudTrail, and custom telemetry pipelines.
  • Continuously improve system reliability through load testing, failure injection, chaos engineering practices, and proactive capacity planning.
  • Ensure operational requirements including latency, throughput, and sustainment are reflected in platform architecture and delivery plans.
  • Design, implement, and maintain cloud and hybrid deployment architectures for space systems platforms, including real-time ML inference pipelines, telemetry ingestion systems, and anomaly detection services.
  • Own the deployment pipeline for space systems software across AWS GovCloud and on-premise or edge-adjacent environments.
  • Architect containerized workloads using Docker and Kubernetes, including Helm chart development, cluster management, and workload scheduling.
  • Contribute to and enforce infrastructure-as-code practices using Terraform or CDK.
  • Support classified and operationally sensitive deployments, applying zero-trust architecture principles and STIG compliance requirements.
  • Lead security architecture reviews for cloud and hybrid infrastructure supporting DoD space programs, applying zero-trust principles and hardening against STIG and FedRAMP requirements.
  • Support ATO processes, RMF documentation, and accreditation activities.
  • Implement IAM policies, cross-account access controls, and audit logging architectures using AWS IAM, CloudTrail, and Macie.
  • Ensure all deployment environments maintain continuous compliance posture and flag deviations proactively.
  • Design, implement, and maintain CI/CD pipelines for space systems software using GitHub Actions, GitLab CI, or Azure DevOps.
  • Establish and enforce branching strategies, deployment promotion gates, and rollback procedures for operationally sensitive space environments.
  • Partner with software and data engineering teams to embed reliability and security practices into the development lifecycle.
  • Lead the adoption of DataOps and MLOps pipeline standards for ML-based anomaly detection and predictive maintenance systems.
  • Own the operational reliability of real-time data pipelines ingesting satellite telemetry, including Kinesis, MSK/Kafka, Lambda, and custom streaming architectures.
  • Monitor and optimize pipeline performance, latency, and throughput to meet real-time processing requirements.

Skills

  • AWS
  • Azure
  • Azure DevOps
  • AWS CDK
  • CloudWatch
  • Docker
  • ECS
  • EKS
  • GitHub
  • GitHub Actions
  • GitLab
  • GitLab CI
  • Helm
  • IAM
  • Kafka
  • Kinesis
  • Kubernetes
  • AWS Lambda
  • Amazon MSK
  • Terraform

Certifications

  • AWS DevOps Engineer Professional
  • AWS Solutions Architect Associate
  • AWS Solutions Architect Professional
  • AWS Security Specialty
  • AWS Advanced Networking Specialty
  • CompTIA Security+
  • CISSP

Degrees

  • Bachelor's degree in computer science
  • Bachelor's degree in systems engineering
  • Bachelor's degree in a related technical field

Industry

  • Space systems
  • Defense
  • Commercial satellite
  • DoD space systems

Security clearance

Active security clearance or ability to obtain one; Top Secret preferred

Relocation

No