DevOps Engineer IV - 4P/702

4P Consulting Inc. · Atlanta, GA, United States
Onsite
Remuneration
Not specified
Location
Atlanta, GA, United States
Visa sponsorship
Not specified

Job summary

Seeking an experienced DevOps Engineer IV / Site Reliability Engineer (SRE) with strong hands-on experience in observability, telemetry, monitoring, and service reliability. The ideal candidate will have deep knowledge of Grafana, OpenTelemetry (OTel), PromQL, and application/system instrumentation. This role involves partnering with engineering, operations, and application teams to improve service reliability, telemetry quality, alerting maturity, and operational visibility.

Qualifications

  • Strong experience as a DevOps Engineer, Site Reliability Engineer, or Observability Engineer.
  • Hands-on experience with Grafana, OpenTelemetry, and PromQL.
  • Experience with application and system instrumentation.
  • Strong understanding of logs, metrics, traces, alerting, and service reliability.
  • Ability to design monitoring solutions across complex environments.
  • Strong troubleshooting, analytical, communication, and collaboration skills.
  • Experience with Prometheus, Loki, Tempo, Kubernetes, containers, cloud platforms, or microservices.
  • Familiarity with CI/CD, automation, infrastructure-as-code, incident response, SLIs, SLOs, and reliability metrics.

Responsibilities

  • Design, implement, and support monitoring and observability solutions.
  • Build dashboards, alerts, and telemetry solutions using Grafana and related tools.
  • Implement OpenTelemetry standards for application and system instrumentation.
  • Write and optimize PromQL queries for monitoring and reliability insights.
  • Improve alerting quality, reduce noise, and create actionable alerts.
  • Troubleshoot application and infrastructure issues using logs, metrics, and traces.
  • Support incident response, root cause analysis, and reliability improvements.
  • Collaborate with engineering, operations, and application teams.

Skills

Grafana, Kubernetes, Loki, OpenTelemetry, Prometheus, Tempo

Contract length

3 years

Relocation

No