Senior Site Reliability Engineer (SRE)

Posted Jan 13

We are looking for an experienced, self-motivated, highly productive Site Reliability Engineer (SRE) to build and scale services in a cloud environment within our Infrastructure team.

Requirements

Key Responsibilities

  • Building, deploying, improving, and maintaining infrastructure (on-premises and in Azure)
  • Managing operations and tooling around compute infrastructure
  • Building and optimizing monitoring and alerting to maintain customer SLAs and meet RTO/RPO targets
  • Managing operations for additional infrastructure components such as monitoring, alerting, and databases
  • Building tools and automation
  • Being on call, responding to incidents, and conducting root-cause analysis on customer-impacting issues
  • Defining and managing SLOs, SLIs, and error budgets
  • Leading new projects and initiatives around site reliability, developing or finding net-new solutions, evaluating products, and leading discussions on technology topics
  • Mentoring team members, showing thought leadership, and helping educate the team on best practices
  • Remote desktop/laptop assistance may be required from time to time
  • Customer interaction may be required from time to time
  • Other duties as assigned

Candidate Criteria

Required Experience

  • A B.S. degree (or equivalent) in Computer Science, or a minimum of 7 years’ experience with infrastructure as code and deployment systems, along with experience writing automation in a modern programming language
  • Experience with monitoring, metrics, and logs
  • Experience with cloud computing (Azure)
  • Understanding of distributed systems and their commonly associated problems
  • Experience with CI/CD systems (Preferred)
  • Experience writing infrastructure as code (Terraform, Ansible, Puppet, etc.) (Preferred)
  • Experience working with containers and Kubernetes (Preferred)
  • Experience using enterprise system monitoring tools (e.g., PRTG, Elasticsearch) (Preferred)

Critical Skills & Qualifications

  • Strong networking fundamentals
  • Belief in solving problems through automation
  • Strong communication & analytical skills
  • Curiosity, adaptability, and a willingness to learn
  • Experience with managing measurable goals and metrics
  • Previous experience in remote work settings preferred

Benefits

Salary: $100k/year

Payment: Monthly

Paid days off: 10/year

A 50% overlap with Eastern Time (ET) is mandatory