Site Reliability Engineer 2
As a Site Reliability Engineer on the Conversica team, you will use software and systems engineering to build resilient production systems. We use common DevOps tools such as Terraform, GitLab, GitLab Pipelines, and Kubernetes (EKS), with AWS as our hosting platform. We're looking for a technically curious Site Reliability Engineer who thrives on driving innovation and best practices in a fast-paced, highly dynamic environment. You should be a strong multitasker and a highly collaborative teammate. The ideal candidate has a strong passion for technology, continuous improvement, stability, and transforming platforms. This position is remote.
Responsibilities
- Engage in and improve the life cycle of services—from inception and design, through deployment, operation, and refinement over time
- Engage in and help continuously improve our team processes, including our DevSecOps team working agreement
- Support services before launch through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and deployment review
- Participate in an on-call rotation with a sense of urgency and contribute to the continuous improvement of on-call (for example, reducing after-hours alerts)
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health against our SLAs
- Practice sustainable incident response and participate in our blameless postmortem process
- Develop and maintain effective instrumentation for monitoring tools and dashboards
- Help improve and maintain high service uptime on AWS while developing and evangelizing company-wide standards for services
- Build and scale highly available, distributed services with a high quality of service for customers
- Assist in troubleshooting failures and performance issues across all services, while suggesting and applying preventive measures
- Maintain infrastructure owned by the DevSecOps team, such as EKS clusters and RDS Aurora DB clusters
- Support Conversica development teams so they can focus on roadmap initiatives
Qualifications
- BS degree in Computer Science, Engineering, or a related technical field involving coding, or equivalent practical experience
- 3+ years of experience managing distributed SaaS systems in public and private cloud environments on AWS
- 3+ years of experience with Kubernetes and Docker environments
- 3+ years of experience with Unix / Linux system administration
- 3+ years of practical experience building continuous delivery pipelines
- Familiarity with at least one of the following: Python, Go, PHP, Ruby, Java, C, C++
- Familiarity with algorithms, data structures, complexity analysis, and software design
- Interest in designing, analyzing, and troubleshooting large-scale distributed systems
- A systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
- Highly analytical and detail-oriented, with the ability to work through complex logic to debug and optimize code and to automate routine tasks while working under pressure to meet tight deadlines
- Experience with MySQL, Amazon Aurora, or other relational databases
- Experience supporting applications according to privacy-by-design and security-by-design principles
- Experience with infrastructure-as-code or configuration management tools such as Terraform, Ansible, or Puppet is required
- Experience with full-stack development is preferred
- Bring a growth mindset, customer orientation, and a bias for automation
- Strong team communication and collaboration skills are critical for this role
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $110,000/year to $145,000/year. Pay is based on a number of factors including market location and job-related knowledge, skills, and experience.
Conversica offers comprehensive health, dental, and vision benefits, flex time PTO, a 401(k) with company match, and equity. Further details can be provided upon request.