Site Reliability Engineering
We are looking for full-time software engineers to build reliable software infrastructure for our system. You will work on end-to-end automation, including testing, model training, and scaling of web deployments, so that our system has a robust base from which to grow and serve users.
Useful Experience
- Scalable web applications
- High-performance computing
- Machine learning infrastructure
- AWS, GitLab CI, Terraform, Kubernetes
Responsibilities
- Create automated test and deployment pipelines
- Ensure observability of components
- Keep the infrastructure prepared to scale
- Ensure data persistence and recovery