Senior SRE/DevOps Engineer

Posted Apr 7

If this role seems interesting, irrespective of your location or identities, please reach out.

Even if you don't think you meet all of the criteria but are still interested in the job, please apply. Nobody checks every box, and we're looking for someone excited to join our team. We'd love to hear from you.

Metabase is the easiest way for people to get insights from their data, from tiny startups who get up and running quickly to major corporations with tens of thousands of users. That's why people love us.

We bring data tools with the elegance and simplicity of consumer products to the crufty world of enterprise business intelligence. We provide an opinionated open source starting point for how companies should measure, analyze, and share their data.

Tens of thousands of companies use Metabase every day to answer questions about their data. While we seek to become the de facto self-managed open source analytics software for organizations everywhere, many customers want to use Metabase without worrying about the operational details of self-hosting. That’s why we recently launched our Metabase Cloud product. We’re looking for operations engineers to help build out and run our new and quickly growing ‘Metabase Cloud’ hosted product.

You will:

Own and operate our application stack and AWS infrastructure to orchestrate and manage our hosted customer instances of Metabase
Debug runtime issues across the different levels of our application and hosting stacks
Develop and build our internal tooling and automation to manage the lifecycle of a hosted Metabase installation, from purchase to deployment, zero-downtime upgrades, and general operational health
Continuously improve our automated deployments and testing

We're looking for someone who:

Is thoughtful and careful
Compulsively automates everything and documents it
Is able to make solid technical judgements and back them up articulately
Has at least 5 years of experience building and operating production infrastructure, ideally on public cloud
Has strong Kubernetes and AWS experience
Has strong experience with IaC and Terraform
Can write high-quality, readable code in a modern language (e.g., Python or Go)
Has experience with modern monitoring stacks (e.g., Prometheus, Grafana, or Datadog)

Projects you could work on:

Multi-region hosting
Automate EKS cluster provisioning
Extend our CRDs and Operators
Improve the RDS sharding strategy for our multi-tenant platform
Unify and improve our CI/CD platforms
Collaborate with core application developers on changes to improve our application metrics, deployment speeds, and CI integration
Maintain our SOC 2 compliance and security posture

We're a global team (50% outside the US), fully distributed (from Thailand to California), who get things done asynchronously, with plenty of uninterrupted time, supporting each other to do the best work of our careers. We offer flexibility (define your own schedule and work from wherever you want), autonomy, and an environment that fosters growth, learning, and development. We're relentlessly user-focused and believe in building long-term value, not short-term hacks. And we raised a $30M Series B to take our approach to the next level for years to come.