Senior Site Reliability Engineer

Posted Sep 28

Who we are

Jellyvision ALEX® is on a mission to improve lives by helping people choose and use their benefits. We are raising the bar for benefits and the employee experience (for our employees and those of the customers we serve) by scaling personalization, compassion, and an earnest intent to be helpful in all that we do.

Jellyvision people are a group of creative problem solvers who use good judgment, give each other honest feedback, engage in real debate, and snack frequently. We are curious, hungry, and humble—because we know this is how we’ll continue to make an impact. We’re kind, biased towards action, and sweat the details to create great experiences for those we serve.

We are an inclusive, human-first workplace. Respect and trust for each other are foundational, and our equitable total rewards offerings support the lives and holistic well-being of our unique people. At Jellyvision, expect career experiences that challenge you, empower you to have a direct impact on our mission, and enable you to learn, try, and do while having fun along the way.

What’s the role?

On any given day as a Senior Site Reliability Engineer, you will be busy doing everything from writing application code and managing infrastructure-as-code to architecting cloud applications for high availability. You will be expected to interface frequently with application development and data teams, helping to troubleshoot issues, develop pipeline features, and review merge requests. Expect to be involved in alert and incident remediation both during business hours and after hours when on call. As a Senior SRE, you will need to be knowledgeable, patient, and prompt while always seeking to be helpful not only to your team but to all of the Jellyvision teams.

What you’ll do to be successful

1. Design Applications

Advise application development teams on best practices for building modern, containerized, or cloud-based applications and infrastructure.
Architect applications using industry best practices and data-driven decision-making to ensure optimal performance and scalability.
We will measure dev cycle time, code quality, application performance, and cost efficiency.

2. Optimize & Design CI/CD Pipelines

Develop and deliver strategic guidance on best practices for CI/CD pipeline implementation and optimization.
Ensure pipelines and deployment processes are streamlined, minimizing manual intervention and repetitive tasks.
Uphold infrastructure and code security standards by implementing automated testing protocols.
Optimize pipeline performance to achieve faster and more efficient execution times.
Continuously identify process improvement opportunities and plan initiatives to enhance operational efficiency.
We will measure pipeline execution time, automated testing coverage, security vulnerabilities detected, pipeline success rate, and manual intervention incidents.

3. Monitor & Support Systems

Efficiently resolve alerts and incidents with a calm, methodical approach to ensure timely resolution.
Participate in the on-call rotation to triage and address incidents outside of regular business hours.
Provide comprehensive support to development teams in managing their application ecosystems and CI/CD pipelines.
We will measure mean time to resolution (MTTR), incident resolution rate, on-call response time, and support request resolution time.

4. Serve as a Mentor

Serve as a positive and supportive team member, offering guidance and expertise to both development teams and fellow SRE colleagues.
Proactively seek opportunities to enhance technical knowledge and skills through continuous professional development.
Deliver constructive and actionable feedback on merge requests to ensure code quality and best practices.
Lead by example by upholding and adhering to established standards and practices, demonstrating accountability and integrity.
We will measure peer feedback scores, collaboration frequency, merge request review quality, and self-learning hours.

Experience & skills you’ll need

Demonstrated experience with cloud computing platforms, particularly AWS, and the ability to manage and scale applications effectively.
Proficient in multiple programming languages, including Ruby, Python, and JavaScript.
Experienced in using configuration management and infrastructure-as-code tools such as Ansible, Packer, CloudFormation, and Terraform, with a strong emphasis on Terraform.
Skilled in container technologies and orchestration tools, including Docker, ECS, and Kubernetes.
Experienced with continuous integration tools, such as GitLab, GitHub, and Jenkins.
Well-versed in best practices for monitoring and alerting to ensure system reliability and performance.
Exceptional communication skills, with the ability to clearly and effectively convey information to various stakeholders.
Strong aptitude for data-driven decision-making, leveraging analytical insights to guide strategic choices.

Core Competencies

Courage
Decision Quality
Persuades
Manages Ambiguity
Manages Complexity
Optimizes Work Processes
Tech Savvy

The Details

Location: Remote
Starting Salary: $145,000 - $175,000

What Jellyvision will give you

Check out our benefits here!

Jellyvision is committed to continuous evolution and fostering a more diverse and inclusive workplace where everyone is welcomed, valued, and respected. It doesn’t matter your race, ethnicity, religion, age, disability, sexual orientation, gender, gender identity/expression, country of origin, genetic information, veteran status, marital status, pregnancy or related condition (including breastfeeding), criminal histories consistent with legal requirements or any other basis protected by law...we just want amazing people who are willing to grow along with us.

Although we have a Chicago-based HQ that employees are welcome to work out of whether they’re local or just visiting, this position is also eligible for work by a remote employee out of CA, CO, CT, FL, GA, IL, IN, KY, MA, MI, MN, NC, NJ, NY, OH, OR, PA, SC, TN, TX, UT, VA, WA or WI.