Site Reliability Engineer

Posted Apr 5

Description

Planet Argon provides dependable support and maintenance of existing Ruby on Rails apps for a variety of clients in different industries. We take care of small feature updates, bug fixes, and performance improvements.

We are currently looking for an experienced Site Reliability Engineer who can provide part-time (15-20 hours/week) support for our clients and our team on a contract basis. The relationship would start as a three-month contract engagement; should both parties want to continue working together after those 90 days, we can agree to either a longer-term contract or a month-to-month arrangement.

Learn more about Planet Argon here

We're looking for contractors who embody our core values:

  • PROACTIVE - We actively seek opportunities to improve our clients’ products, our processes, and our abilities.
  • CURIOUS - A natural curiosity for the undiscovered results in remarkable work for our clients – and stronger connections for our team. We ask questions, learn, and aren't afraid to fail.
  • DEPENDABLE - We are invested in our work. We manage expectations. We support our clients and teammates. We hold ourselves, our teammates, and our clients accountable.
  • VERSATILE - We readily adapt to change and encourage innovation because our team and work are transparent and flexible.
  • DELIGHTFUL - We choose to set a mindful, positive tone that allows everyone to flourish.

Requirements

As our Part-Time Contract Site Reliability Engineer, you will play a pivotal role in maintaining the resilience and optimal performance of our clients’ software systems. Your responsibilities will include responding to and resolving application outages and other incidents, identifying their root causes, and devising robust solutions to prevent their recurrence. You will assist team members in interpreting system monitoring alerts and messages, and actively participate in retrospectives to analyze incidents and develop strategies to avoid future issues. Additionally, you will integrate monitoring tools into client applications, apply patches and system upgrades across various client environments, and debug build failures in CI/CD pipelines.

To excel in this role, you should have experience in identifying and addressing scalability issues related to system architecture and in pinpointing security vulnerabilities in servers. Proficiency in scripting and automation with languages like Python or Bash is essential, as is experience with infrastructure-as-code and configuration management tools such as Terraform, Ansible, or Chef. A strong understanding of Linux/Unix systems and networking fundamentals, along with experience with cloud platforms like AWS or Google Cloud, is crucial. You should also be adept at auditing third-party services and tools for efficiency and cost-effectiveness. While not mandatory, familiarity with Ruby on Rails will be considered an advantage.

Role responsibilities will include:

  • Responding to application outages and other incidents, identifying root causes, and implementing solutions to prevent recurrence 
  • Helping other team members understand system monitoring alerts and messages
  • Taking part in retrospectives after incidents to document what went wrong and plan how best to avoid those situations in the future
  • Integrating monitoring tools into client applications
  • Patching and applying system upgrades across different client systems
  • Debugging build failures in CI/CD pipelines
  • Managing database migrations
  • Managing deployments and CI/CD pipelines across multiple projects for various clients

The ideal contractor has an understanding of and experience with:

  • identifying and solving scalability issues having to do with poor system architecture
  • identifying security vulnerabilities in servers
  • scripting and automation using languages such as Python or Bash
  • containerization tools like Docker
  • infrastructure-as-code and configuration management tools like Terraform, Ansible, or Chef
  • Linux/Unix systems and networking fundamentals
  • cloud platforms such as AWS or Google Cloud
  • auditing third-party services and tools for efficiency and cost-effectiveness 
  • Pingdom, Bugsnag, Rollbar, New Relic, Honeycomb, and other monitoring tools
  • RSpec, Test Unit, Cypress, and other testing tools
  • SQL databases
  • CI/CD tools such as CircleCI, GitHub Actions, GoCD, Jenkins, and Travis CI
  • Ruby on Rails (a plus, not a requirement)

Availability and Location

The ideal contractor is comfortable working remotely from a time zone within 1-2 hours of EST.

South and Latin American contractors are highly encouraged to apply.

This is a remote part-time contract position. The ideal candidate will be available for periodic meetings during EST business hours.

Benefits

Our ideal contractor has an hourly rate of $45.00 - $65.00 USD.

We address monthly or bi-weekly invoices within 10 business days.

If you're passionate about ensuring the reliability and availability of high-profile applications and working with a team of skilled developers, we'd love to hear from you!