Site Reliability Engineering

The SRE team at Litmus is a hybrid of Site Reliability Engineering and Platform Engineering. We design and manage the core infrastructure of our AWS cloud platform, and we build abstractions and automation that let our engineering teams deploy and operate their software and services in production safely and reliably. Within SRE, we aim to design our infrastructure and implementations for the safety, contentment, knowledge, and freedom of both our peers and our customers.

The SRE team is involved in nearly every aspect of our systems, but we take primary responsibility for the low-level complexity of our AWS environments; for our build, continuous integration, and continuous deployment services; and for VPNs, Terraform automation, and a long list of other services that enable our high-velocity engineering organization. We work closely with engineering to continuously improve these abstractions and services.

Terraform, Packer, Ansible, Amazon Web Services, Docker, Octopus Deploy, Concourse, Golang

Our Values

The SRE team has a set of shared values that we have adopted over time as the team has formed. We practice these values and hold each other accountable for them.

Our Team

You're only as good as the company you keep. Our SRE team is a motley crew of talented engineers with a diverse set of skills and backgrounds. Interested in working with us? We're hiring!