IT Site Reliability Engineer
GitLab
1. A series of video calls
2. Coding exercise involving working on a Merge Request that is like a real work task
The GitLab DevOps platform empowers 100,000+ organizations to deliver software faster and more efficiently. We are one of the world’s largest all-remote companies with 1,800+ team members and values that guide a culture where people embrace the belief that everyone can contribute.
At GitLab, the IT Infrastructure team is responsible for Site Reliability Engineering for our tech stack applications and cloud infrastructure that supports corporate initiatives across many of our departments. In addition to traditional AWS and GCP administration, we also provide escalation engineering support for departments that manage their respective SaaS tech stack applications (vendor-hosted). Another of our functions is to provide DevOps Engineering for several internally built applications that power our business operations and automation.
The IT team collaborates closely with the Engineering Infrastructure Reliability team that is responsible for our GitLab.com SaaS platform (our product infrastructure). The IT, Engineering, and Infrastructure Security teams collaborate to architect, implement, and manage our AWS and GCP infrastructure policies and collectively manage all related services.
- Lead the handling of the ticket queue (GitLab issues) for AWS and GCP corporate infrastructure requests from team members. This ranges from simple IAM and DNS requests to designing and deploying new scalable application infrastructure.
- Design, build, and maintain core infrastructure that enables GitLab to scale to support 2,000+ team members and the applications and services that they use day-to-day.
- Implement and maintain system logging and monitoring to alert on problems and prevent outages, and get ahead of customer needs.
- Maintain the corporate AWS and GCP infrastructure utilizing Ansible, Terraform, GitLab CI/CD, and Kubernetes
- Gather and analyze operating system and application metrics to assist in performance tuning and fault finding
- Create sustainable systems and services through patching, automation, and upgrades
- Document every action so your findings turn into repeatable actions and then into automation.
- Provide mentorship to IT System Administrators and IT Analysts who have an interest in infrastructure and IaC.
- Collaborate with other teams to improve services and help with system design, platform management, and capacity planning
- AWS and GCP - At least 2 years managing applications in AWS and/or GCP. An AWS and/or GCP professional certification is nice to have; however, practical experience is more important, in conjunction with Terraform experience for deploying applications and services using infrastructure-as-code with security best practices.
- Security - Strong understanding of security best practices, network design, and how AWS/GCP roles should be used for IAM/RBAC least privilege.
- Infrastructure-as-Code - Configuration management experience with Terraform and/or Ansible to effectively manage our infrastructure. Previous experience with AWS CloudFormation, Chef, Pulumi, Puppet, etc. is acceptable; however, strong Terraform experience is a requirement.
- Kubernetes - Experience with managing Kubernetes clusters and using kubectl, k9s, etc. for managing Helm chart deployments, ingress services, and troubleshooting pods. Previous experience with Docker and related technologies is acceptable since container concepts are transferable.
- Operating Systems - Experience with managing Alpine, Debian, or Ubuntu Linux systems. We do not use Windows at GitLab. Many services are deployed in containers.
- Cloud Services - Manage, configure, and troubleshoot operating system issues (Linux), storage (block and object), and networking (VPCs, proxies, and CDNs), and administer high-availability PostgreSQL and Redis clusters
- Monitoring and instrumentation - Implement metrics in Prometheus, Grafana, Elastic, log management and related systems, and Slack/PagerDuty/Sentry integrations
- Engineering practices - High availability, data security, reliability and scalability, as well as disaster recovery
- 5+ years of experience in IT in a high growth Software as a service (SaaS) environment
- Knowledge of configuration management tools like Ansible, Chef, or Terraform
- Hands-on experience working in GCP and AWS environments
- Experience working with CI/CD tools and Git
- Ability to use GitLab
For Colorado residents: The base salary range for this role’s listed level is currently $100,800-$151,200 for Colorado residents only. Grade level and salary ranges are determined through interviews and a review of education, experience, knowledge, skills, abilities of the applicant, equity with other team members, and alignment with market data. See more information on our benefits and equity. Sales roles are also eligible for incentive pay targeted at up to 100% of the offered base salary. Disclosure as required by the Colorado Equal Pay for Equal Work Act, C.R.S. § 8-5-101 et seq.
To view the full job description and its compensation calculator, view our handbook. The compensation calculator can be found towards the bottom of the page.
Additional details about our process can be found on our hiring page.
Country Hiring Guidelines: GitLab hires new team members in countries around the world. All of our roles are remote, however some roles may carry specific location-based eligibility requirements. Our Talent Acquisition team can help answer any questions about location after starting the recruiting process.
GitLab is proud to be an equal opportunity workplace and is an affirmative action employer. GitLab’s policies and practices relating to recruitment, employment, career development and advancement, promotion, and retirement are based solely on merit, regardless of race, color, religion, ancestry, sex (including pregnancy, lactation, sexual orientation, gender identity, or gender expression), national origin, age, citizenship, marital status, mental or physical disability, genetic information (including family medical history), discharge status from the military, protected veteran status (which includes disabled veterans, recently separated veterans, active duty wartime or campaign badge veterans, and Armed Forces service medal veterans), or any other basis protected by law. GitLab will not tolerate discrimination or harassment based on any of these characteristics. See also GitLab’s EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know during the recruiting process.