Extend
1. Phone call
2. A take-home project similar to making a PR at work
3. A meet & greet with the team
Extend is modernizing the $100 billion-per-year extended warranty and protection plan industry using cutting-edge technology and top-notch customer service. After a $260M Series C financing round, we are ready to continue scaling our organization and growing beyond our $1.6B valuation.
Our API-first solution allows any merchant to offer extended warranties and protection plans, both online and offline, while also providing a merchant's end customers with a vastly improved and modern support experience that eliminates many of the issues customers face today with legacy underwriters.
We are a venture-backed startup based in downtown San Francisco that is led by founders who have previously had multiple successful exits. Extend is simplifying the technology stack for the extended warranty industry.
What You’ll Do:
- Lead the SRE team through complex projects and daily interruptions using Agile best practices.
- Build the SRE team by hiring the right talent and providing inspiring leadership.
- Collaborate with key stakeholders across Engineering, Architecture and InfoSec teams on initiatives and capabilities related to the operational health, security, growth, usability, and design of our applications.
- Set strategy and develop a roadmap for the team aimed towards reducing the operational overhead of keeping Extend applications healthy, secure, and available for our customers.
- Collaborate across domains to drive ownership of production systems and to enable faster decision-making and transparent observability into system health.
- Drive service reliability by developing methodology around metric visibility using SLIs, SLOs, and SLAs. Evangelize SRE best practices across the organization and promote better monitoring practices and a proactive approach to reliability and observability.
- Advocate for and drive the implementation of reliable design patterns.
- Build and implement incident management: alerting, triage, runbooks, postmortems, and root cause analysis (RCA).
- Promote simplicity in solving complex problems across our technology footprint.
- Lead and focus teams on root cause analysis, pattern identification and continuous improvement in order to optimize application performance, resiliency and reliability.
What We Are Looking For:
- 5+ years of overall experience, including 2 years in a technical or people leadership role in a product technology organization focused on SRE, DevOps, or automation.
- Experience building and expanding APM systems such as Datadog (preferred), New Relic, or Dynatrace.
- Strong understanding of modern cloud-native architecture and application performance on AWS (preferred) or other cloud providers.
- Experience with building/deploying/scaling/managing distributed systems on AWS at an enterprise level.
- Experience with CI/CD tooling such as GitHub Actions (preferred), CircleCI, Jenkins, or others, and proven ability to lead an organization through the CI/CD journey.
- Ability to perform in a high-energy environment with dynamic job responsibilities and priorities.
Nice to Haves:
- Experience with AWS Cloud Development Kit (CDK)
- Experience with TypeScript
Life at Extend:
- Working with a great team from diverse backgrounds in a collaborative and supportive environment.
- Competitive salary based on experience, with full medical, dental, and vision benefits.
- Stock in an early-stage startup growing quickly.
- Unlimited vacation policy.
- 401(k) with financial guidance from Morgan Stanley.