Netlify
#golang #aws #kubernetes

Engineering Manager, SRE

Netlify wishes to hire a new engineering manager. If you have experience working across teams to meet their goals for service reliability, availability, and efficiency, consider applying.
Job post found at boards.greenhouse.io

Description

Company Overview

With over 2 million developers worldwide, Netlify is leading the transition to modern Jamstack-based web development. By uniting the ecosystem of developer tools and technologies, Netlify makes it easier than ever to build, deploy, and scale web applications.

We are excited to announce our most recent Series D fundraise of $105M led by Bessemer Venture Partners, with participation from our existing investors Andreessen Horowitz, BOND, EQT Ventures, Kleiner Perkins, Mango Capital and Menlo Ventures with an overall $2B valuation.

Though our team is growing fast, we’ve managed to stay tight-knit while welcoming newcomers to the fold. We hail from around the globe with diverse backgrounds: ~40% of us are women or non-binary, and we represent 29 different nationalities.

We aim to create a company culture of empowerment where the best idea can come from anywhere, as we believe that empowered and engaged team members do the best work. We strive to be thoughtful, caring, and collaborative in our work within and across teams. We’ll be giving you the tools you need to succeed and looking to you for suggestions for improvement, not just in your daily job, but in many other aspects of building a company.

About the Opportunity

The mission of our SRE team is to use a software engineering approach to architect, design, monitor and scale Netlify’s infrastructure for the next million users. We’re building out the foundation of reliability and observability to help provide global availability for our users. Our team is dedicated to ensuring application resiliency and delivering the compute and network platform at scale.

As the manager of our SRE team, you will lead a team of senior engineers to help build and grow our next-generation platform. Its critical infrastructure serves globally distributed storage and compute demands and will scale to handle our growth while meeting our expectations of high availability. Your team will be navigating the challenges and complexity of heterogeneous tools and environments, including multiple cloud platforms. You’ll be a hands-on manager designing, developing, and delivering solutions that enhance the scalability, availability, and efficiency of our products. Our tech stack includes Kubernetes, AWS, GCP, Kafka, and Golang-based microservices. With our team, you’ll identify opportunities to increase efficiency, eliminate downtime, optimize costs, and maintain performance at scale.

We are a remote-first, globally distributed team and are biased towards asynchronous planning and communication, meaning fewer meetings and more execution. We take documentation seriously and place our values of transparency, empowerment, and commitment at the forefront of everything we do.

We’re driven by passion and we make sure that everyone on the team knows their value, feels ownership over their work, and can quickly see the impact of their efforts. Beyond just hiring smart, empathetic team members, we foster a culture where there are no dumb questions and our team can get access to the resources that they need to continue to learn. As a remote-first company, diversity drives our identity. Whether you’re looking to launch a new career or grow an existing one, Netlify is the type of company where you can balance great work with great life.

Responsibilities:

  • Create and own the execution of the team’s roadmap
  • Own all infrastructure for our platform, working closely with the SRE team, ensuring its availability, reliability, scalability, fault tolerance and performance
  • Drive improvements to our incident response and on-call procedures
  • Collaborate across Product & Engineering to establish SLOs, SLIs, and SLAs
  • Drive adoption of best practices in monitoring, alerting, chaos testing, and resiliency
  • Partner with our development teams to help them build reliable and scalable services, and resolve any production issues as quickly as possible
  • Lead project planning and task prioritization
  • Help engineers grow their skills and experience
  • Hold regular syncs with all team members
  • Demonstrate a strong sense of ownership and drive
  • Grow a global team

What you’ll bring: 

  • Experience leading engineering teams responsible for 24x7 high volume, highly available systems
  • Proficiency with one or more cloud providers and the ability to plan the growth of our infrastructure
  • Experience setting strategic vision, owning and resolving issues that impact design, product success, or address future concepts, products, or technologies
  • Experience working across teams to meet their goals for service reliability, availability, and efficiency
  • Solid understanding of logging platforms and application performance metrics
  • Passion for mentoring, nurturing, and growing a team of SREs
  • Security and compliance experience (SOC, PCI, GDPR)
  • History of managing globally distributed teams
  • Located in North/South American time zones (UTC-4 to UTC-7)

Within 1 month, you’ll:

  • Begin the journey of understanding the complexities around our business, customer, and engineering needs. We believe strongly that it’s essential for you to take the time to become familiar with our space and how we operate!
  • Have one-on-ones and pairing sessions with some of the people that you’ll be working closely with, including members of the Platform, Product, and leadership teams.
  • Identify opportunities for improvement and define a roadmap for closing any gaps

Within 3 months, you’ll:

  • Be a trusted contributor within the SRE team
  • Partner with our internal customers to align on cross-team objectives
  • Have gained a deeper understanding of the needs of the platform and become more comfortable with diagnosing issues
  • Contribute to the team’s roadmap on product reliability and cloud infrastructure
  • Lead multiple cross-team projects scaling our cloud infrastructure and building product observability and reliability

Within 6 months, you’ll:

  • Elevate the work of the team and become a subject matter expert in the reliability roadmap for the product
  • Introduce new frameworks and tools to help optimize and elevate the work of the team
  • Work across teams to manage SLOs
  • Participate in helping us grow the team by conducting interviews and partnering with leadership to strategize future hiring needs
  • Ensure a sustainable team pace for the long haul, working on growth planning as needed

Netlify is a growing company that is constantly evolving, so this timeline is intended as an example of what you can expect from the role. Keep in mind that we're always iterating, learning, and growing, so expect these guidelines to continue to evolve as we expand. We're excited for you to join us on the journey!

About Netlify

Of everything we've ever built at Netlify, we are most proud of our team.

We believe that empowered and engaged colleagues do their best work. We’ll be giving you the tools you need to succeed and looking to you for suggestions for improvement, not just in your daily job but in every aspect of building a company. As a distributed-first organization, we want to make sure that, wherever our team is, we find inventive ways to collaborate, debate, and learn from each other.

To learn a bit more about our team and who we are, make sure to visit our about page.

Applying

Not sure you meet 100% of our qualifications? Please apply anyway!

When applying, please include a resume or short listing of your job history and skills (a link to a LinkedIn profile would be fine). A cover letter explaining why you would enjoy working in this role and why you’d like to work at Netlify would be great, though it is not required and will not impact your application. When we receive your application, we’ll get back to you about the next steps.

Netlify is an Equal Opportunity Employer. We are devoted to building a team of people with diverse backgrounds and lifestyles. We believe that the unique contributions of all Netlifolks are the driver of our success. We are all responsible for bringing on people from all walks of life. Driving equality empowers our team, enables us to innovate, and helps us maintain a more inclusive environment. We don’t discriminate against employees or applicants based on gender identity or expression, sexual orientation, religion, age, race, military/veteran status, citizenship, pregnancy status, or any other differences. If we can do anything to provide a better interview experience, e.g. accommodating a disability, please let us know.
