Netlify

Principal Infrastructure Engineer

If you have experience designing compute infrastructure in AWS, managing multi-cloud environments, and conducting service migrations, consider applying to Netlify's job post for a new Principal Infrastructure Engineer.
Job post found at boards.greenhouse.io


Company Overview

At Netlify, we’re building a platform to empower digital designers and developers to build better, more elaborate web projects than ever before. We’re aiming to change the landscape of modern web development. Netlify currently serves more than 1,000,000 developers worldwide.

Netlify is a diverse group of incredible talent from all over the world. We’re ~44% women or non-binary, and we represent more than a fourth as many nationalities as we have team members.

We recently raised $63M in Series C funding to bring forward the next generation of tooling for a more accessible web. Our investors include Andreessen Horowitz, Kleiner Perkins, and EQT Ventures, as well as the founders of GitHub, Slack, Figma, and Yelp. This latest round brings Netlify’s total funding to $107M to date.

About the Opportunity

As a Principal Infrastructure Engineer, you’ll work closely with senior leadership and other principals to translate product strategy into a technical roadmap. You will design, develop, and deliver solutions that enhance the scalability, availability, and efficiency of our products. You’ll drive implementation and own product reliability and cloud provider adoption projects, which require influencing teams and processes across the broader engineering organization. You’ll be exposed to tremendous scale with the team: we manage a multi-cloud environment where, on average, our platform processes over a petabyte of data, and we serve over 8% of the internet’s traffic. You’ll be informing Netlify’s infrastructure and product direction for years to come.

The range of competencies needed to build performant multi-cloud systems is vast, so we don't expect you to be an expert in everything. You'll work closely with senior technical leads, bringing your enthusiasm to learn and advocating for scalable practices that avoid relying on reactive solutions. We prefer documentation and automation to short-term problem solving and are adept at sharing knowledge with others.

We are biased towards asynchronous planning and communication, meaning fewer meetings and more execution. We place our values of transparency, empowerment, and commitment at the forefront of everything we do. We’re driven by passion and we make sure that everyone on the team knows their value, feels ownership over their work, and can quickly see the impact of their efforts. Beyond just hiring smart, empathetic team members, we foster a culture where there are no dumb questions and our team can get access to the resources that they need to continue to learn.

Who you are:

  • You are well versed in a large number of technologies and welcome new tools and techniques. We work with Go, Kubernetes, Mongo, CDNs, AWS, and GCP, just to name a few!
  • You are experienced in designing compute infrastructure in AWS, managing multi-cloud environments, and conducting service migrations
  • You are a software engineer at heart, with a compulsion to automate everything, and you have mastered one or more mainstream programming languages
  • You have outstanding interpersonal skills, and can effectively coordinate incident response across globally distributed teams spanning multiple time zones
  • You have production-level experience operating Linux systems and the ability to methodically diagnose system, network, and application issues. You’ve supported infrastructure and services across IaaS, PaaS, and SaaS in public cloud environments and have vast experience contributing to the architecture and design (design patterns, reliability, and scaling) of new and existing systems
  • You’re well versed in distributed systems, microservices architectures, content delivery networks, infrastructure tooling and automation, edge routing, traffic shaping, software development best practices, observability, and reliability engineering

What you'll do: 

  • Be an advocate of excellence in availability, scalability, security, performance
  • Define standards and approaches that drive product performance and reliability
  • Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence
  • Drive projects/technical initiatives and architectural/technical service improvements through to completion
  • Monitor and stress test systems to collect metrics for tuning and capacity planning
  • Apply a deep understanding of business and engineering requirements to drive the platform road map
  • Lead the design and definition of a Platform technical architecture and plan to achieve it
  • Grow our most senior engineers through mentoring
  • Architect and develop the network management platform that provides seamless integration of configuration capabilities for deployment and management of the CDN

Within 1 month you’ll: 

  • Understand the complexities around our business, customer, and engineering needs and define the problem spaces and gaps
  • Conduct a deep dive into our tech and application stack to understand our architecture and roadmap
  • Have one-on-ones and pairing sessions with some of the people that you’ll be working closely with, including members of the Platform, Data, Product, Leadership, and Site Reliability teams
  • Begin identifying opportunities for improvement and defining a roadmap of how to solve any gaps
  • Learn from the team during weekly design reviews, demos, and architectural meetings

Within 3 months, you’ll: 

  • Have gained a robust understanding of the needs of the platform and be working with principal engineers on our cloud roadmap
  • Be working on network capacity planning and scaling
  • Be defining best current practices for production readiness and release engineering
  • Begin driving architectural and design decisions from the point of view of scalability and reliability, including our CDN, deployment tooling, testing, database design, and multi-cloud patterns

Within 6 months, you’ll: 

  • Be working on the migration of services between cloud providers
  • Be working with product and infrastructure teams to design next generation products
  • Be developing tooling to identify traffic patterns and anomalies in network traffic and apply traffic and firewall rules
  • Design the systems and processes that our engineers use to manage and deploy their software in production
  • Participate in all phases of development of large projects, providing hardware, scalability, operability and performance perspectives to teams
  • Define system requirements and refine selected designs, balancing raw upfront cost with operability and total cost of ownership
  • Participate in the development and delivery of operability related features such as health monitoring, diagnostics, repair, and other self-healing automation
  • Have made a significant impact on our platform by architecting an extensive, scalable solution to accommodate our rapidly growing user base
  • Play a significant role in implementing globally distributed, latency-sensitive, high throughput services

About Netlify: 

Of everything we've ever built at Netlify, we are most proud of our team.

We believe that empowered, engaged colleagues do their best work. We’ll be giving you the tools you need to succeed and looking to you for suggestions to improve not just your daily job, but every aspect of building a company. Whether you work from our main office in San Francisco or you are a remote employee, we’ll be working together a lot: pairing, collaborating, debating, and learning. We want you to succeed! About 60% of the company is remote across the globe; the rest are in our HQ in San Francisco.

To learn a bit more about our team and who we are, make sure to visit our about page.


Not sure you meet 100% of our qualifications? Please apply anyway!

When applying please include: A resume or short listing of your job history & skills. (A link to a LinkedIn profile would be fine). A cover letter explaining why you would enjoy working in this role and why you’d like to work at Netlify would be great, though not required & will not impact your application. When we receive your application we’ll get back to you about the next steps.

Netlify is an Equal Opportunity Employer. We are devoted to building a team of people with diverse backgrounds and lifestyles. We believe that the unique contributions of all Netlifolks are the driver of our success. We are all responsible for bringing on people from all walks of life. Driving equality empowers our team, enables us to innovate, and helps us maintain a more inclusive environment. We don’t discriminate against employees or applicants based on gender identity or expression, sexual orientation, religion, age, race, military/veteran status, citizenship, pregnancy status, or any other differences. If we can do anything to provide a better interview, e.g. accommodate a disability, then please let us know.
