Why every software engineering interview should include ops questions

I’ve fallen way behind on my blog posts: my goal was to write one per month, and I haven’t published anything since MAY. Egads. So here I am dipping into the drafts archives! This one was written in April of 2016, when I was noodling over my CraftConf 2016 talk on “DevOps for Developers” (see slides).

So I got to the part in my talk where I’m talking about how to interview and hire software engineers who aren’t going to burn the fucking house down, and realized I could spend a solid hour on that question alone. That’s why I decided to turn it into a blog post instead.

Stop telling ops people to code better, start telling SWEs to ops better

Our industry has gotten very good at pressing operations engineers to get better at writing code, writing tests, and software engineering in general these past few years. Which is great! But we have not been nearly so good at pushing software engineers to level up their systems skills. Which is unfortunate, because it is just as important.

Most systems suffer from the syndrome of running too much software. Tossing more software into the heap is as likely to cause more problems as it is to solve them.

We see this play out at companies stacked with good software engineers who have built horrifying spaghetti messes of their infrastructure, and then commence paging themselves to death.

The only way to unwind this is to reset expectations, and make it clear that

  1. you are still responsible for your code after it’s been deployed to production, and 
  2. operational excellence is everyone’s job.

Operations is the constellation of tools, practices, policies, habits, and docs around shipping value to users, and every single one of us needs to participate in order to do this swiftly and safely.

Every software engineering interviewing loop should have an ops component.

Nobody interviews candidates for SRE or ops nowadays without asking some coding questions. You don’t have to be the greatest programmer in the world, but you can’t be functionally illiterate. The reverse is less common: asking software engineers basic, stupid questions about the lifecycle of their code, instrumentation best practices, etc. 

It’s common practice at lots of companies now to have a software engineer in the loop for hiring SREs to evaluate their coding abilities. It should be just as common to have an ops engineer in the loop for a SWE hire, especially for any SWE who is being considered for a key senior position. Those are the people you most rely on to be mentors and role models for junior hires. All engineers should embrace the ethos of owning their code in production, and nobody should be promoted or hired into a senior role if they don’t.

And yes, that means all engineers!  Even your iOS/Android engineers and website developers should be interested in what happens to their code after they hit deploy.  They should care about things like instrumentation, and what kind of data they may need later to debug their problems, and how their features may impact other infrastructure components.

You need to balance out your software engineers with engineers who don’t react to every problem by writing more code. You need engineers who write code begrudgingly, as a last resort. You’ll find these priceless gems in ops and SRE.

Ops questions for software engineers

The best questions are broad and start off easy, with plenty of reasonable answers and pathways to explore. Even beginners can give a reasonable answer, while experts can go on talking for hours.

For example: give them the specs for a new feature, and ask them to talk through the infrastructure choices and dependencies to support that feature. Do they ask about things like which languages, databases, and frameworks are already supported by the team? Do they understand what kind of monitoring and observability tools to use? Do they ask about local instrumentation best practices?

Or design a full deployment pipeline together. Probe what they know about generating artifacts, versioning, rollbacks, branching vs master, canarying, rolling restarts, green/blue deploys, etc. How might they design a deploy tool? Talk through the tradeoffs.
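If it helps to have something concrete on the whiteboard, here is a minimal sketch of the canary-then-roll idea with rollback on a failed health check. Everything in it is made up for illustration: `canary_then_roll`, the `install` and `healthy` callbacks, and the batch sizes are assumptions, not any real deploy tool's API. The interesting part is the conversation about what it leaves out (draining, artifact versioning, partial failures).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DeployResult:
    deployed: List[str]   # hosts left running the new version
    rolled_back: bool     # True if a health check failed and we reverted


def canary_then_roll(
    hosts: List[str],
    new_version: str,
    old_version: str,
    install: Callable[[str, str], None],  # hypothetical: put a version on a host
    healthy: Callable[[str], bool],       # hypothetical: post-deploy health check
    canary_count: int = 1,
    batch_size: int = 2,
) -> DeployResult:
    # The first batch is the canary; the rest go out in fixed-size batches.
    batches = [hosts[:canary_count]]
    for i in range(canary_count, len(hosts), batch_size):
        batches.append(hosts[i:i + batch_size])

    touched: List[str] = []
    for batch in batches:
        for host in batch:
            install(host, new_version)
            touched.append(host)
        if not all(healthy(host) for host in batch):
            # Revert every host we touched, newest first, then stop.
            for host in reversed(touched):
                install(host, old_version)
            return DeployResult(deployed=[], rolled_back=True)
    return DeployResult(deployed=touched, rolled_back=False)
```

A good follow-up question: what should happen if the rollback itself fails halfway through?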

Some other good starting points:

  • “Tell me about the last time you caused a production outage. What happened, how did you find out, how was it resolved, and what did you learn?”
  • “What are some of your favorite tools for visibility, instrumentation, and debugging?”
  • “Latency seems to have doubled over the last 6 hours. Where do you start looking, how do you start debugging?”
  • And this chestnut: “What happens when you type ‘google.com’ into a web browser?” You would be fucking *astonished* how many senior software engineers don’t know a thing about DNS, HTTP, SSL/TLS, cookies, TCP/IP, routing, load balancers, web servers, proxies, and on and on.

Another question I really like is: “what’s your favorite API (or database, or language) and why?” followed up by “… and what are the worst things about it?” (True love doesn’t mean blind worship.)

Remember, you’re exploring someone’s experience and depth here, not giving them a pass-fail quiz. It’s okay if they don’t know it all. You’re also evaluating them on communication skills, which are severely underrated by most people but are actually a key technical skill.

Signals to look for

You’re not looking for perfection. You are teasing out signals for things like, how will this person perform on a team where software engineers are expected to own their code? How much do they know about the world outside the code they write themselves? Are they curious, eager, and willing to learn, or fearful, incurious and begrudging?

Do they expect networks to be reliable? Do they expect databases to respond, retries to succeed? Are they offended by the idea of being on call? Are they overly clever or do they look to simplify? (God, I hate clever software engineers 🙃.)
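To make the retries question concrete, here is a rough sketch of what "not assuming the network is reliable" can look like in code: a bounded number of attempts, exponential backoff with jitter, and a loud failure at the end. The name `call_with_retries` and its defaults are assumptions picked for illustration; which exceptions count as retryable depends entirely on the client library in play.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    op: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.1,   # seconds; illustrative default
    max_delay: float = 2.0,    # cap on any single sleep
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests don't wait
) -> T:
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure, don't swallow it
            # Exponential backoff with full jitter, so a crowd of clients
            # doesn't retry in lockstep against a struggling dependency.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

A candidate with operational scar tissue will usually ask whether the operation is safe to repeat at all, and what the caller does once the retries run out.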

It’s valuable to get a feel for an engineer’s operational chops, but let’s be clear, you’re doing this for one big reason: to set expectations. By making ops questions part of the interview, you’re establishing from the start that you run an org where operations is valued, where ownership is non-optional. This is not an ivory tower where software engineers can merrily git push and go home for the day and let other people handle the fallout.

It can be toxic when you have an engineer who thinks all ops work is toil and operations engineering is lesser-than. It tends to result in operations work being done very poorly. This is your best chance to let those people self-select out.

You know what, I’m actually feeling uncharacteristically optimistic right now. I’m remembering how controversial some of this stuff was when I first wrote it, five years ago in 2016. Nowadays it just sounds obvious. Like table stakes.

Hell yeah. 🤘


The (Real) 11 Reasons I Don’t Hire You

(With 🙏 to Joe Beda, whose brilliant idea for a blog post this was.  Thanks for letting me borrow it!)

Interviewing is hard and it sucks.

In theory, it really shouldn’t be.  You’re a highly paid professional and your skills are in high demand.  This ought to be a meeting between equals to mutually explore what a longer-term relationship might look like.  Why take the outcome personally?  There are at least as many reasons for you to decide not to join a company as for the company to decide not to hire you, right?

In reality, of course, all the situational cues and incentives line up to make you feel like the whole thing is a referendum on whether or not you personally are Good Enough (smart enough, senior enough, skilled enough, cool enough) to join their fancy club.

People stay at shitty jobs far, far longer than they ought to, just because interviews can be so genuinely crushing to your spirit and sense of self.  Even when they aren’t the worst, it can leave a lasting sting when they decline to hire you.

But there is an important asymmetry here.  By not hiring someone, I very rarely mean it as a rejection of that person.  (Not unless they were, like, mean to the office manager, or directed all their technical questions to the male interviewers.)  On the contrary, I generally hold the people we decline to hire (or have had to let go!) in extremely high regard.

So if someone interviews at Honeycomb, I do not want them to walk away feeling stung, hurt, or bad about themselves.  I would like them to walk away feeling good about themselves and our interactions, even if one or both of us are disappointed by the outcome.  I want them to feel the same way about themselves as I feel about them, especially since there’s a high likelihood that I may want to work with them in the future.

So here are the real, honest-to-god most common reasons why I don’t hire someone.

1.  Scarcity.

If you’ve worked at a Google or Facebook before, you may have a certain mental model of how hiring works.  You ask the candidate a bunch of questions, and if they do well enough, you hire them.  This could not be more different from early stage startup hiring, which is defined in every way by scarcity.

I only have a few precious slots to fill this year, and every single one of them is tied to one or more key company initiatives or goals, without which we may fail as a company.  Emily and I spend hours obsessively discussing the profile we are looking for, the smallest possible set of key strengths and skills this hire must have, inter-team and intra-team dynamics, and what elements are missing or need to be bolstered on the team as it stands.  And at the end of the day, there are not nearly as many slots to fill as there are awesome people we’d like to hire.  Not even close.  Having to choose between several differently wonderful people can be *excruciating*.

2.  Diversity.

No, not that kind.  (Yes, we care about cultivating a diverse team and support that goal through our recruiting and hiring processes, but it’s not a factor in our hiring decisions.)  I mean your level, stage in your career, educational background, professional background, trajectory, areas of focus and strengths.  We are trying to build radical new tools for sociotechnical systems; tools that are friendly, intuitive, and accessible to every engineer (and engineering-adjacent profession) in the world.

How well do you think we’re going to do at our goal if the people building it are all ex-Facebook, ex-MIT senior engineers?  If everyone has the exact same reference points and professional training, we will all have the same blind spots.  Even if our team looks like a fucking Benetton ad.

3.  We are assembling a team, not hiring individuals.

We spend at least as much time hashing out what the subtle needs of the team are right now as talking about the individual candidate.  Maybe what we need is a senior candidate who loves mentoring with her whole heart, or a language polyglot who can help unify the look and feel of our integrations across ten different languages and platforms.  Or maybe we have plenty of accomplished mentors, but the team is really lacking someone with expertise in query profiling and db tuning, and we expect this to be a big source of pain in the coming year.  Maybe we realize we have nobody on the team who is interested in management, and we are definitely going to need someone to grow into or be hired on as a manager a year or two from now.

There is no value judgment or hierarchy attached to any of these skills or particulars.  We simply need what we need, and you are who you are.

4.  I am not confident that we can make you successful in this role at this time.

We rarely turn people down for purely technical reasons, because technical skills can be learned.  But there can be some combination of your skills, past experience, geographical location, time zone, experience with working remotely, etc. that just gives us pause.  If we cast forward a year, do we think you are going to be joyfully humming along and enjoying yourself, working more-or-less independently and collaboratively?  If we can’t convince ourselves this is true, for whatever reasons, we are unlikely to hire you.  (But we would love to talk with you again someday.)

5.  The team needs someone operating at a different level.

Don’t assume this always means “you aren’t senior enough”.  We have had to turn down people at least as often for being too senior as not senior enough.  An organization can only absorb so many principal and senior engineers; there just isn’t enough high-level strategic work to go around.  I believe happy, healthy teams are made up of a range of levels: you need more junior folks asking naive questions that give senior folks the opportunity to explain themselves and catch their dumb mistakes.  You need there to be at least one sweet child who is just so completely stoked to build their very first login page.

A team staffed with nothing but extremely senior developers will be a dysfunctional, bored and contentious team where no one is really growing up or being challenged as they should.

6.  We don’t have the kind of work you need or want.

The first time we tried hiring junior developers, we ran into this problem hardcore.  We simply didn’t have enough entry-level work for them to do.   Everything was frustratingly complex and hard for them, so they weren’t able to operate independently, and we couldn’t spare an engineer to pair with them full time.

This also manifests in other ways.  Like, lots of SREs and data engineers would LOVE to work at Honeycomb.  But we don’t have enough ops engineering work or data problems to keep them busy full time.  (Well, that’s not precisely true.  They could probably keep busy.  But it wouldn’t be aligned with our core needs as a business, which makes them premature optimizations we cannot afford.)

7.  Communication skills.

We select highly for communication skills.  The core of our technical interview involves improving and extending a piece of code, then bringing it in the next day to discuss it with your peers.  We believe that if you can explain what you did and why, you can definitely do the work, and the reverse is not necessarily true.  We also believe that communication skills are at the foundation of a team’s ability to learn from its mistakes and improve as a unit.  We value high-performing teams, therefore we select for those skills.

There are many excellent engineers who are not good communicators, or who do not value communication the way we do, and while we may respect you very much, it’s not a great fit for our team.

8.  You don’t actually want to work at a startup.

“I really want to work at a startup.  Also the things that are really important to me are: work/life balance, predictability, high salary, gold benefits, stability, working from 10 to 5 on the dot, knowing what I’ll be working on for the next month, not having things change unexpectedly, never being on call, never needing to think or care about work out of hours …”

To be clear, it is not a red flag if you care about work/life balance.  We care about that too — who the hell doesn’t?  But startups are inherently more chaotic and unpredictable, and roles are more fluid and dynamic, and I want to make sure your expectations are aligned with reality.

9.  You just want to work for women.

I hate it when I’m interviewing someone and I ask why they’re interested in Honeycomb, and they enthusiastically say “Because it was founded by women!”, and I wait for the rest of it, but that’s all there is.  That’s it?  Nothing interests you about the problem, the competitive space, the people, the customers … nothing??  It’s fine if the leadership team is what first caught your eye.  But it’s kind of insulting to just stop there.  Just imagine if somebody asked you out on a date “because you’re a woman”.  Low. Fucking. Bar.

10.   I truly want you to be happy.

I have no interest in making a hard sell to people who are dubious about Honeycomb.  I don’t want to hire people who can capably do the job, but whose hearts are really elsewhere doing other things, or who barely tolerate going to work every day.  I want to join with people who see their labor as an extension of themselves, who see work as an important part of their life’s project.  I only want you to work here if it’s what’s best for you.

11.   I’m not perfect.

We have made the wrong decision before, and will do so again.  >_<

In conclusion…

As a candidate, it is tempting to feel like you will get the job if you are awesome enough, therefore if you do not get the job it must be because you were insufficiently awesome.  But that is not how hiring works — not for highly constrained startups, anyway.

If we brought you in for an interview, we already think you’re awesome.  Period.  Now we’re just trying to figure out if you narrowly intersect the skill sets we are lacking that we need to succeed this year.

If you could be a fly on the wall, listening to us talk about you, the phrase you would hear over and over is not “how good are they?”, but “what will they need to be successful?  can we provide the support they need?”  We know this is as much of a referendum on us as it is on you.  And we are not perfect.

But we are hiring.  ☺️


charity.
