The (Real) 11 Reasons I Don’t Hire You

(With 🙏 to Joe Beda, whose brilliant idea for a blog post this was.  Thanks for letting me borrow it!)

Interviewing is hard and it sucks.

In theory, it really shouldn’t be.  You’re a highly paid professional and your skills are in high demand.  This ought to be a meeting between equals to mutually explore what a longer-term relationship might look like.  Why take the outcome personally?  There are at least as many reasons for you to decide not to join a company as for the company to decide not to hire you, right?

In reality, of course, all the situational cues and incentives line up to make you feel like the whole thing is a referendum on whether or not you personally are Good Enough (smart enough, senior enough, skilled enough, cool enough) to join their fancy club.

People stay at shitty jobs far, far longer than they ought to, just because interviews can be so genuinely crushing to your spirit and sense of self.  Even when the interview isn’t the worst, it can leave a lasting sting when a company declines to hire you.

But there is an important asymmetry here.  By not hiring someone, I very rarely mean it as a rejection of that person.  (Not unless they were, like, mean to the office manager, or directed all their technical questions to the male interviewers.)  On the contrary, I generally hold the people we decline to hire — or have had to let go! — in extremely high regard.

So if someone interviews at Honeycomb, I do not want them to walk away feeling stung, hurt, or bad about themselves.  I would like them to walk away feeling good about themselves and our interactions, even if one or both of us are disappointed by the outcome.  I want them to feel the same way about themselves as I feel about them, especially since there’s a high likelihood that I may want to work with them in the future.

So here are the real, honest-to-god most common reasons why I don’t hire someone.

1. Scarcity

If you’ve worked at a Google or Facebook before, you may have a certain mental model of how hiring works.  You ask the candidate a bunch of questions, and if they do well enough, you hire them.  This could not be more different from early stage startup hiring, which is defined in every way by scarcity.

I only have a few precious slots to fill this year, and every single one of them is tied to one or more key company initiatives or goals, without which we may fail as a company.  Emily and I spend hours obsessively discussing the profile we are looking for, the smallest possible set of key strengths and skills this hire must have, inter-team and intra-team dynamics, and what elements are missing from the team as it stands or need to be bolstered.  And at the end of the day, there are not nearly as many slots to fill as there are awesome people we’d like to hire.  Not even close.  Having to choose between several differently wonderful people can be *excruciating*.

2.  Diversity.

No, not that kind.  (Yes, we care about cultivating a diverse team and support that goal through our recruiting and hiring processes, but it’s not a factor in our hiring decisions.)  I mean your level, stage in your career, educational background, professional background, trajectory, areas of focus and strengths.  We are trying to build radical new tools for sociotechnical systems; tools that are friendly, intuitive, and accessible to every engineer (and engineering-adjacent profession) in the world.

How well do you think we’re going to do at our goal if the people building it are all ex-Facebook, ex-MIT senior engineers?  If everyone has the exact same reference points and professional training, we will all have the same blind spots.  Even if our team looks like a fucking Benetton ad.

3.  We are assembling a team, not hiring individuals.

We spend at least as much time hashing out what the subtle needs of the team are right now as talking about the individual candidate.  Maybe what we need is a senior candidate who loves mentoring with her whole heart, or a language polyglot who can help unify the look and feel of our integrations across ten different languages and platforms.  Or maybe we have plenty of accomplished mentors, but the team is really lacking someone with expertise in query profiling and db tuning, and we expect this to be a big source of pain in the coming year.  Maybe we realize we have nobody on the team who is interested in management, and we are definitely going to need someone to grow into or be hired on as a manager a year or two from now.

There is no value judgment or hierarchy attached to any of these skills or particulars.  We simply need what we need, and you are who you are.

4.  I am not confident that we can make you successful in this role at this time.

We rarely turn people down for purely technical reasons, because technical skills can be learned.  But some combination of your skills, past experience, geographical location, time zone, experience with working remotely, and so on can give us pause.  If we cast forward a year, do we think you are going to be joyfully humming along and enjoying yourself, working more-or-less independently and collaboratively?  If we can’t convince ourselves this is true, for whatever reasons, we are unlikely to hire you.  (But we would love to talk with you again someday.)

5.  The team needs someone operating at a different level.

Don’t assume this always means “you aren’t senior enough”.  We have had to turn down people at least as often for being too senior as not senior enough.  An organization can only absorb so many principal and senior engineers; there just isn’t enough high-level strategic work to go around.  I believe happy, healthy teams are composed of a range of levels — you need more junior folks asking naive questions that give senior folks the opportunity to explain themselves and catch their own dumb mistakes.  You need there to be at least one sweet child who is just so completely stoked to build their very first login page.

A team staffed with nothing but extremely senior developers will be a dysfunctional, bored and contentious team where no one is really growing up or being challenged as they should.

6.  We don’t have the kind of work you need or want.

The first time we tried hiring junior developers, we ran into this problem hardcore.  We simply didn’t have enough entry-level work for them to do.   Everything was frustratingly complex and hard for them, so they weren’t able to operate independently, and we couldn’t spare an engineer to pair with them full time.

This also manifests in other ways.  Like, lots of SREs and data engineers would LOVE to work at Honeycomb.  But we don’t have enough ops engineering work or data problems to keep them busy full time.  (Well — that’s not precisely true.  They could probably keep busy.  But it wouldn’t be aligned with our core needs as a business, which makes hiring them a premature optimization we cannot afford.)

7.  Communication skills.

We select highly for communication skills.  The core of our technical interview involves improving and extending a piece of code, then bringing it in the next day to discuss it with your peers.  We believe that if you can explain what you did and why, you can definitely do the work, and the reverse is not necessarily true.  We also believe that communication skills are at the foundation of a team’s ability to learn from its mistakes and improve as a unit.  We value high-performing teams, therefore we select for those skills.

There are many excellent engineers who are not good communicators, or who do not value communication the way we do.  We may respect you very much, but that doesn’t make it a great fit for our team.

8.  You don’t actually want to work at a startup.

“I really want to work at a startup.  Also the things that are really important to me are: work/life balance, predictability, high salary, gold benefits, stability, working from 10 to 5 on the dot, knowing what I’ll be working on for the next month, not having things change unexpectedly, never being on call, never needing to think or care about work out of hours …”

To be clear, it is not a red flag if you care about work/life balance.  We care about that too — who the hell doesn’t?  But startups are inherently more chaotic and unpredictable, and roles are more fluid and dynamic, and I want to make sure your expectations are aligned with reality.

9.  You just want to work for women.

I hate it when I’m interviewing someone and I ask why they’re interested in Honeycomb, and they enthusiastically say “Because it was founded by women!”, and I wait for the rest of it, but that’s all there is.  That’s it?  Nothing interests you about the problem, the competitive space, the people, the customers … nothing??  It’s fine if the leadership team is what first caught your eye.  But it’s kind of insulting to just stop there.  Just imagine if somebody asked you out on a date “because you’re a woman”.  Low. Fucking. Bar.

10.   I truly want you to be happy.

I have no interest in making a hard sell to people who are dubious about Honeycomb.  I don’t want to hire people who can capably do the job, but whose hearts are really elsewhere doing other things, or who barely tolerate going to work every day.  I want to join with people who see their labor as an extension of themselves, who see work as an important part of their life’s project.  I only want you to work here if it’s what’s best for you.

11.   I’m not perfect.

We have made the wrong decision before, and will do so again.  >_<

In conclusion…

As a candidate, it is tempting to feel like you will get the job if you are awesome enough, therefore if you do not get the job it must be because you were insufficiently awesome.  But that is not how hiring works — not for highly constrained startups, anyway.

If we brought you in for an interview, we already think you’re awesome.  Period.  Now we’re just trying to figure out whether you intersect the narrow set of skills we are lacking and need in order to succeed this year.

If you could be a fly on the wall, listening to us talk about you, the phrase you would hear over and over is not “how good are they?”, but “what will they need to be successful?  can we provide the support they need?”  We know this is as much of a referendum on us as it is on you.  And we are not perfect.

But we are hiring.  ☺️


charity.


Love (and Alerting) in the Time of Cholera (and Observability)

I made a vow this year to post one blog post a month, then I didn’t post anything at all from May to September.  I have some catching up to do.  😑   I’ve also been meaning to transcribe some of the twitter rants that I end up linking back to into blog posts, so if there’s anything you especially want me to write about, tell me now while I’m in repentance mode.

This is one request I happened to make a note of because I can’t believe I haven’t already written it up!  I’ve been saying the same thing over and over in talks and on twitter for years, but apparently never a blog post.

The question is: what is the proper role of alerting in the modern era of distributed systems?  Has it changed?  What are the updated best practices for alerting?

It’s a great question.  I want to wax philosophical about some stuff, but first let me briefly outline the way to modernize your alerting best practices:

  1. implement observability
  2. implement SLOs and/or end-to-end checks that traverse key code paths and correlate to user-impacting events
  3. create a secondary channel (tasks, ticketing system, whatever) for “things that on call should look at soon, but are not impacting users yet” which does not page anyone, but which on call is expected to look at (at least) first thing in the morning, last thing in the evening, and midday
  4. move as many paging alerts as possible to the secondary channel, by engineering your services to auto-remediate or run in degraded mode until they can be patched up
  5. wake people up only for SLOs and health checks that correlate to user-impacting events

Or, in an even shorter formulation: delete all your paging alerts, then page only on e2e alerts that mean users are in pain.  Rely on your debugging tools for debugging, and reserve paging for user pain.
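To make steps 3 through 5 concrete, here is a minimal sketch (in Python) of the routing rule: only symptoms tied to an SLO or end-to-end check that means user pain page a human; everything else lands in the secondary channel.  The field names and the print stand-in are illustrative assumptions, not a prescription for any particular alerting stack.

```python
# A sketch of the paging-vs-secondary-channel routing rule, not a real
# alerting pipeline. Field names here are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class Alert:
    name: str
    user_impacting: bool   # tied to an SLO or end-to-end check on a key code path
    auto_remediated: bool  # the service healed itself or is running degraded

def route(alert: Alert) -> str:
    """Return 'page' for user pain, 'ticket' for the secondary channel."""
    if alert.user_impacting and not alert.auto_remediated:
        return "page"    # wake a human: users are hurting right now
    return "ticket"      # reviewed first thing, midday, and end of day

# A replica-lag alert that the service already routed around goes to the queue:
print(route(Alert("replica-lag-high", user_impacting=False, auto_remediated=True)))
```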

To understand why I advocate deleting all your paging alerts, and when it’s safe to delete them, first we need to understand why we have accumulated so many crappy paging alerts over the years.

Monoliths, LAMP stacks, and death by pagebomb

Here, let’s crib a couple of slides from one of my talks on observability.  Here are the characteristics of older monolithic LAMP-stack style systems, and best practices for running them:

[slide: characteristics of monolithic LAMP-stack systems, and best practices for running them]

The sad truth is that when all you have is time series aggregates and traditional monitoring dashboards, you aren’t really debugging with science so much as relying on your gut and a handful of dashboards, using intuition and scraps of data to try to reconstruct an impossibly complex system state.

This works ok, as long as you have a relatively limited set of failure scenarios that happen over and over again.  You can just pattern match from past failures to current data, and most of the time your intuition can bridge the gap correctly.  Every time there’s an outage, you post mortem the incident, figure out what happened, build a dashboard “to help us find the problem immediately next time”, create a detailed runbook for how to respond to it, and (often) configure a paging alert to detect that scenario.

Over time you build up a rich library of these responses.  So most of the time when you get paged you get a cluster of pages that actually serves to help you debug what’s happening.  For example, at Parse, if the error graph had a particular shape I immediately knew it was a redis outage.  Or, if I got paged about a high % of app servers all timing out in a short period of time, I could be almost certain the problem was due to mysql connections.  And so forth.

Things fall apart; the pagebomb cannot stand

However, this model falls apart fast with distributed systems.  There are just too many failures.  Failure is constant, continuous, eternal.  Failure stops being interesting.  It has to stop being interesting, or you will die.


Instead of a limited set of recurring error conditions, you have an infinitely long list of things that almost never happen …. except that one time they do.  If you invest your time into runbooks and monitoring checks, it’s wasted time if that edge case never happens again.

Frankly, any time you get paged about a distributed system, it should be a genuinely new failure that requires your full creative attention.  You shouldn’t just be checking your phone, going “oh THAT again”, and flipping through a runbook.  Every time you get paged it should be genuinely new and interesting.

And thus you should actually have drastically fewer paging alerts than you used to.

A better way: observability and SLOs.

Instead of paging alerts for every specific failure scenario, the technically correct answer is to define your SLOs (service level objectives) and page only on those, i.e. when you are going to run out of budget ahead of schedule.  But most people aren’t yet operating at this level of sophistication.  (SLOs sound easy, but are unbelievably challenging to do well; many great teams have tried and failed.  This is why we have built an SLO feature into Honeycomb that does the heavy lifting for you.  Currently alpha testing with users.)
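For the sake of illustration, here is a minimal sketch of the burn-rate arithmetic behind “page when you are going to run out of budget ahead of schedule.”  The target, window, and threshold are made-up numbers, and the simple ratio is not how Honeycomb’s SLO feature is implemented — it is just the underlying idea.

```python
# Error-budget burn-rate sketch with illustrative numbers (not a real SLO tool).
SLO_TARGET = 0.999   # 99.9% of requests succeed over a (say) 30-day window

def burn_rate(bad_events: int, total_events: int) -> float:
    """1.0 = on pace to spend exactly the whole budget by the end of the window;
    anything much higher means you will run out ahead of schedule."""
    observed_error_rate = bad_events / total_events
    allowed_error_rate = 1 - SLO_TARGET
    return observed_error_rate / allowed_error_rate

def should_page(bad_events: int, total_events: int, threshold: float = 2.0) -> bool:
    # Page only when burning at least `threshold` times faster than sustainable.
    return burn_rate(bad_events, total_events) >= threshold

# 1,000,000 requests in the last hour, 3,000 failed: 0.3% errors against a
# 0.1% budget is a burn rate of ~3x, so this one pages.
print(burn_rate(3_000, 1_000_000), should_page(3_000, 1_000_000))
```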

If you haven’t yet caught the SLO religion, the alternate answer is that “you should only page on high level end-to-end alerts, the ones which traverse the code paths that make you money and correspond to user pain”.  Alert on the three golden signals: request rate, latency, and errors, and make sure to traverse every shard and/or storage type in your critical path.

That’s it.  Don’t alert on the state of individual storage instances, or replication, or anything that isn’t user-visible.

(To be clear: by “alert” I mean “paging humans at any time of day or night”.  You might reasonably choose to page people during normal work hours, but during sleepy hours most errors should be routed to a non-paging address.  Only wake people up for actual user-visible problems.)
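As a sketch of what such an end-to-end check might look like: the probe below exercises one hypothetical revenue-critical path and reports on the latency and error signals (request rate would come from your traffic metrics rather than from a single probe).  The URL, threshold, and paging decision are assumptions for illustration, not a recommended tool.

```python
# A bare-bones end-to-end probe of one (hypothetical) key code path.
import time
import urllib.request

CHECKOUT_URL = "https://example.com/api/checkout/health"  # hypothetical endpoint
LATENCY_BUDGET_S = 1.0

def probe(url: str = CHECKOUT_URL) -> dict:
    """Hit the code path once and report latency and whether it errored."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            ok = 200 <= resp.status < 300
    except Exception:
        ok = False
    return {"latency_s": time.monotonic() - start, "error": not ok}

def user_visible_pain(result: dict) -> bool:
    return result["error"] or result["latency_s"] > LATENCY_BUDGET_S

if __name__ == "__main__":
    result = probe()
    # Page only on user-visible pain; anything subtler belongs in the secondary channel.
    print("PAGE" if user_visible_pain(result) else "ok", result)
```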

Here’s the thing.  The reason we had all those paging alerts was because we depended on them to understand our systems.

Once you make the shift to observability, once you have rich instrumentation and the ability to swiftly zoom in from high level “there might be a problem” to identifying specifically what the errors have in common, or the source of the problem — you no longer need to lean on that scattershot bunch of pagebombs to understand your systems.  You should be able to confidently ask any question of your systems, understand any system state — even if you have never encountered it before.

With observability, you debug by systematically following the trail of crumbs back to their source, whatever that is.  Those paging alerts were a crutch, and now you don’t need them anymore.

Everyone is on call && on call doesn’t suck.

I often talk about how modern systems require software ownership.  The person who is writing the software, who has the original intent in their head, needs to shepherd that code out into production and watch real users use it.  You can’t chop that up into multiple roles, dev and ops.  You just can’t.  Software engineers working on highly available systems need to be on call for their code.

But the flip side of this responsibility belongs to management.  If you’re asking everyone to be on call, it is your sworn duty to make sure that on call does not suck.  People shouldn’t have to plan their lives around being on call.  People shouldn’t have to expect to be woken up on a regular basis.  Every paging alert out of hours should be as serious as a heart attack, and this means allocating real engineering resources to keeping tech debt down and noise levels low.

And the way you get there is to first invest in observability, then delete all your paging alerts and start over from scratch.

It works.  It really does. 🌈



17 Reasons NOT To Be A Manager

Yesterday we had a super fun meetup here at Intercom in Dublin.  We split up into small discussion groups and talked about things related to managing teams and being a senior individual contributor (IC), and going back and forth throughout your career.

One interesting question that came up repeatedly was: “what are some reasons that someone might not want to be a manager?”

Fascinatingly, I heard it asked over the full range of tones from extremely positive (“what kind of nutter wouldn’t want to manage a team?!”) to extremely negative (“who would ever want to manage a team?!”).  So I said I would write a piece and list some reasons.

Point of order: I am going to focus on intrinsic reasons, not external ones.  There are lots of toxic orgs where you wouldn’t want to be a manager for many reasons — but that list is too long and overwhelming, and I would argue you probably don’t want to work there in ANY capacity.  Please assume the surroundings of a functional, healthy org (I know, I know — whopping assumption).

1. You love what you do.

Never underestimate this one, and never take it for granted.  If you look forward to work and even miss it on vacation; if you occasionally leave work whistling with delight and/or triumph; if your brain has figured out how to wring out regular doses of dopamine and serotonin while delivering ever-increasing value; if you look back with pride at what you have learned and built and achieved, if you regularly tap into your creative happy place … hell, your life is already better than 99.99% of all the humans who have ever labored and lived.  Don’t underestimate the magnitude of your achievement, and don’t assume it will always be there waiting for you to just pick it right back up again.

2. It is easy to get a new engineering job.  Really, really easy.

Getting your first gig as an engineer can be a challenge, but after that?  It is possibly easier for an experienced engineer to find a new job than anyone else on the planet. There is so much demand for this skill set that we actually complain about how annoying it is being constantly recruited!  Amazing.

It is typically harder to find a new job as a manager.  If you think interview processes for engineers are terrible (and they are, honey), they are even weirder and less predictable (and more prone to implicit bias) for managers.  So much of manager hiring is about intangibles like “culture fit” and “do I like you” — things you can’t practice or study or know if you’ve answered correctly.  And soooo much of your skill set is inevitably bound up in navigating the personalities and bureaucracies of particular teams and a particular company.  A manager’s effectiveness is grounded in trust and relationships, which makes it much less transferrable than engineering skills.

3. There are fewer management jobs.

I am not claiming it is equally trivial for everyone to get a new job; it can be hard if you live in an out-of-the-way place, or have an unusual skill, etc.  But in almost every case, it becomes harder if you’re a manager.  Besides — given that the ratio of engineers to line managers is roughly 7 to one — there will be almost an order of magnitude fewer eng manager jobs than engineering jobs.

4. Manager jobs are the first to get cut.

Engineers (in theory) add value directly to the bottom line.  Management is, to be brutally frank, overhead.  Middle management is often the first to be cut during layoffs.

Remember how I said that creation is the engineering superpower?  That’s a nicer way of saying that managers don’t directly create any value.  They may indirectly contribute to increased value over time — the good ones do — but only by working through other people as a force multiplier, mentor etc.  When times get tough, you don’t cut the people who build the product, you cut the ones whose value-added is contingent or harder to measure.

Another way this plays out is when companies are getting acquired.  As a baseline for acquihires, the acquiring company will estimate a value of $1 million per engineer, then deduct $500k for every other role being acquired.  Ouch.

5. Managers can’t really job hop.

Whereas it’s completely normal for an engineer to hop jobs every 1-3 years, a manager who does this will not get points for learning a wide range of skills; they’ll be seen as “probably difficult to work with”.  I have no data to support this, but I suspect the job tenure of a successful manager is at least 2-3x as long as that of a successful IC.  It takes a year or two just to gain the trust of everyone on your team and the adjacent teams, and to learn the personalities involved in navigating the organization.  At a large company, it may take a few times that long.  I was a manager at Facebook for 2.5 years and I still learned some critical new detail about managing teams there on a weekly basis.  Your value to the org really kicks in after a few years have gone by, once a significant part of the way things get done resides in your cranium.

6) Engineers can be little shits.

You know the type.  Sneering about how managers don’t do any “real work”, looking down on them for being “less technical”.  Basically everyone who utters the question “.. but how technical are they?” in that particular tone of voice is a shitbird.  Hilariously, we had a great conversation about whether a great manager needs to be technical or not — many people sheepishly admitted that the best managers they had ever had knew absolutely nothing about technology, and yet they gave managers coding interviews and expected them to be technical.  Why?  Mostly because the engineers wouldn’t respect them otherwise.

7.  As a manager, you will need to have some hard conversations.  Really, really hard ones.

Do you shy away from confrontation?  Does it seriously stress you out to give people feedback they don’t want to hear?  Manager life may not be for you.  There hopefully won’t be too many of these moments, but when they do happen, they are likely to be of outsized importance.  Having a manager who avoids giving critical feedback can be  really damaging, because it deprives you of the information you need to make course corrections before the problem becomes really big and hard.

8)  A manager’s toolset is smaller than you think.

As an engineer, if you really feel strongly about something, you just go off and do it yourself.  As a manager, you have to lead through influence and persuasion and inspiring other people to do things.  It can be quite frustrating.  “But can’t I just tell people what to do?” you might be thinking.  And the answer is no.  Any time you have to tell someone what to do using your formal authority, you have failed in some way and your actual influence and power will decrease.  Formal authority is a blunt, fragile instrument.

9) You will get none of the credit, and all of the blame.

When something goes well, it’s your job to push all the credit off onto the people who did the work.  But if you failed to ship, or hire, or whatever?  The responsibility is all on you, honey.

10)  Use your position as an IC to bring balance to the Force.

I LOVE working in orgs where ICs have power and use their voices.  I love having senior ICs around who model that, who walk around confidently assuming that their voice is wanted and needed in the decision-making process.  If your org is not like that, do you know who is best positioned to shift the balance of power back?  Senior ICs, with some behind-the-scenes support from managers.  For this reason, I am always a little sad when a vocal, powerful IC who models this behavior transitions to management.  If ALL of the ICs who act this way become managers, it sends a very dismaying message to the ranks — that you only speak up if you’re in the process of converting to management.

11)  Management is just a collection of skills, and you should be able to do all the fun ones as an IC.

Do you love mentoring?  Interviewing, constructing hiring loops, defining the career ladder?  Do you love technical leadership and teaching other people, or running meetings and running projects?  Any reasonably healthy org should encourage all senior ICs to participate and have leadership roles in these areas.  Management can be unbundled into a lot of different skills and roles, and the only ones that are necessarily confined to management are the shitty ones, like performance reviews and firing people.  I LOVE it when an engineer expresses the desire to start learning more management skills, and will happily brainstorm with them on next steps — get an intern? run team meetings?  there are so many things to choose from!  When I say that all engineers should try management at some point in their career, what I really mean is these are skills that every senior engineer should develop.  Or as Jill says:

12) Joy is much harder to come by.

That dopamine drip in your brain from fixing problems and learning things goes away, and it’s … real tough.  This is why I say you need to commit to a two year stint if you’re going to try management: that, plus it takes that long to start to get your feet under you and is hard on your team if they’re switching managers all the time.  It usually takes a year or two to rewire your brain to look for the longer timeline, less intense rewards you get from coaching other people to do great things.  For some of us, it never does kick in.  It’s genuinely hard to know whether you’ve done anything worth doing.

13) It will take up emotional space at the expense of your personal life.

When I was an IC, I would work late and then go out and see friends or meet up at the pub almost every night.  It was great for my dating life and social life in general.  As a manager, I feel like curling up in a fetal position and rolling home around 4 pm.  I’m an introvert, and while my capacity has increased a LOT over the past several years, I am still sapped every single day by the emotional needs of my team.

14) Your time doesn’t belong to you.

It’s hard to describe just how much your life becomes not your own.

15) Meetings.

16) If technical leadership is what your heart loves most, you should NOT be a manager.

If you are a strong tech lead and you convert to management, it is your job to begin slowly taking yourself out of the loop as tech lead and promoting others in your place.  Your technical skills will stop growing at the point that you switch careers, and will slowly decay after that.  Moreover, if you stay on as tech lead/manager you will slowly suck all the oxygen from the room.  It is your job to train up and hand over to your replacements and gradually step out of the way, period.

17) It will always be there for you later.

In conclusion

Given all this, why should ANYONE ever be a manager?  Shrug.  I don’t think there’s any one good or bad answer.  I used to think a bad answer would be “to gain power and influence” or “to route around shitty communication systems”, but in retrospect those were my reasons and I think things turned out fine.  It’s a complex calculation.  If you want to try it and the opportunity arises, try it!  Just commit to the full two year experiment, and pour yourself into learning it like you’re learning a new career — since, you know, you are.

But please do be honest with yourself.  One thing I hate is when someone wants to be a manager, and I ask why, and they rattle off a list of reasons they’ve heard that people SHOULD want to become managers (“to have a greater impact than I can with just myself, because I love helping other people learn and grow, etc”) but I am damn sure they are lying to themselves and/or me.

Introspection and self-knowledge are absolutely key to being a decent manager, and lord knows we need more of those.  So don’t kick off your grand experiment by lying to yourself, ok?

 


Friday Deploy Freezes Are Exactly Like Murdering Puppies

VOICEOVER: “Previously, on twitter …”

So, that happened.

I hadn’t seen anyone say something like this in quite a while.  I remember saying things like this myself as recently as, oh, 2016, but I thought the zeitgeist had moved on to continuous delivery.

Which is not to say that Friday freezes don’t happen anymore, or even that they shouldn’t; I just thought that this was no longer seen as a badge of responsibility and honor, but rather as a source of mild embarrassment.  (Much like the fact that you still don’t automatically restore your db backups and verify them every night.  Do you.)

So I responded with an equally hyperbolic and indefensible claim:

Now obviously, OBVIOUSLY, reassigning all your developer cycles is probably a terrible idea.  You don’t get 100x parallel efficiency if you put 100 developers on a single problem.  So I thought it was clear that this was said somewhat tongue in cheek, serious-but-not-really.  I was wrong there too.

So let me explain.

There’s nothing morally “wrong” with Friday freezes.  But it is a costly and cumbersome bandage for a problem that you would be better served to address directly.  And if your stated goal is to protect people’s off hours, this strategy is likely to sabotage that goal and cause them to waste far more time and get woken up much more often, and it stunts your engineers’ technical development on top of that.

Fear is the mind-killer.

Fear of deploys is the ultimate technical debt.  How much time does your company waste, between engineers:

  • waiting until it is “safe” to deploy,
  • batching up changes into bigger changes that are decidedly unsafe to deploy,
  • debugging broken deploys that had many changes batched into them,
  • waiting nervously to get paged after a deploy goes out,
  • figuring out if now is a good time to deploy or not,
  • cleaning up terrible deploy-related catastrophuckes

Anxiety related to deploys is the single largest source of technical debt in many, many orgs.  Technical debt, lest we forget, is not the same as “bad code”.  Tech debt hurts your people.

Saying “don’t push to production” is a code smell.  Hearing it once a month at unpredictable intervals is concerning.  Hearing it EVERY WEEK for an ENTIRE DAY OF THE WEEK should be a heartstopper alarm.  If you’ve been living under this policy you may be numb to its horror, but just because you’re used to hearing it doesn’t make it any less noxious.

If you’re used to hearing it and saying it on a weekly basis, you are afraid of your deploys and you should fix that.

If you are a software company, shipping code is your heartbeat.  Shipping code should be as reliable and sturdy and fast and unremarkable as possible, because this is the drumbeat by which value gets delivered to your org.

Deploys are the heartbeat of your company.

Every time your production pipeline stops, it is a heart attack.  It should not be ok to go around nonchalantly telling people to halt the lifeblood of their systems based on something as pedestrian as the day of the week.

Why are you afraid to push to prod?  Usually it boils down to one or more factors:

  • your deploys frequently break, and require manual intervention just to get to a good state
  • your test coverage is not good, your monitoring checks are not good, so you rely on users to report problems back to you, and the reports trickle in over days
  • recovering from a deploy gone bad can regularly cause everything to grind to a halt for hours or days, so you don’t want to even embark on a deploy without a full day of working hours ahead of you
  • your deploys are painfully slow, and take hours to run tests and go live.

These are pretty darn good reasons.  If this is the state you are in, I totally get why you don’t want to deploy on Fridays.  So what are you doing to actively fix those states?  How long do you think these emergency controls will be in effect?

The answers of “nothing” and “forever” are unacceptable.  These are eminently fixable problems, and the amount of drag they create on your engineering team and your ability to execute is the equivalent of a five-alarm fire.

Fix. That.  Take some cycles off product and fix your fucking deploy pipeline.

If you’ve been paying attention to the DORA report or Accelerate, you know that the way you address the problem of flaky deploys is NOT by slowing down or adding roadblocks and friction, but by shipping more QUICKLY.

Science says: ship fast, ship often.

Deploy on every commit.  Smaller, coherent changesets transform into debuggable, understandable deploys.  If we’ve learned anything from recent research, it’s that velocity of deploys and lowered error rates are not in tension with each other, they actually reinforce each other.  When one gets better, the other does too.

So by slowing down or batching up or pausing your deploys, you are materially contributing to the worsening of your own overall state.

If you block devs from merging on Fridays, then you are sacrificing a fifth of your velocity and overall output.  That’s a lot of fucking output.

If you do not block merges on Fridays, and only block deploys, you are queueing up a bunch of changes to all get shipped days later, long after the engineers wrote the code and have forgotten half of the context.  Any problems you encounter will be MUCH harder to debug on Monday in a muddled blob of changes than they would have been just shipping crisply, one at a time on Friday.  Is it worth sacrificing your entire Monday?  Monday-Tuesday?  Monday-Tuesday-Wednesday?

Good judgment matters more than rules.

I am not saying that you should make a habit of shipping a large feature at 4:55 pm on Friday and then sauntering out the door at 5.  For fuck’s sake.  Every engineer needs to learn and practice good technical judgment around deploy hygiene.  Like,

  • Don’t ship before you walk out the door on *any* day.
  • Don’t ship big, gnarly features right before the weekend, if you aren’t going to be around to watch them.
  • Instrument your code, and go and LOOK at the damn thing once it’s live.
  • Use feature flags and other tools that separate turning on code paths from deploys.

But you don’t need rules for this; in fact, rules actually inhibit the development of good judgment!

Most deploy-related problems are readily obvious, if the person who has the context for the change in their head goes and looks at it.

But if you aren’t looking for them, then sure — you probably won’t find out until user reports start to trickle in over the next few days.

So go and LOOK.

Stop shipping blind.  Actually LOOK at what you ship.

I mean, if it takes 48 hours for a bug to show up, then maybe you better freeze deploys on Thursdays too, just to be safe!  🙄

I get why this seems obvious and tempting.  The “safety” of nodeploy Friday is realized immediately, while the costs are felt later.  They’re felt when you lose Monday (and Tuesday) to debugging the big blob deploy.  Or they get amortized out over time.  Or you experience them as sluggish ship rates and a general culture of fear and avoidance, or learned helplessness, and the broad acceptance of fucked up situations as “normal”.

But if recovering from deploys is long and painful and hard, then you should fix that.  If you don’t tend to detect reliability events until long after the event, you should fix that.  If people are regularly getting paged on Saturdays and Sundays, they are probably getting paged throughout the night, too.  You should fix that.

On call paging events should be extremely rare.  There’s no excuse for on call being something that significantly impacts a person’s life on the regular.  None.

I’m not saying that every place is perfect, or that every company can run like a tech startup.  I am saying that deploy tooling is systematically underinvested in, and we abuse people far too much by paging them incessantly and running them ragged, because we don’t actually believe it can be any better.

It can.  If you work towards it.

Devote some real engineering hours to your deploy pipeline, and some real creativity to your processes, and someday you too can lift the Friday ban on deploys and relieve your oncall from burnout and increase your overall velocity and productivity.

On virtue signaling

Finally, I heard from an alarming number of people who admitted that Friday deploy bans were useless or counterproductive, but who supported them anyway as a purely symbolic gesture to show that they supported work/life balance.

This makes me really sad.  I’m … glad they want to support work/life balance, but surely we can come up with some other gestures that don’t work directly counter to that goal.

Recovery: building a healthy deploy culture

Ways to begin recovering from a toxic deploy culture:

  • Have a deploy philosophy, make sure everybody knows what it is.  Be consistent.
  • Build and deploy on every set of committed changes.  Do not batch up multiple people’s commits into a deploy.
  • Train every engineer so they can run their own deploys, if they aren’t fully automated.  Make every engineer responsible for their own deploys.
  • (Work towards fully automated deploys.)
  • Every deploy should be owned by the developer who made the changes that are rolling out.  Page the person who committed the change that triggered the deploy, not whoever is oncall.
  • Set expectations around what “ownership” means.  Provide observability tooling so they can break down by build id and compare the last known stable deploy with the one rolling out.
  • Never accept a diff if there’s no explanation for the question, “how will you know when this code breaks?  how will you know if the deploy is not behaving as planned?”  Instrument every commit so you can answer this question in production.
  • Shipping software and running tests should be fast.  Super fast.  Minutes, tops.
  • It should be muscle memory for every developer to check up on their deploy and see if it is behaving as expected, and if anything else looks “weird”.
  • Practice good deploy hygiene using feature flags.  Decouple deploys from feature releases.  Empower support and other teams to flip flags without involving engineers.
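As a sketch of two of the points above — tagging every event with the build id so you can compare the deploy rolling out against the last known stable one, and gating new code paths behind a flag so turning a feature on is a separate decision from deploying the code — here is a minimal Python illustration.  The names (BUILD_ID, new_pricing_engine, the print stand-in for an observability client) are hypothetical.

```python
# A sketch of build-id instrumentation plus a feature flag; names are made up.
import os

BUILD_ID = os.environ.get("BUILD_ID", "dev")   # stamped by your CI pipeline

def emit_event(event: dict) -> None:
    """Send an instrumentation event, always annotated with the build id."""
    event["build_id"] = BUILD_ID
    print(event)  # stand-in for your observability client

def flag_enabled(name: str, flags: dict) -> bool:
    """Read a flag from whatever store support/ops can flip without a deploy."""
    return bool(flags.get(name, False))

def handle_checkout(request: dict, flags: dict) -> str:
    if flag_enabled("new_pricing_engine", flags):
        result = "priced-with-new-engine"   # new code path, shipped dark
    else:
        result = "priced-with-old-engine"
    emit_event({"name": "checkout", "result": result})
    return result

# The new engine can ship on Friday with the flag off, and be turned on
# Monday (by support, even) without another deploy.
print(handle_checkout({"cart": []}, flags={"new_pricing_engine": False}))
```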

Each deploy should be owned by the developer who made the code changes.  But your deploy pipeline needs to have a team that owns it too.  I recommend putting your most experienced, senior developers on this problem to signal its high value.

You can find more tips for boring deploys in my piece on why shipping software should not be scary.

Good teams ship often.

Ultimately, I am not dogmatic about Friday deploys.  Truly, I’m not.  If that’s the only lever you have to protect your time, use it.  But call it and treat it like the hack it is.  It’s a gross workaround, not an ideal state.

Don’t let your people settle into the idea that it’s some kind of moral stance instead of a butt-ugly hack.  Because if you do you will never, ever get rid of it.

Remember: a team’s maturity and efficiency can be represented by how long it takes to get their shit into users’ hands after they write it.  Ship it fast, while it’s still fresh in your developers’ heads.  Ship one change set at a time, so you can swiftly debug and revert them.  I promise your lives will be so much better.  Every step helps.  ❤

charity.



On pain, careers, and doing things the hard way.

Part 1

Seven years ago I was working on backend infra for mobile apps at Parse, resenting MongoDB and its accursed single write lock per replica with all my dirty, blackened soul.  That’s when Miles Ward asked me to give a customer testimonial for MongoDB at AWS re:Invent.

It was my first time EVER speaking in public, and I had never been more terrified.  I have always been a writer, not a talker, and I was pathologically afraid of speaking in public, or even having groups of people look at me.  I scripted every word, memorized my lines, even printed it all out just in case my laptop didn’t work.  I had nightmares every night.  For three months I woke up every night in a cold sweat, shaking.

And I bombed, completely and utterly.  The laptop DIDN’T work, my limbs and tongue froze, I was shaking so badly I could hardly read my printout, and after I rushed through the last sentences I turned and stumbled robotically off the stage, fully unaware that people were raising their hands and asking questions.  I even tripped over the microphone cord in my haste to escape the stage.

Afterwards I burned with unpleasantries — fear, anger, humiliation, rage at being so bad at anything.  It was excruciating.  For the next two years I sought out every opportunity I could get to talk at a meetup, conference, anything.  I got a prescription for propranolol to help manage the physical symptoms of panic.   I gave 17 more talks that year, spending most nights and weekends working on them or rehearsing, and 21 the year after that.  I hated every second of it.

I hated it, but I burned up my fear and aversion as fuel.  Until around 18 months later, when I realized that I no longer had nightmares and had forgotten to pack my meds for a conference.  I brute forced my way through to the other side, and public speaking became just an ordinary skill or a tool like any other.

Part 2

I was on a podcast last week where the topic was career journeys.  They asked me what piece of career advice I would like to give to people.  I promptly said that following your bliss is nice, but I think it’s important to learn to lean into pain.

“Pain is nature’s teacher,” I said.  Feedback loops train us every day, mostly unconsciously.  We feel aversion for pain, and we enjoy dopamine hits, and out of those and other brain chemicals our habits are made.  All it takes is a little tolerance for discomfort and some conscious tweaking of those feedback loops, and you can train yourself to achieve big things without even really trying.

But then I hesitated.  Yes, leaning in to pain has served me well in my career.  But that is not the whole story; it leaves out some important truths.  It has also hurt me and held me back.

Misery is not a virtue.  Pain is awful.  That’s why it’s so powerful and primal.  It’s a pre-conscious mechanism, an acute response that kicks in long before your conscious mind.  Even just the suggestion of pain (or memory of past trauma) will train you to twist and contort around to avoid it.

When you are in pain, your horizons shrink.  Your vision narrows, you curl inward. You have to expend enormous amounts of energy just moving forward through the day inch by inch.

Everything is hard when you’re in pain.  Your creative brain shuts down.  Basic life functions become impossible tests.  You have to spend so much time compensating for your reduced capacity that learning new things is nearly impossible.  You can’t pick up on subtle signals when your nerves are screaming in agony.  And you grow numb over time, as they die off from sheer exhaustion.

Part 3

I am no longer the CEO of Honeycomb.

I never wanted to be CEO; I always fiercely wanted a technical role.  But it was a matter of company survival, and I did my best.  I wasn’t a great CEO, although we did pretty well at the things I am good at or care about.  But I couldn’t expand past them.

I hated every second of it.  I cried every single day for the first year and a half.  I tried to will myself into loving a role I couldn’t stand, tried to brute force my way to success like I always do.  It didn’t get better.  My ability to be present and curious and expansive withered.  I got numb.

Turns out not every problem can be powered through on a high pain tolerance.  The collateral damage starts to rack up.  Sometimes the only way to succeed is to redefine success.

Pain is a terrific teacher, but pain is an acute response.  Chronic pain will hijack your reward pathways, your perspective, your relationships, and every other productive system and leave them stunted.

Leaning in to pain can be powerful if you have the agency and ability to change it, or practice it to mastery, or even just adapt your own emotional responses to it.  If you don’t or you can’t, leaning in to pain will kill you.  Having the wisdom to know the difference is everything.  Or so I’m learning.

From here on out I’ll be in the CTO seat.  I don’t know what that even means yet, but I guess we’ll find out.  Stay tuned.  ❤

charity



Outsource Your O11y: Now Roll It Out And Keep Them Happy (part 3/3)

This is part three of a three-part series of guest posts:

  1. How To Be A Champion, on how to choose a third-party vendor and champion them successfully to your security team.  (George Chamales)
  2. Get Aligned With Security, how to work with your security team to find the best possible outcome for all sides (Lilly Ryan)
  3. Now Roll It Out And Keep Them Happy, on how to operationalize your service by rolling out the integration and maintaining it — and the relationship with your security team — over the long run (Andy Isaacson)

All this pain will someday be worth it.  🙏❤️  charity + friends


“Now Roll It Out And Keep Them Happy”

This is the third in a series of blog posts; previously we analyzed the security challenges of using a third party service, and we worked together with the security team to build empathy to deliver the project.  You might want to read those first, since we are going to build on a lot of the ideas there to ship and maintain this integration.

Ready for launch

You’ve convinced the security team and other stakeholders, you’ve gotten the integration running, you’re getting promising results from dev-test or staging environments… now it’s time to move from proof-of-concept to full implementation.  Depending on your situation this might be a transition from staging to production, or it might mean increasing a feature flipper flag from 5% to 100%, or it might mean increasing coverage of an integration from one API endpoint to cover your entire developer footprint.
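(As an aside, one common way that kind of 5%-to-100% feature flipper works is deterministic bucketing by user id, so the same users stay in the rollout as the percentage ramps up.  A minimal sketch, not any particular vendor’s or team’s implementation:)

```python
# Deterministic percentage rollout sketch; feature and user ids are made up.
import hashlib

def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """True if this user falls inside the rollout percentage for the feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100          # stable bucket in [0, 100)
    return bucket < percent

# Ramping from 5% to 100% only ever adds users; nobody flaps in and out.
print(in_rollout("user-42", "vendor-integration", 5))
print(in_rollout("user-42", "vendor-integration", 100))  # always True
```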

Taking into account Murphy’s Law, we expect that some things will go wrong during the rollout.  Perhaps as coverage expands, a developer realizes that the schema designed to handle the app’s event mechanism can’t represent a scenario, requiring a redesign or a hacky solution.  Or perhaps the metrics dashboard shows elevated error rates from the API frontend, and while there’s no smoking gun, the ops oncall decides to roll back the integration Just In Case it’s causing the incident.

This gives us another chance to practice empathy — while it’s easy, wearing the champion hat, to dismiss any issues found by looking for someone to blame, ultimately this poisons trust within your organization and will hamper success.  It’s more effective, in the long run (and often even in the short run), to find common ground with your peers in other disciplines and teams, and work through to solutions that satisfy everybody.

Keeping the lights on

In all likelihood, as the integration succeeds, the team will rapidly develop experts and expertise, as well as idiomatic ways to use the product.  Let the experts surprise you; folks you might not expect can step up when given a chance.  Expertise flourishes when given guidance and goals; as the team becomes comfortable with the integration, explicitly recognize a leader or point person for each vendor relationship.  Having one person explicitly responsible for a relationship lets them pay attention to vendor emails and updates, and avoids the tragedy of the “but I thought *you* were” commons.  This Integration Lead is also a center of knowledge transfer for your organization — they won’t know everything or be able to help every user come up to speed, but they can empower the local power users in each team to ramp up their teams on the integration.

As comfort grows you will start to consider ways to change your usage, for example growing into new kinds of data.  This is a good time to revisit that security checklist — does the change increase PII exposure to your vendor?  Would the new data lead to additional requirements such as per-field encryption?  Don’t let these security concerns block you from gaining valuable insight using the new tool, but do take the chance to talk it over with your security experts as appropriate.

Throughout this organic growth, the Integration Lead remains core to managing your changing profile of usage of the vendor they shepherd; as new categories of data are added to the integration, the Lead has responsibility to ensure that the vendor relationship and risk profile are well matched to the needs that the new usage (and presumably, business value) is placing on the relationship.

Documenting the Integration Lead role and responsibilities is critical. The team should know when to check in, and writing it down helps it happen.  When new code has a security implication, or a new use case potentially amplifies the cost of an integration, bringing the domain expert in will avoid unhappy surprises.  Knowing how to find out who to bring in, and when to bring them in, will keep your team getting the right eyes on their changes.

Security threats and other challenges change over time, too.  Collaborating with your security team so that they know what systems are in use helps your team take note of new information that is relevant to your business. A simple example is noting when your vendors publish a breach announcement, but more complex examples happen too — your vendor transitions cloud providers from AWS to Azure and the security team gets an alert about unexpected data flows from your production cluster; with transparency and trust such events become part of a routine process rather than an emergency.

It’s all operational

Monitoring and alerting is a fact of operations life, and this has to include vendor integrations (even when the vendor integration is a monitoring product).  All of your operations best practices are needed here — keep your alerts clean and actionable so that you don’t develop pager fatigue, and monitor performance of the integration so that you don’t get blindsided by a creeping latency monster in your APIs.

Authentication and authorization are changing as the threat landscape evolves and industry moves from SMS verification codes to U2F/WebAuthn.  Does your vendor support your SSO integration?  If they can’t support the same SSO that you use everywhere else and can’t add it — or worse, look confused when you mention SSO — that’s probably a sign you should consider a different vendor.

A beautiful sunset

Have a plan beforehand for what needs to be done should you stop using the service.  Got any mobile apps that depend on APIs that will go away or start returning permission errors?  Be sure to test these scenarios ahead of time.

What happens at contract termination to data stored on the service?  Do you need to explicitly delete data when ceasing use?

Do you need to remove integrations from your systems before ending the commercial relationship, or can the technical shutdown and business shutdown run in parallel?

In all likelihood these are contingency plans that will never be needed, and they don’t need to be fully fleshed out to start, but a little bit of forethought can avoid unpleasant surprises.

Year after year

Industry best practice and common sense dictate that you should revisit the security questionnaire annually (if not more frequently). Use this chance to take stock of the last year and check in — are you getting value from the service?  What has changed in your business needs and the competitive landscape? 

It’s entirely possible that a new year brings new challenges, which could make your current vendor even more valuable (time to negotiate a better contract rate!) or could mean you’d do better with a competing service.  Has the vendor gone through any major changes?  They might have new offerings that suit your needs well, or they may have pivoted away from the features you need. 

Check in with your friends on the security team as well; standards evolve, and last year’s sufficient solution might not be good enough for new requirements.

 

Andy thinks out loud about security, society, and the problems with computers on Twitter.


 

❤️ Thanks so much for reading, folks.  Please feel free to drop any complaints, comments, or additional tips to us in the comments, or direct them to me on twitter.

Have fun!  Stay (a little bit) Paranoid!!

— charity



Outsource Your O11y: Get Aligned With Security (part 2/3)

This is part two of a three-part series of guest posts:

  1. How To Be A Champion, on how to choose a third-party vendor and champion them successfully to your security team.  (George Chamales)
  2. Get Aligned With Security, how to work with your security team to find the best possible outcome for all sides (Lilly Ryan)
  3. Now Roll It Out And Keep Them Happy, on how to operationalize your service by rolling out the integration and maintaining it — and the relationship with your security team — over the long run (Andy Isaacson)

All this pain will someday be worth it.  🙏❤️  charity + friends


“Get Aligned With Security”

by Lilly Ryan

If your team has decided on a third-party service to help you gather data and debug product issues, how do you convince an often overeager internal security team to help you adopt it?

When this service is something that provides a pathway for developers to access production data, as analytics tools often do, making the case for access to that data can screech to a halt at the mention of the word “production”. Progressing past that point will take time, empathy, and consideration.

I have been on both sides of the “adopting a new service” fence: as a developer hoping to introduce something new and useful to our stack, and now as a security professional who spends her days trying to bust holes in other people’s setups. I understand both sides of the sometimes-conflicting needs to both ship software and to keep systems safe.  

This guide has advice to help you solve the immediate problem of choosing and deploying a third-party service with the approval of your security team.  But it also has advice for how to strengthen the working relationship between your security and development teams over the longer term. No two companies are the same, so please adapt these ideas to fit your circumstances.

Understanding the security mindset

The biggest problems in technology are never really about technology, but about people. Seeing your security team as people, and understanding where they are coming from, will help you build the kind of empathy that makes both sides want to help each other get what they want rather than block each other.

First, understand where your security team is coming from. Development teams need to build features, improve the product, understand and ship good code. Security teams need to make sure you don’t end up on the cover of the NYT for data breaches, that your business isn’t halted by ransomware, and that you’re not building your product on a vulnerable stack.

This can be an unfamiliar frame of mind for developers.  Software development tends to attract positive-minded people who love creating things and are excited about the possibilities of new technology. Software security tends to attract negative thinkers who are skilled at finding all the flaws in a system.  These are very different mentalities, and the people who occupy them tend to have very different assumptions, vocabularies, and worldviews.   

But if you and your security team can’t share the same worldview, it will be hard to trust each other and come to agreement.  This is where practicing empathy can be helpful.

Before approaching your security team with your request to approve a new vendor, it can help to run some practice exercises: put yourselves in their shoes and deliberately cultivate a negative-thinking mindset so you can anticipate how they may react. Consider not just the objective risk to the business and the compliance headaches the request might cause, but also which arguments will resonate with them and what emotional reactions they might have.

My favourite exercise for getting teams to think negatively is what I call the Land Astronaut approach.

The “Land Astronaut” Game

Imagine you are an astronaut on the International Space Station. Literally everything you do in space has death as a highly possible outcome. So astronauts spend a lot of time analysing, re-enacting, and optimizing their reactions to events, until it becomes muscle memory. By expecting and training for failure, astronauts use negative thinking to anticipate and mitigate flaws before they happen. It makes their chances of survival greater and their people ready for any crisis.

Your project may not be as high-stakes as a space mission, and your feet will most likely remain on the ground for the duration of your work, but you can bet your security team is regularly indulging in worst-case astronaut-type thinking. You and your team should try it, too.

The Game:

Pick a service for you and your team to game out.  Schedule an hour, book a room with a whiteboard, put on your Land Astronaut helmets.  Then tell your team to spend half an hour brainstorming about all the terrible things that can happen to that service, or to the rest of your stack when that service is introduced.  Negative thoughts only!

Brainstorm together, and start out by being as outlandish as possible (what happens if their data centre is suddenly overrun by a stampede of elephants?). Eventually you will tire of the extreme worst-case scenarios and come to consider more realistic outcomes, some of which you may not have thought of outside the structure of the activity.

After half an hour, or whenever you feel like you’re all done brainstorming, take off your Land Astronaut helmets, sift out the most plausible of the worst case scenarios, and try to come up with answers or strategies that will help you counteract them.  Which risks are plausible enough that you should mitigate them?  Which are you prepared to gamble on never happening?  How will this risk calculus change as your company grows and takes on more exposure?

Doing this with your team will allow you all to practice the negative thinking mindset together and get a feel for how your colleagues in the security team might approach this request. (While this may seem similar to threat modelling exercises you might have done in the past, the focus here is on learning to adopt a security mindset and gaining empathy for this thought process, rather than running through a technical checklist of common areas of concern.)

While you still have your helmets within reach, use your negative thinking mindset to fill out the spreadsheet from the first piece in this series.  This will help you anticipate most of the reasonable objections security might raise, and may help you include useful detail the security team might not have known to ask for.

Once you have prepared your list of answers to George’s worksheet and held a team Land Astronaut session together, you will have gone a long way toward understanding how your security team thinks.

Preparing for compromise

You’ve considered your options carefully, you’ve learned how to harness negative thinking to your advantage, and you’re ready to talk to your colleagues in security – but sometimes, even with all of these tools at your disposal, you may not walk away with all of the things you are hoping for.

Being willing to compromise and anticipating some of those compromises before you approach the security team will help you negotiate more successfully.

While your Land Astronaut helmets are still within reach, use the negative-thinking game to identify areas where you may be asked to compromise. If you’re asking for production access to this new service for observability and debugging purposes, think about which objections are likely to be raised and how you might counter or accommodate them. Consider continuing the activity with half of the team remaining in the Land Astronaut role while the other half advocates from a positive-thinking standpoint. This dynamic starts the conversations about compromise early, so that when the security team inevitably raises eyebrows, you are ready with answers.
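
For example, if the sticking point is shipping raw production data off to a third party, one compromise you might propose is scrubbing sensitive fields before events ever leave your systems.  Here is a tiny illustrative sketch of the idea in Python; the field names and the send_to_vendor() helper are made up for the example and are not part of any real vendor SDK.

```python
"""Sketch: redact sensitive fields before events leave your systems.

The field list and send_to_vendor() are placeholders for illustration only.
"""

# Fields the security team might reasonably never want to leave the building.
SENSITIVE_FIELDS = {"email", "ssn", "credit_card", "ip_address"}

def redact(event: dict) -> dict:
    """Return a copy of the event with sensitive fields masked."""
    return {
        key: "[REDACTED]" if key in SENSITIVE_FIELDS else value
        for key, value in event.items()
    }

def send_to_vendor(event: dict) -> None:
    """Stand-in for whatever client library the vendor provides."""
    print("sending:", event)

if __name__ == "__main__":
    raw = {"user_id": 42, "email": "user@example.com", "latency_ms": 131}
    send_to_vendor(redact(raw))   # only the scrubbed copy goes out
```

Even a simple allowlist or denylist like this gives the security team something concrete to review, which tends to land far better than “trust us”.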

Be prepared to consider compromises you had not anticipated, and enter into discussions with the security team with as open a mind as possible. Remember that they are balancing the priorities of not only your team but of other business and development teams as well.  If you and your security colleagues do the hard work to meet each other halfway, you are more likely to arrive at a solution that satisfies both parties.

Working together for the long term

The strategies we’ve covered so far focus on short-term outcomes.  But in this continuous-deployment, shift-left world we now live in, the best way to convince your security team of the benefits of a third-party service – or any other decision – is to have them along from day one, as part of the team.

Roles and teams are increasingly fluid and boundary-crossing, yet security remains one of the roles least likely to be included on a software development team. Even in 2019, the task of ensuring that your product and stack are secure and well-defended is often left until the end of the development cycle.  This contributes a great deal to the combative atmosphere that is so common between security and development teams.

Bringing security people into the development process much earlier builds rapport and prevents these adversarial, territorial dynamics. Consider working together to build Disaster Recovery plans and coordinating on shared production ownership.

If your organisation isn’t ready for that kind of structural shift, there are other ways to work together more closely with your security colleagues.

Try having members of your team spend a week or two embedded with the security team. You may even consider a rolling exchange – a developer for a security team member – so that developers build the security mindset, and the security team is able to understand the problems your team is facing (and why you are looking at introducing this new service).

At the very least, you should make regular time to meet with the security team, get to know them as people, and avoid springing things on them late in the project when change is hardest.

Riding off together into the sunset…?

If you’ve taken the time to get to know your security team and how they think, you’ll hopefully be able to get what you want from them – or perhaps you’ll understand why their objections were valid, and come up with a better solution that works well for both of you.

Investing in a strong relationship between your development and security teams will rarely lead to the apocalypse. Instead, you’ll end up with a better product, probably some new work friends, and maybe an exciting idea for a boundary-crossing new career in tech.

But this story isn’t over! Once you get the green light from security, you’ll need to think about how to roll your new service out safely, maintain it, and consider its full lifespan within your company.  Which leads us to part three of this series, on rolling it out and maintaining it … both your integration and your relationship with the security team.

 

Lilly Ryan is a pen tester, Python wrangler, and recovering historian from Melbourne. She writes and speaks internationally about ethical software, social identities after death, teamwork, and the telegraph. More recently she has researched the domestic use of arsenic in Victorian England, attempted urban camouflage, reverse engineered APIs, wielded the Oxford comma, and baked a really good lemon shortbread.
