“So I’ve almost got our group at work up to Step 1 in your observability maturity model, but some of the devs that I work with want to turn OFF our lovely structured logging in prod for informational-level msgs due to their legacy philosophy (‘we only log errors in prod’). The reasons given are mostly philosophical (“I’m a dev and only interested when things error out, I don’t want any other noise in prod logs”, “I don’t want to slow my app down in prod”). Help?!?”
As I was reading this, I was itching to fly out and dive into battle with Eric. I know exactly where his opinionated devs are coming from. I used to say the same things! I even wrote a whole blog post about it.
These developers have internalized a set of rules and best practices for dealing with output data, in the context of “monolith application development in the early 2000s”.
Monolithic systems assumptions
Those systems had many common constraints and assumptions, such as:
We have a monolith service, or a very small number of services. We can model the system in our heads.
Logging is done to local disk, which can impact performance
Disks are expensive
Log lines are spat out inline with execution. A poorly placed printf can take the whole system down.
Investigation is rare, and usually means a human reading error logs.
Logging is of poor utility for understanding internal states or execution paths; you should just read the code or use a debugger. (There are few or network hops between functions.)
Logging is mostly useful for detecting certain terminal crash states or connection errors.
Monolithic logging best practices
We should be very stingy in what we log
Debuggers should be used for understanding internal states of the code
Logs are a last resort and record of crash dumps. We do not expect to use log data in the course of our daily work. We assume log-related manual investigation will be infrequent and of limited utility.
These were exactly the right lessons to learn in the era of expensive hardware and monolithic repos/artifacts. Many people still work in environments like this, and follow logging best practices like these. God bless, more power to em.
Distributed systems assumptions
But more and more of us face systems that are very different.
We have many services, possibly many MANY services. A representative request will have “many” hops across “many” services and routers and proxies and meshes and storage systems.
We cannot model the system in our heads; it would be a mistake to try. We rely on tooling as the source of truth for those systems.
You may or may not have access to those services, or the systems your code runs on. There may or may not be a logging facility, or a centralized log aggregator. Your only view of the system is through the instrumentation of your code.
Disks and system resources are cheap, ephemeral, all but disposable.
Data services are similarly cheap. We can almost entirely silo application performance off from the cost of writing perf data out.
Investigation is prohibitively slow and expensive for a human to do by hand. Many of the nodes or processes we need to inspect may no longer even exist, but their past states may still be relevant to us in understanding patterns to the present time.
Investigation should usually be done distributedly, across all instantiations of your code, however many there might be — and in real time
Investigation requires computation — not just string search. We need to ask on the fly involving math and percentiles and breakdowns and group by’s. And we need access to the raw requests in order to run accurate computations — no pre-aggregates.
The hardest part isn’t usually debugging the code, it’s figuring out where is the code you need to debug. Or what the errors or outliers have in common from the perspective of the code. Fixing the code itself is often comparatively trivial, once found.
What even is ‘logging’?
What even is ‘local disk’?
This isn’t optional: at some point of complexity or scale or distributedness, it becomes necessary if you want to work with these systems.
Logs can’t help you here.
And you aren’t going to get that kind of explorable data out of loglevel:ERROR, or by chopping up your telemetry into disconnected metrics devoid of context.
You are only going to get this kind of explorable, ad hoc, computation-friendly data if you take a radically new approach to how you output and aggregate telemetry. You’re going to need to replace your log lines and log levels with a different sort of beast: arbitrarily wide structured events that describe the request and its context, one event per
request per service.
If it helps, don’t think of them as log files any more. Think of them as events. Yes, you can stash this stream in a file, but why would you? on what disk? will that work for your serverless functions too? Just stream them over the network to wherever you want to put them.
Log levels are another confusing and unnecessary artifact of yesteryear that you no longer really need. The more you think of structured events as logs, the more tempted you may be to apply the old set of best practices. So just don’t think of them as logs at all.
How to gather and structure your data
Instead of dribbling little pebbles of log effluvia throughout your code, do this. (If you’re a honeycomb user, our beelines do it all automatically for you *and* pre-propagate the blobs with everything we know of your context.)
Initialize an empty blob at the beginning, when the request first enters the service.
Stuff any and all interesting detail about the request into that blob throughout the lifetime of the request.
Any unique id, any high-cardinality variable, any headers passed in, every full query, normalized query, and query execution time; every http call out to a remote service, every http execution time; any shopping cart id, first and last name, execution time — literally anything interesting, append to blob.
Then, when the request is about to exit or error, write the blob off to honeycomb or another service or disk somewhere.
You can see immediately how this method has radically different performance implications and risks than the earlier shotgun spray approach. No more “oops i accidentally put a print line INSIDE a for loop”. The write amplification profile is compressed. Most importantly, the incremental cost of capturing more detail about the request per service is nearly zero.
And now you have the kind of structured data that you can feed into something like a columnar store, or honeycomb, and run ad hoc queries to your heart’s delight.
Distributed systems logging events best practices:
Let’s sum up. (I’m including links to other past rants on this topic):
Feed this data into a columnar store or honeycomb or similar
Now use it every day. Not just as a last resort. Get knee deep in production every single day. Explore. Ask and answer rich questions about your systems, system quality, system behavior, outliers, error conditions, etc. You will be absolutely amazed how useful it is … and appalled by what you turn up. 🙂
No more doing multi-line regexps trying to look for the same request ID or user ID doing five suspicious things in a row.
No more regexps at all, for fuck’s sake.
No more bullshit percentiles that were computed at write time by averaging over a bunch of other averages
No more having to jump around from dashboards to logs trying to vainly eyeball correlate one spike with another. No more wondering why no two tools can agree if anything even exists or not
Just gather the detail you need to ask the questions when you need them, and store it in a single source of truth. It’s that simple.
No need to shame people from learning best practices that worked perfectly well for a long time. You can either let them learn the hard way that this transformation is non optional, or you can help them learn the easy way that it’s simply much better and easier to invest in this telemetry up front. You seem like a nice enough chap, which is probably why you chose door 2. (If you wanted to get tougher about it, have a few reformed folks in to tell their horror stories. Try some ex-twitter engineers.)
The hardest part seems to be getting people to unlearn all the best practices they once learned for dealing with logs. So just don’t call it logs anymore, if that helps. Call it “structured events”.
Last night I was out with a dear friend who has been an engineering manager for a year now, and by two drinks in I was rattling off a long list things I always say to newer engineering managers.
Then I remembered: I should write a post! It’s one of my goals this year to write more long form instead of just twittering off into the abyss.
There’s a piece I wrote two years ago, The Engineer/Manager Pendulum, which is probably my all time favorite. It was a love letter to a friend who I desperately wanted to see go back to engineering, for his own happiness and mental health. Well, this piece is a sequel to that one.
It’s primarily aimed at new managers, who aren’t sure what their career options look like or how to evaluate the opportunities that come their way, or how it may expand or shrink their future opportunities.
The first fork in the manager’s path
Every manager reaches a point where they need to choose: do they want to manage engineers (a “line manager”), or do they want to try to climb the org chart? — manage managers, managers of other managers, even other divisions; while being “promoted” from manager to senior manager, director to senior director, all the way up to VP and so forth. Almost everyone’s instinct is to say “climb the org chart”, but we’ll talk about why you should be critical of this instinct.
They also face a closely related question: how technical do they wish to stay, and how badly do they care?
Are you an “engineering MANAGER” or an “ENGINEERING manager”?
These are not unlike the decisions every engineer ends up making about whether to go deep or go broad, whether to specialize or be a generalist. The problem is that both engineers and managers often make these career choices with very little information — or even awareness that they are doing it.
And managers in particular then have a tendency to look up ten years later and realize that those choices, witting or unwitting, have made them a) less employable and b) deeply unhappy.
Lots of people have the mindset that once they become an engineering manager, they should just go from gig to gig as an engineering manager who manages other engineers: that’s who they are now. But this is actually a very fragile place to sit long-term, as we’ll discuss further on in this piece.
But let’s start at to the beginning, so I can speak to those of you who are considering management for the very first time.
“So you want to try engineering management.”
COOL! I think lots of senior engineers should try management, maybe even most senior engineers. It’s so good for you, it makes you better at your job. (If you aren’t a senior engineer, and by that I mean at least 7+ years of engineering experience, be very wary; know this isn’t usually in your best interest.)
Hopefully you have already gathered that management is a career change, not a promotion, and you’re aware that nobody is very good at it when they first start.
That’s okay! It takes a solid year or two to find new rhythms and reward mechanisms before you can even begin to find your own voice or trust your judgment. Management problems look easy, deceptively so. Reasons this is hard include:
Most tech companies are absolutely abysmal at providing any sort of training or structure to help you learn the ropes and find your feet.
Even if they do, you still have to own your own careerdevelopment. If learning to be a good engineer was sort of like getting your bachelor’s, learning to be a good manager is like getting your PhD — much more custom to who you are.
It will exhaust you mentally and emotionally in the weirdest ways for much longer than you think it should. You’ll be tired a lot, and you’ll miss feeling like you’re good at something (anything).
This is because you need to change your habits and practices, which in turn will actually change who you are. This takes time. Which is why …
The minimum tour of duty as a new manager is two years.
If you really want to try being a manager, and the opportunity presents itself, do it! But only if you are prepared to fully commit to a two year long experiment.
Commit to it like a proper career change. Seek out new peers, find new heroes. Bring fresh eyes and a beginner’s mindset. Ask lots of questions. Re-examine every one of your patterns and habits and priorities: do they still serve you? your team?
Don’t even bother thinking about in terms of whether you “enjoy managing” for a while, or trying to figure out if you are are any good at it. Of course you aren’t any good at it yet. And even if you are, you don’t know how to recognize when you’ve succeeded at something, and you haven’t yet connected your brain’s reward systems to your successes. A long stretch of time without satisfying brain drugs is just the price of admission if you want to earn these experiences, sadly.
It takes more than one year to learn management skills and wire up your brain to like it. If you are waffling over the two year commitment, maybe now is not the time. Switching managers too frequently is disruptive to the team, and it’s not fair to make them report to someone who would rather be doing something else or isn’t trying their ass off.
It takes about 3-5 years for your skills to deteriorate.
So you’ve been managing a team for a couple years, and it’s starting to feel … comfortable? Hey, you’re pretty good at this! Yay!
With a couple of years under your belt as a line manager, you now have TWO powerful skill sets. You can build things, AND you can organize people into teams to build even bigger things. Right now, both sets are sharp. You could return to engineering pretty easily, or keep on as a manager — your choice.
But this state of grace doesn’t last very long. Your technical skills stop advancing when you become a manager, and instead begin eroding. Two years in, you aren’t the effective tech lead you once were; your information is out of date and full of gaps, the hard parts are led by other people these days.
More critically, your patterns of mind and habits shift over time, and become those of a manager, not an engineer. Consider how excited an engineer becomes at the prospect of a justifiable greenfield project; now compare to her manager’s glum reaction as she instinctively winces at having to plan for something so reprehensibly unpredictable and difficult to estimate. It takes time to rewire yourself back.
If you like engineering management, your tendency is to go “cool, now I’m a manager”, and move from job to job as an engineering manager, managing team after team of engineers. But this is a trap. It is not a sound long term plan. It leads too many people off to a place they never wanted to end up: technically sidelined.
Why can’t I just make a career out of being a combo tech lead+line manager?
One of the most common paths to management is this: you’re a tech lead, you’re directing ever larger chunks of technical work, doing 1x1s and picking up some of the people stuff, when your boss asks if you’d like to manage the team. “Sure!”, you say, and voila — you are an engineering manager with deep domain expertise.
But if you are doing your job, you begin the process of divesting yourself of technical leadership responsibilities starting immediately. Your own technical development should screech to a halt once you become a manager, because you have a whole new career to focus on learning.
Your job is to leverage that technical expertise to grow your engineers into great senior engineers and tech leads themselves. Your job is not to hog the glory and squat on the hard problems yourself, it’s to empower and challenge and guide your team. Don’t suck up all the oxygen: you’ll stunt the growth of your team.
But your technical knowledge gets dated, and your skills atrophy.. The longer it’s been since you worked as an engineer, the harder it will be to switch back. It gets real hard around three years, and five years seems like a tipping point.
And because so much of your credibility and effectiveness as an engineering leader comes from your expertise in the technology that your team uses every day, ultimately you will be no longer capable of technical leadership, only people management.
On being an “engineering manager” who only does people management
I mean, there’s a reason we don’t lure good people managers away from Starbucks to run engineering teams. It’s the intersection and juxtaposition of skill sets that gives engineering managers such outsize impact.
The great ones can make a large team thrum with energy. The great ones can break down a massive project into projects that challenge (but do not overwhelm) a dozen or more engineers, from new grads to grizzled veterans, pushing everyone to grow. The great ones can look ahead and guess which rocks you are going to die on if you don’t work to avoid them right now.
The great ones are a treasure: and they are rare. And in order to stay great, they regularly need to go back to the well to refresh their own hands-on technical abilities.
There is an enormous demand for technical engineering leaders — far more demand than supply. The most common hackaround is to pair a people manager (who can speak the language and knows the concepts, but stopped engineering ages ago) with a tech lead, and make them collaborate to co-lead the team. This unwieldy setup often works pretty well.
But most of those people managers didn’t want or expect to end up sidelined in this way when they were told to stop engineering.
If you want to be a pure people manager and not do engineering work, and don’t want to climb the ladder or can’t find a ladder to climb, more power to you. I don’t know that I’ve met many of these people in my life. I have met a lot of people in this situation by accident, and they are always kinda angsty and unhappy about it. Don’t let yourself become this person by accident. Please.
Which brings me to my next point.
You will be advised to stop writing code or engineering.
Everybody’s favorite hobby is hassling new managers about whether or not they’ve stopped writing code yet, and not letting up until they say that they have. This is a terrible, horrible, no-good VERY bad idea that seems like it must originally have been a botched repeating of the correct advice, which is:
Stop writing code and engineering
in the critical path
Can you spot the difference? It’s very subtle. Let’s run a quick test:
Authoring a feature? ⛔️
Covering on-call when someone needs a break? ✅
Diving on the biggest project after a post mortem? ⛔️
Code reviews? ✅
Picking up a p2 bug that’s annoying but never seems to become top priority? ✅
Insisting that all commits be gated on their approval? ⛔️
Cleaning up the monitoring checks and writing a library to generate coverage? ✅
The more you can keep your hands warm, the more effective you will be as a coach and a leader. You’ll have a richer instinct for what people need and want from you and each other, which will help you keep a light touch. You will write better reviews and resolve technical disputes with more authority. You will also slow the erosion and geriatric creep of your own technical chops.
I firmly believe every line manager should either be in the on call rotation or pinch hit liberally and regularly, but that’s a different post.
Technical Leadership track
If you love technology and want to remain a subject-matter expert in designing, building and shipping cutting-edge technical products and systems, you cannot afford to let yourself drift too far or too long away from hands-on engineering work. You need to consciously cultivate your path , probably by practicing some form of the engineer/manager pendulum.
If you love managing engineers — if being a technical leader is a part of your identity that you take great pride in, then you must keep up your technical skills and periodically invest in your practice and renew your education. Again: this is simply the price of admission. You need to renew your technical abilities, your habits of mind, and your visceral senses around creating and maintaining systems. There is no way to do this besides doing it. If management isn’t a promotion, then returning to hands-on work isn’t a demotion, either. Right?
One warning: Your company may be great, but it doesn’t exist for your benefit. You and only you can decide what your needs are and advocate for them. Remember that next time your boss tries to guilt you into staying on as manager because you’re so badly needed, when you can feel your skills getting rusty and your effectiveness dwindling. You owe it to yourself to figure out what makes you happy and build a portfolio of experiences that liberate you to do what you love. Don’t sacrifice your happiness at the altar of any company. There are always other companies.
Honestly, I would try not to think of yourself as a manager at all: you are an “engineering leader” performing a tour of duty in management. You’re pursuing a long term strategy towards being a well-respected technologist, someone who can sling code, give informed technical guidance and explain in detail customized for to anyone at any level of sophistication.
Organizational Leadership Track
Most managers assume they want to climb the ladder. Leveling up feels like an achievement, and that can feel impossible to resist.
Resist it. Or at least, resist doing it unthinkingly. Don’t do it because the ladder is there and must be climbed. Know as much as you can about what you’re in for before you decide it’s what you want.
Here are a few reasons to think critically about climbing the ladder to director and executive roles.
Your choices shrink. There are fewer jobs, with more competition, mostly at bigger companies. (Do you even like big companies?)
You basically need to do real time at a big company where they teach effective management skills, or you’ll start from a disadvantage.
Bureaucracies are highly idiosyncratic, skills and relationships may or may not transfer with you between companies. As an engineer you could skip every year or two for greener pastures if you landed a crap gig. An engineer has … about 2-3x more leeway in this regard than an exec does. A string of short director/exec gigs is a career ender or a coach seat straight to consultant life.
You are going to become less employable overall. The ever-higher continuous climb almost never happens, usually for reasons you have no control over. This can be a very bitter pill.
Your employability becomes more about your “likability” and other problematic things. Your company’s success determines the shape of your career much more than your own performance. (Actually, this probably begins the day you start managing people.)
Your time is not your own. Your flaws are no longer cute. You will see your worst failings ripple outward and be magnified and reflected. (Ditto, applies to all leaders but intensifies as you rise.)
You may never feel the dopamine hit of “i learned something, i fixed something, i did something” that comes so freely as an I.C. Some people learn to feel satisfaction from managery things, others never do. Most describe it as a very subdued version of the thrill you get from building things.
You will go home tired every night, unable to articulate what you did that day. You cannot compartmentalize or push it aside. If the project failed for reasons outside your control, you will be identified with the failure anyway.
Nobody really thinks of you as a person anymore, you turn into a totem for them to project shit on. (Things will only get worse if you hit back.) Can you handle that? Are you sure?
It’s pretty much a one-way trip.
Sure, there are compensating rewards. Money, power, impact. But I’m pointing out the negatives because most people don’t stop to consider them when they start saying they want to try managing managers. Every manager says that.
The mere existence of a ladder compels us all to climb.
I know people who have climbed, gotten stuck, and wished they hadn’t. I know people who never realized how hard it would be for them to go back to something they loved doing after 5+ years climbing the ladder farther and farther away from tech. I know some who are struggling their way back, others who have no idea how or where to start. For those who try, it is hard.
You can’t go back and forth from engineering to executive, or even director to manager, in the way you can traverse freely between management and engineering as a technologist.
I just want more of you entering management with eyes wide open. That’s all I’m saying.
If you don’t know what you want, act to maximize your options.
Engineering is a creative act. Managing engineers will require your full attentive and authentic self. You will be more successful if you figure out what that self is, and honor its needs. Try to resist the default narratives about promotions and titles and roles, they have nothing to do with what satisfies your soul. If you have influence, use it to lean hard against things like paying managers more than ICs of the same level.
It’s totally normal not to know who you want to be, or have some passionate end goal. It’s great to live your life and work your work and keep an eye out for interesting opportunities, and see what resonates. It’s awesome when you get asked to step up and opportunistically build on your successes.
If you want a sustainable career in tech, you are going to need to keep learning your whole life. The world is changing much faster than humans evolved to naturally adapt, so you need to stay a little bit restless and unnaturally hungry to succeed in this industry.
The best way to do that is to make sure you a) know yourself and what makes you happy, b) spend your time mostly in alignment with that. Doing things that make you happy give you energy. Doing things that drain you are antithetical to your success. Find out what those things are, and don’t do them.
Don’t be a martyr, don’t let your spending habits shackle you, and don’t build things that trouble your conscience.
And have fun.
Yours in inverting $(allthehierarchies),
 Important point: I am not saying you can’t pick up the skills and patience to practice engineering again. You probably can! But employers are extremely reluctant to pay you a salary as an engineer if you haven’t been paid to ship code recently. The tipping point for hireability comes long before the tipping point for learning ability, in my experience.
 It is in no one’s best interest for money to factor into the decision of whether to be a manager or not. Slack pays their managers LESS than engineers of the same level, and I think this is incredibly smart: sends a strong signal of servant leadership.
The company is growing like crazy, your engineering team keeps rising to the challenge, and you are ferociously proud of them. But some cracks are beginning to show, and frankly you’re a little worried. You have always advocated for engineers to have broad latitude in technical decisions, including choosing languages and tools. This autonomy and culture of ownership is part of how you have successfully hired and retained top talent despite the siren song of the Faceboogles.
But recently you saw something terrifying that you cannot unsee: your company is using all the languages, all the environments, all the databases, all the build tools. Shit!!! Your ops team is in full revolt and you can’t really blame them. It’s grown into an unsupportable nightmare and something MUST be done, but you don’t know what or how — let alone how to solve it while retaining the autonomy and personal agency that you all value so highly.
I hear a version of this everywhere I’ve gone for the past year or two. It’s crazy how often. I’ve been meaning to write my answer up for ages, and here it (finally) is.
First of all: you aren’t alone. This is extremely common among high-performing teams, so congratulations. Really!
There actually seems to be a direct link between teams that give engineers lots of leeway to own their technical decisions and that team’s ability to hire and retain top-tier talent, particularly senior talent. Everything is a tradeoff, obviously, but accepting somewhat more chaos in exchange for a stronger sense of individual ownership is usually the right one, and leads to higher-performing teams in the long run.
Second, there is actually already a well-trod path out of this hole to a better place, and it doesn’t involve sacrificing developer agency. It’s fairly simple! Just five short steps, which I will describe to you now.
How to build a golden path and reverse software sprawl
Assemble a small council of trusted senior engineers.
Task them with creating a recommended list of default components for developers to use when building out new services. This will be your Golden Path, the path of convergence (and the path of least resistance).
Tell all your engineers that going forward, the Golden Path will be fully supported by the org. Upgrades, patches, security fixes; backups, monitoring, build pipeline; deploy tooling, artifact versioning, development environment, even tier 1 on call support. Pave the path with gold. Nobody HAS to use these components … but if they don’t, they’re on their own. They will have to support it themselves.
Work with team leads to draw up an umbrella plan for adopting the Golden Path for their current projects as well as older production services, as much as is reasonable or possible or desirable. Come up with a timeline for the whole eng org to deprecate as many other tools as possible. Allocate real engineering time to the effort. Hell, make a party out of it!
After the cutoff date (and once things have stabilized), establish a regular process for reviewing and incorporating feedback about the blessed Path and considering any proposed changes, additions or removals.
There you go. That’s it. Easy, right??
(It’s not easy. I never said it was easy, I said it was simple. 👼🏼)
Your engineers are currently used to picking the best tool for the job by optimizing locally. What data store has a data model that is easiest for them to fit to their needs? Which language is fastest for I/O throughput? What are they already proficient in? What you need to do is start building your muscles for optimizing globally. Not in isolation of other considerations, but in conjunction with them. It will always be a balancing act between optimizing locally for the problem at hand and optimizing globally for operability and general sanity.
(Oh, incidentally, requiring an engineer to write up a proposal any time they want to use a non-standard component, and then defend their case while the council grills them in person — this will be nothing but good for them, guaran-fucking-teed.)
Let’s go into a bit more detail on each of the five points. But quick disclaimer: this is not a prescription. I don’t know your system, your team, your cultural land mines or technical interdependencies or anything else about your situation. I am just telling stories here.
1. Assemble your council
Three is a good number for a council. More than that gets unwieldy, and may have trouble reaching consensus. Less than three and you run into SPOFs. You never want to have a single person making unilateral decisions because a) the decision-making process will be weaker, b) it sets that person up for too much interpersonal friction, and c) it denies your other engineers the opportunity to practice making these kinds of decisions.
Your council members need technical breadth more than depth, and should be widely respected by engineers.
At least one member should have a long history with the company so they know lots of stupid little details about what’s been tried before and why it failed.
At least one member should be deeply versed in practical data and operability concerns.
They should all have enough patience and political skill to drive consensus for their decisions. Absolutely no bombthrowers.
If you’re super lucky, you just tap the three senior technologists who immediately come to mind … your mind and everyone else’s. If you don’t have this kind of automatic consensus, you may want to let teams or orgs nominate their own representative so they feel they have some say.
2. Task the council with defining a Golden Path
Your council cannot vanish for a week and then descend from the mountain lugging lists engraved on stone tablets. The process of discovery and consensus is what validates the result.
The process must include talking to and gathering feedback from your engineers, talking to experts outside the company, talking to teams at other companies who are farther along using that technology, coming up with detailed pro/con lists and reasons for their choices. Maybe sometimes it includes prototyping something or investigating the technical depths … but yeah no mostly it’s just the talking.
You need your council members to have enough political skill to handle these conversations deftly, building support and driving consensus through the process. Everybody doesn’t have to love the outcome, but it shouldn’t be a *surprise* to anyone by the end.
3. Know where you’re going
Your council should create a detailed written plan describing which technologies are going to be supported … and a stab at what “supported” means. (Ask the experts in each component what the best practices are for backups, versioning, dependency management, etc.)
You might start with something like this:
* Backend lang: Go 1.11 ## we will no longer be supporting
backend scripting languages
* Frontend lang: ReactJS v 16.5
* Primary db: Aurora v 2.0 ## Yes, we know postgres is "better",
but we have many mysql experts and 0 pg experts except the one guy
who is going to complain about this. You know who you are.
* Deploy pipeline: github -> jenkins + docker -> S3 -> custom k8s
* Message broker: kafka v 2.10, confluent build
* Mail: SES
* .... etc
Circulate the draft regularly for feedback, especially with eng managers. Some team reorganization will probably be necessary to bear the new weight of your support specifications, and managers will need some lead time to wrangle this.
This is also a great time to reconceive of the way on call works at your company. But I am not going to go into all that here.
4. Set a date, draft a plan: go!
Get approval from leadership to devote a certain amount of time to consolidating your stack and paying down a lump sum of tech debt. It depends on your stage of decay, but a reasonable amount of time might be “25% of engineering time for three months“. Whatever you agree to, make sure it’s enough to make the world demonstrably better for the humans who run it; you don’t want to leave them with a tire fire or you’ll blow your credibility.
The council and team leads should come up with a rough outer estimate for how long it would take to rewrite everything and move the whole stack on to the Golden Stack. (It’s probably impossible and/or would take years, but that’s okay.) Next, look for the quick wins or swollen, inflamed pain points.
If you are running two pieces of functionally similar software, like postgres and mysql, can you eliminate one?
If you are managing something yourself that AWS could manage for you (e.g. postfix instead of SES, or kafka instead of kinesis), can you migrate that?
If you are managing anything yourself that is not core to your business value, in fact, you should try to not manage it.
If you are running any services by hand on an AWS instance somewhere, could you try using a service?
If you are running your own monitoring software, etc … can you not?
If you have multiple versions of a piece of software, can you upgrade or consolidate on one version?
The hardest parts are always going to be the ones around migrating data or rewriting components. Not everything is worth doing or can afford to be done in the time span of your project time, and that’s okay.
Next, brainstorm up some carrots. Can you write templates so that anybody who writes a service using your approved library, magically gets monitoring checks without having to configure anything? Can you write a wrapper so they get a bunch of end-to-end tests for free? Anything you can do to delight people or save them time and effort by using your preferred components is worth considering.
(By the way, if you don’t have any engineers devoted to internal tooling, you’re probably way overdue at this point.)
Pay down as much debt as you can, but be pragmatic: it’s better to get rid of five small things than one large thing, from a support perspective. Your main goal is to shrink the number of types of software your team has to support, particularly databases.
Do look for ways to make it fun, like … running a competition to see who can move the most tools to AWS in a week, or throwing a hack week party, or giving dorky prizes like trophies that entitle you to put your manager on call instead of you for a day, etc.
5. Make the process sustainable
After your target date has come and gone, you probably want to hold a post mortem retrospective and do lots of listening. (Well — first might I recommend a bubble bath and a bottle of champagne? But then a post mortem.)
Nothing is ever fixed forever. The company’s needs are going to expand and contract, and people will come and go, because change is the only constant. So you need to bake some flex into your system. How are you going to handle the need for changes to the Golden Path? Monthly discussions? An email list? Quarterly meetings with a formal agenda? I’ve seen people do all of these and more, it doesn’t really matter afaict.
Nobody likes a cabal, though, so the original council should gradually rotate out. I recommend replacing one person at a time, one per quarter, and rotating in another senior engineer in their place. This provides continuity while giving others a chance to learn these technical and political skills.
In the end, engineers are still free to use any tool or component at any time, just like before, only now they are solely responsible for it, which puts pressure on them not to do it unless REALLY necessary. So if someone wants to propose adding a new tool to the default golden path, they can always add it themselves and gain some experience in it before bringing it to the council to discuss a formal place for it.
That’s all folks
See, wasn’t that simple?
(It’s never simple.)
I dearly wish more people would write up their experiences with this sort of thing in detail. I think engineering teams are too reluctant to show their warts and struggles to the world — or maybe it’s their executives who are afraid? Dunno.
Regardless, I think it’s actually a highly effective recruiting tool when teams aren’t afraid to share their struggles. The companies that brag about how awesome they are are the ones who come off looking weak and fragile. Whereas you can always trust the ones who are willing to laugh about all the ways they screwed up. Right?
In conclusion, don’t feel like an asshole for insisting on some process here. There should be friction around adding new components to your stack. (Add in haste, repent at leisure, as they say.) Anybody who argues with you probably needs to be exposed to way, way more of the support load for that software. That’s my professional opinion.
Anyway. You win or you die. Good luck with your sprawl.
On Monday I gave a talk at DOES18 called “All the World’s a Platform”, where I talked about a bunch of the lessons learned by using and abusing and running and building platforms at scale.
I promised to do a blog post with the takeaways, so here they are.
Platform Commandment #1: Any time you have to think about one particular user, you have failed in some way. It doesn’t scale. Just a few one-offs a day will drag you down and drown your forward momentum.
Corollary: you will always have to do this every day. Solution: turn one-offs into a support problem, not an engineering problem.
Platform Commandment #2: keep your critical path as small and independent as possible. Have explicit tiers of importance. You cannot care about everything equally, sacrifices must be made.
Example: at Parse the core API was tier 1, push was tier 2, website was somewhere down around tier 10. We always knew what to bring up and care about first.
Platform Commandment #3: It is the job of the platform to protect itself at all costs, including at the expense of your app.
Platform Commandment #4: Remember that your platform is a magical black box to your users. You can’t expect them to behave reasonably without feedback loops and a rich mental model. Help them out — esp your super-users. It will save you time if you can help them help themselves.
Platform Commandment #5: Always expose a visible request id, shard id, uuid, trace id, any other relevant diagnostic information in user-visible errors. Up to the point where it reveals too much exploitable information about your service, which is probably much farther than you think. Poorly obfuscated infrastructure decisions are usually less of a threat to your business than befuddled users are.
Platform Commandment #6: Your observability must center your users’ perspective, not your own. The health of the system doesn’t matter. The health of every request, and every high-cardinality grouping of requests — those are what matter.
You must be able to care about and inspect the perf and quality from the perspective of every single application and/or user and their users, as richly as though theirs was the *only* application.In real-time.
Dashboards are practically useless unless you can drill down into them. Top-10 lists are useless — your biggest customers may not be your most important customers.
Solution: Invest in tooling (like Honeycomb) that lets you slice and dice on dimensions of arbitrary cardinality, so you can do things like a) break down by one uuid out of millions, b) break down by endpoint, latency percentile, raw query, data store, etc — to see what the experience actually looks like for that user, not for a high level aggregate like a dashboard.
Platform Commandment #7: Use end-to-end checks to traverse all the key code paths and architecture paths.
You will be tempted to disable them because they seem flappy and flaky and need to be fixed.But this is actually what your users are suffering through every day they use your platform. Don’t disable them, fix them.
Platform Commandment #8: Invest early in every kind of throttle, blacklist, velvet rope, in-flight rewrite, custom url/error responder, content inspection, etc … both partial and total, for every slice of events or users. You will need all these fine-grained controls to keep your platform alive for 99.9% of users while you debug the .1% who are outliers and bad actors.
Platform Commandment #9: And use a multi-threaded language ffs.
Platform Commandment #10: USE YOUR OWN PLATFORM. For work, if possible. Feel the pain that you inflict on others.
Bonus Commandment: all cotenancy isolation guarantees are bullshit**
“How can I learn to be a better manager? I have no idea what I’m doing.”
“I’m a tech lead, and responsible for shipping large projects, but all of my power is informal. How do I get people to listen to me and do what I need them to?”
“As a manager, I don’t feel safe talking to anyone about my work problems. What if it gets passed around as gossip?”
Leadership is such a weird thing. Leadership is something every one of us does more and more of as we get older and more experienced, but mostly you learn leadership lessons from trial and error and failing a lot. Which is too bad when you’re doing something you really care about, for the first time.
(Like starting a company.)
I’ve read books on leadership. I’ve been semi-consensually subjected to management training, I’ve had coaches, I’ve tried therapy and mentors. Most of this has been impressively (and expensively) unhelpful.
There’s only one thing that has reliably accelerated my development as a leader or manager, and that is forming bonds and swapping stories with my peers.
Stories are power tools.
A story is a tool. The more stories you have about how other people have solved a problem like yours, the more tools you have.
People are very complicated puzzles, and the more tools you have the more likely you are to find a tool that works.
Unlike management books which speak in abstractions and generalities, stories are real and specific. When you have the storyteller in front of you, you can drill down and find out more about how the situation was like your own or not, and what they wish they’d done in retrospect.
Details matter. Context matters.
Sometimes all you really need is a sympathetic ear to listen and make murmuring noises of encouragement while you work it out yourself out loud. Sometimes they have grappled with a similar situation and can tell you how it all worked out, what they wish they’d known. Sometimes they will cut you off and tell you to quit feeling sorry for yourself or sabotaging yourself.. if you’re lucky.
Peers?? Why not a ‘mentor’?
No insult meant to anyone who gets a lot out of mentoring, but it isn’t really my bag. I’ve always had.. let’s say issues with authority. Which is a nice way of saying “never met a power structure I didn’t simultaneously want to crush and invert”. So I prefer the framing of “peers” over even the relatively tame hierarchy of the mentor-mentee relationship.
I mean, one-way relationships are fucked up. Lots of my peers are more junior than me and some are more senior, yet somehow we all manage to be givers as well as takers. And if you’re both giving support and receiving it, then what the fuck do you need different roles like mentor and mentee?
I don’t want to be someone’s mentor. I want to be their friend and to sometimes be helpful. I don’t want to be someone’s “mentee” either, that makes me feel like their charity case (ha ha).
But friends and peers? Those just make my life better and awesomer.
From each according to their ability, to each according to their need.
The first year or two of honeycomb, I had a small list of friends who I got dinner or drinks with once a month like clockwork. Most of them were founders or execs or had been at one point, so they knew how depressed I was without needing to say so.
They listened to my stories (even though I was terrible company), and shared plenty of their own. They just kept showing up, reminding me to sleep, asking if they could help, not taking it personally whatever state I was in.
These friendships carried me through some dark times. When Christine’s role required her to level up at leadership skills, I encouraged her to get some peers too. And that’s when I began to realize some of the limitations of the 1×1 model: it’s very time consuming, and doesn’t scale well.
But hey! scaling problems are fun. 😀 I decided to pull together a peer group where people could come together and give and get support all at the same time.
(Actually two groups. One for me, one for Christine, so we could complain about each other in proper peace and privacy.)
Practicing vulnerability, establishing intimacy.
It took some time to assemble the right groups, but then we met weekly for 6 weeks straight, and after that roughly once a month for a year. The six starter weeks were intended to help us practice vulnerability and establish intimacy in a compressed time frame.
Last week was our one-year birthday.
There’s something sterile about management books and leadership material, something that makes it hard for me to emotionally connect my problems to the solutions they preach. I want advice from someone who knows me in all my strengths and weaknesses, who knows what advice I can take and perform authentically and what I can’t. Context matters. Who we are matters.
As word has started to get around about the group, sometimes people ask me about joining or how to form a group of their own. Turns out lots of people are hungry to get better at leadership, and there are precious few resources.
That’s why I decided to write up and publish my notes. Everything I learned along the way about how to run a tech leadership skill swap — the logistics, the facilitation, the homework, the ground rules. Who to invite. Recommended reading lists.
(It’s a little rough, but I positively cannot spare any more time.)
This shit is hard. You need a posse.
Would you like to run a tech leads skill swap? Please tell me if you do, I would love to know! I’m happy to help you get started with a phone call, if you want.
All I ask is that you try to pull together a posse that’s at least 50% women, queers, and other marginalized folks.
Good luck. ~charity
 I take a pretty expansive view of leadership. For example, an intern might exercise leadership in the vaunted area of database backups — just by volunteering to own backups, reliably performing said backups and serving as a point of coordination and education for how we do backups here at $corp.
If you have expertise and people rely on you for it, this is a legit form of influence and power … in other words, that’s leadership.
 A HUGE thanks to Rachel Chalmers for scribing a first draft of these notes, and to Kris for running the other group and contributing the homework sheet and stories of a related group at twitter.
On twitter this week, @srhtcn noted that “Many incidents happen during or right after release” and asked for advice on ways to fix this.
And he’s right! Rolling out new software is the proximate cause for the overwhelming majority of incidents, at companies of all sizes. Upgrading software is both a necessity and a minor insanity, considering how often it breaks things.
But it’s still risky. And most issues are still caused by humans and our pesky need for “improvements”. So what can be done?
It’s not ok for software releases to be scary and hazardous
First of all: If releasing is risky for you, you need to fix that. Make this a priority. Track your failures, practice post mortems, evaluate your on call practices and culture. Know if you’re getting better or worse. This is a project that will take weeks if not months until you can be confident in the results.
You have to fix it though, because these things are self-reinforcing. If shipping changes is scary and fraught, people will do it less and it will get even MORE scary and treacherous.
Likewise, if you turn it into a non-cortisol inducing event and set expectations, engineers will ship their code more often in smaller diffs and therefore break the world less.
Fixing deploys isn’t about eliminating errors, it’s about making your pipeline resilient to errors. It’s fundamentally about detecting common failures and recovering from them, without requiring human intervention.
Value your tools more
As an short term patch, you should run deploys in the mornings or whenever everyone is around and fresh. Then take a hard look at your deploy pipeline.
In too many organizations, deploy code is a technical backwater, an accumulation of crufty scripts and glue code, forked gems and interns’ earnest attempts to hack up Capistrano. It usually gives off a strong whiff of “sloppily evolved from many 2 am patches with no code review”.
This is insane. Deploy software is the most important software you have. Treat it that way: recruit an owner, allocate real time for development and testing, bake in metrics and track them over time.
If it doesn’t have an owner, it will never improve. And you will need to invest in frequent improvements even after you’re over this first hump.
Signal high organizational value by putting one of your best engineers on it.
Recruit help from the design side of the house as well. The “right” thing to do must be the fastest, easiest thing to do, with friendly prompts and good docs. No “shortcuts” for people to reach for at the worst possible time. You need user research and design here.
Track how often deploys fail and why. Managers should pay close attention to this metric, just like the one for people getting interrupted or woken up, and allocate time to fixing this early whenever it sags. Before it gets bad.
Allocate real time for development, testing, and training — don’t expect the work to get shoved into people’s “spare time” or post mortem cleanup time. Make sure other managers understand the impact of this work and are on board. Make this one of your KPIs.
In other words, make deploy tools a first class citizen of your technical toolset. Make the work prestigious and valued — even aspirational. If you do performance reviews, recognize the impact there.
(Btw, “how we hardened our deploys” is total Velocity-bait (&& other practitioner conferences) as well as being great for recruiting and general visibility in blog post form. People love these stories; there definitely aren’t enough of them.)
Turn software engineers into software owners
The canonical CI/CD advice starts with “ship early, ship often, ship smaller change sets”. That’s great advice: you should definitely do those things. But they are covered plenty elsewhere. What’s software ownership?
Software ownership is the natural end state of DevOps. Software engineers, operations engineers, platform engineers, mobile engineers — everyone who writes code should be own the full lifecycle of their software.
Software owners are people who:
Can deploy and roll back their own code
Are able to debug their own issues in prod (via instrumentation, not ssh)
If you’re lacking any one of those three ingredients, you don’t have ownership.
Why ownership? Because software ownership makes for better engineers, better software, and a better experience for customers. It shortens feedback loops and means the person debugging is usually the person with the most context on what has recently changed.
Some engineers might balk at this, but you’ll be doing them a favor. We are all distributed systems engineers now, and distributed systems require a much higher level of operational literacy. May as well start today.
Fail fast, fix fast
This is about shifting your mindset from one of brittleness and a tight grip, to one of flexibility where failures are no big deal because they happen all the time, don’t impact users, and give everyone lots of practice at detecting and recovering from them.
Here are a few of the best practices you should adopt with this practice.
The engineer who writes the code and merges the PR should also run the deploy
Everyone who writes code must be trained in how to deploy, roll back & revert to last known good state (before escalating if necessary). They should also know the basics of instrumentation, feature flagging and debugging in prod..
After deploying you MUST go verify: are your changes behaving as expected? Does anything else look .. unexpected? You have the most context on what to expect; just two minutes spent verifying that things look reasonable will catch the overwhelming majority of errors before users even notice.
Everyone who puts software in production needs to understand and feel responsible for the full lifecycle of their code, not just how it works in their IDE.
Baking: it’s not just for cookies
Shipping something to production is a process of incrementally gaining confidence, not a switch you can flip.
You can’t trust code until it’s been in prod a while, until you’ve seen it perform under a wide range of load and concurrency scenarios, in lots of partial failure modes. Only over time can you develop confidence in it not being terrible.
Nothing is production except production. Don’t rely on never failing; expect failure, embrace failure. Practice failure! Build guard rails around your production systems to help you find and fix problems quickly.
The changes you need to make your pipeline more resilient are roughly the same changes you need to safely test in production. These are a few of your guard rails.
Build canaries for your deploy process, so you can promote releases gracefully and automatically to larger subsets of your traffic as you gain confidence in them
Create cohorts. Deploy to internal users first, then any free tier, etc in order of ascending importance. Don’t jump from 10% to 25% to 50% and then 100% — some changes are related to saturating backend resources, and the 50%-100% jump will kill you.
Have robots check the health of your software as it rolls out to decide whether to promote the canary. Over time the robot checks will mature and eventually catch a ton of problems and regressions for you.
The quality of code is not knowable before it hits production. You may able to spot some problems, but you can never guarantee a lack of then. It takes time to bake a new release and gain incremental confidence in new code.
Get someone to own the deploy software
Value the work
Create a culture of software ownership
LOOK at what you’ve done after you do it
Be suspicious of new versions until they prove themselves
Two blog posts in one weekend! That’s definitely never happened before. Thanks to Baron for asking me to draft this up following the weekend’s twitter thread: https://twitter.com/mipsytipsy/status/1030340072741064704.
Let’s talk about influence. As an engineer, how do you get influence? What does influence look like, what is it rooted in, how do you wield it or lose it? How is it different from the power and influence you might have as a manager?
This often comes up in the context of ICs who desperately want to become managers in order tohave more access to information and influenceover decisions. This is a bad signal, though it’s sadly very common.
When that happens, you need to do some soul-searching. Does your org make space for senior ICs to lead and own decisions? Do you have an IC track that runs parallel to the manager track at least as high as director level? Are they compensated equally? Do youhave a career ladder? Are your decision-making processes mysterious to anyone who isn’t a manager? Don’t assume what’s obvious to you is obvious to others; you have to ask around.
If so, it’s probably their own personal baggage speaking. Maybe they don’t believe you. Maybe they’ve only worked in orgs where managers had all the power. Maybe they’ve even worked in lots of places that said the exact same things as you are saying about how ICs can have great impact, but it was all a lie and now they’re burned. Maybe they aren’t used to feeling powerful for all kinds of reasons.
Regardless, people who want to be managers in order to perpetuate a bad power structure are the last people you want to be managers.
But what does engineering influence look like? How do your powers manifest?
I am going to avoid discussing the overlapping and interconnected issues of gender, race and class, let’s just acknowledge that it’s much more structurally difficult for some to wield power than for others, ok?
The power to create
Doing is the engineering superpower. We create things with just a laptop and our brain! It’s incredible! We don’t have to constantly convince and cajole and coerce others into building on our behalf, we can just build.
This may seem basic, but it matters. Creation is the ur-power from which all our forms of power flow. Nothing gets built unless we agree to build it (which makes this an ethical issue, too).
Facebook had a poster that said “CODE WINS ARGUMENTS”. Problematic in many ways, absolutely. But how many times have you seen a technical dispute resolved by who was willing to do the work? Or “resolved” one way.. then reversed by doing? Doing ends debates. Doing proves theories. Doing is powerful. (And “doing” doesn’t only mean “write code”.)
Furthermore, building software is a creative activity, and doing it at scale is an intensely communal one. As a creative act, we are better builders when we are motivated and inspired and passionate about our work (as compared to say, chopping wood). And as a collaborative act, we do better work when we have high trust and social cohesion.
Engineering ability and judgment, autonomy and sense of purpose, social trust and cooperative behaviors: this is the raw stuff of great engineering. Everybody has a mode or two that they feel most comfortable and authoritative operating from: we can group these roughly into archetypes.
(Examples drawn from some of the stupendously awesome senior engineers I’ve gotten to work with over the years, as well as the ways I loved to fling my weight around as an engineer.)
Archetypes of influence
“Doing the work that is desperately hard and desperately needed — and often desperately dull.” SOC2 compliance, backups and restores, terrifying refactors, any auth integration ever: if it’s moving the business forward, they don’t give a shit how dull the work is. If you are this engineer, you have a deep well of respect and gratitude.
“Debugger of last resort.” Often the engineer who has been there the longest or originally built the system. If you are helpful and cheerful with your history and context, this is a huge asset. (People tend to wildly overestimate this person’s indispensability, actually; please don’t encourage this.)
The “expert” archetype is closely related. If you are the deep subject matter expert in some technology component, you have a shit ton of influence over anything that uses or touches that component. (You should stay up on impending changes to retain your edge.)
There are people who deliver a bafflingly powerful firehose of sustained output, sometimes making headway on multiple fronts at once. Some work long hours, others just have an unerring instinct for how to maximize impact (this sometimes maps to junior/senior manifestations). Nobody wants to piss off those people. Their consent is critical for … everything. Their participation will often turbo charge a project or pull a foundering effort over the finish line.
Not all influence is rooted in raw technical strength or output. Just a few of the wide variety of creative/collaborative/interpersonal strengths:
Some engineers are infinitely curious, and have a way of consistently sniffing a few steps ahead of the pack. They might seem to be playing around with something pointless, and you want to scold them; then they save your ass from total catastrophe. You learn to value their playing around.
Some engineers solve problems socially, by making friends and trading tips and fixes and favors in the industry. Don’t underestimate social debugging, it’s often the quickest path to the right answer.
Some are dazzlingly lazy and blow your mind with their elegant shortcuts and corners correctly cut.
Some are recruiting magnets, and it’s worth paying their salary just for all the people who want to work with them again.
Some are skilled at driving consensus among stakeholders.
Some are killer explainers and educators and storytellers.
Some are the senior engineer everyone silently wants to grow up to be.
Some can tell such an inspiring story of tomorrow that everyone will run off to make it so.
Some teach by turning code reviews into a pedagogical art form.
Some make everyone around them somehow more productive and effective. Some create relentless forward momentum. Some are good at saying no.
And there are a few special wells of power that bear calling out as such.
Engineers who have been managers are worth their weight in gold. They can translate business goals for junior engineers in their native language with impeccable credibility (something managers never really have, esp in junior engineers’ eyes.). They make strong tech leads, they can carve up projects into components that challenge but do not overwhelm each contributor while hitting deadlines.
Some engineers are a royal pain in the ass because they question and challenge every system and hierarchy. But these are sharp, powerful rocks that can polish great teams. Though they do require a strong manager, to channel t heir energy towards productive dialogue and improvement and keep them from pissing off the whole team.
And let’s not forget engineers who are on call. If you have a healthy on call culture,your ownership over production creates a deep, deep well of power and moral authority — to make demands, drive change, to prioritize. On call should not be a shit salad served up to those who can’t refuse, it should be a badge of honor and seriousness shouldered by every engineer who ships code. (And it should not be miserable or regularly life-impacting.)
… I could go on all day. Engineering is such a powerful role and skill set. It’s definitely worth unpacking where your own influence comes from, and understanding how others perceive your strengths.
Most forms of power boil down to “influence, wielded”.
But just banging out code is not enough. You may have credibility, but having it is not the same as using it. To transform influence into power you have to use it. And the way you use it is by communicating.
What’s locked up in your head has no impact on the rest of us. You have to get it out.
You can do this in lots of ways: by writing, in 1x1s, conversations with small groups, openly recruiting allies, convincing someone with explicit authority, broadcasting inpublic, etc.
Because engineering is a creative activity, authoritarian power is actually quite brittle and damaging. The only sustainable forms of power are so-called “soft powers” like influencing and inspiring, which is why good managers use their soft power freely and hard power sparingly/with great reluctance. If your leadership invokes authority on the regular, that’s an antipattern.
If you don’t speak up, you don’t have the right to sit and fume over your lack of influence. And speaking up does mean being vulnerable — and sometimes wrong — in front of other people.
This is not a zero-sum game.
Most of you have far more latent power than you realize or are used to wielding, because you don’t feel powerful or don’t recognize what you do in those terms.
Managers may have hard power and authority, but the real meaty decisions about technical delivery and excellence are more properly made by the engineers closest to them. These belong properly to the doers, in large part because they are the ones who have to support the consequences of these decisions.
Power tends to flow towards managers because they are privy to more information. That makes it important to hire managers who are aware of this and lean against it to push power back to others.
In the same way that submissives have ultimate power in healthy BDSM relationships, engineers actually have the ultimate power in healthy teams. You have the ultimate veto: you can refuse to create. Demand is high for your skills. You can usually afford to look for better conditions. Many of you probably should.
And when technical and managerial priorities collide, who wins? Ideally you work together to find the best solution for the business and the people. The teams that feel 🔥on fire🔥 always have tight alignment between the two.
Pick your battles.
One final thought. You can have a lot of say in what gets built and how it gets built, if you cultivate your influence and spend it wisely. But you can’t have a say in everything. It doesn’t work that way.
Think of it like @mcfunley’s famous “innovation tokens”, but for attention and fucks given.
The more you use your influence for good outcomes, the more you build up over time, yes … but it’s a precision tool, not background noise. Imagine someone trying to give you a massage by laying down on your whole back instead of pushing their elbow or hand into knots and trigger points. A too-broad target will diffuse your force and limit your potential impact.
Spend your attention tokens wisely.
And once you have influence, don’t forget to use it on behalf of others. Pay attention to those who aren’t being heard, and amplify their voices. Give your time, lend your patronage and credibility, and most of all teach the skills that have made you powerful to others who need them.
P.S. I owe a huge debt to all the awesome senior engineers i’ve gotten to work with. Mad love to you all. <3
 I successfully answered one (1) of these questions before running out of steam. Later.
 Sheepish confession: this is why I became a manager.
 It’s also a bad sign if they won’t grant any explicit authority to the people they hold responsible for outcomes. I’m talking about relatively healthy orgs here, not pathological ones where people (often women) are told they don’t need promotions or explicit authority, they should just use their “soft power” — esp when the hard forms of power aligned against with them. That’s setting you up for failure.
 Some people seem caught off guard by my use of “power” to signal anything other than explicit granted powers by the org. This doesn’t make any sense to me. I find it too depressing and disempowering to think of power as merely granted authority. It doesn’t map to how I experience the world, either. Individual clout is a thing that waxes and wanes and only exists in relation to others’. I’ve seen plenty of weak managers pushed around by strong personalities (which is terrible too).
Power has a way of flowing towards people managers over time, no matter how many times you repeat “management is not a promotion, it’s a career change.”
It’s natural, like water flowing downhill. Managers are privy to performance reviews and other personal information that they need to do their jobs, and they tend to be more practiced communicators. Managers facilitate a lot of decision-making and routing of people and data and things, and it’s very easy to slip into making the all decisions rather than empowering people to make them. Sometimes you want to just hand out assignments and order everyone to do as told. (er, just me??)
But if you let all the power drift over to the engineering managers, pretty soon it doesn’t look so great to be an engineer. Now you have people becoming managers for all the wrong reasons, or everyone saying they want to be a manager, or engineers just tuning out and turning in their homework (or quitting). We all want autonomy and impact, we all crave a seat at the table. You need to work harder to save those seats for non-managers.
So, in the spirit of the enumerated rights and responsibilities of our musty Constitution, here are some of the commitments we make to our engineers at Honeycomb — and some of the expectations we have for managering and engineering roles. Some of them mirror each other, and others are very different.
(Incidentally, I find it helpful to practice visualizing the org chart hierarchies upside down — placing managers below their teams as support structure rather than perched atop.)
Engineer’s Bill of Rights
You should be free to go heads down and focus, and trust that your manager will tap you when you are needed (or would want to be included).
We will invest in you as a leader, just like we invest in managers. Everybody will have opportunities to develop their leadership and interpersonal skills.
Technical decisions must remain the provenance of engineers, not managers.
You deserve to know how well you are performing, and to hear it early and often if you aren’t meeting expectations.
On call should not substantially impact your life, sleep, or health (other than carrying your devices around). If it does, we will fix it.
Your code reviews should be turned around in 24 hours or less, under ordinary circumstances.
You should have a career path that challenges you and contributes to your personal life goals, with the coaching and support you need to get there.
You should substantially choose your own work, in consultation with your manager and based on our business goals. This is not a democracy, but you will have a voice in our planning process.
You should be able to do your work whether in or out of the office. When you’re working remotely, your team will loop you in and have your back.
Make forward progress on your projects every week. Be transparent.
Make forward progress on your career every quarter. Push your limits.
Build a relationship of trust and mutual vulnerability with your manager and team, and invest in those relationships.
Know where you stand: how well are you performing, how quickly are you growing?
Develop your technical judgment and leadership skills. Own and be accountable for engineering outcomes. Ask for help when you need it, give help when asked.
Give feedback early and often, receive feedback gracefully. Practice both saying no and hearing no. Let people retract and try again if it doesn’t come out quite right.
Own your time and actively manage your calendar. Spend your attention tokens mindfully.
Recruit and hire and train your team. Foster a sense of solidarity and “teaminess” as well as real emotional safety.
Care for every engineer on your team. Support them in their career trajectory, personal goals, work/life balance, and inter- and intra-team dynamics.
Give feedback early and often. Receive feedback gracefully. Always say the hard things, but say them with love.
Move us relentlessly forward, watching out for overengineering and work that doesn’t contribute to our goals. Ensure redundancy/coverage of critical areas.
Own the quarterly planning process for your team, be accountable for the goals you set. Allocate resources by communicating priorities and recruiting eng leads. Add focus or urgency where needed.
Own your time and attention. Be accessible. Actively manage your calendar. Try not to make your emotions everyone else’s problems (but do lean on your own manager and your peers for support).
Make your own personal growth and self-care a priority. Model the values and traits we want our engineers to pattern themselves after.
I’d love to hear from anyone else who has a list like this.
Okay! As of today it’s been one week since I wrote some advice and the internet exploded in my face, so now it’s time to do what I always do: post mortem that shit.
This is going to be long. I erred by making my first post too short, so I’m going to ship $(allthedetail) this time. Duly warned.
Around 8 am on Friday, March 2nd, after pulling an all-nighter, I decided to pound out a quick blog post that has been on my todo list forever: the only advice I feel equipped to give on how to succeed in tech.
My advice, in brief, was this:
as a junior engineer, tough it out. work hard, learn everything, earn your stripes.
stay technical. don’t get sucked into an offramp unless you are god damn sure you want out for good.
once you are senior, use your power to advocate for others and fuck that shit up.
Money, power, credibility. This is the best way I know how to earn these things. This is what worked for me and most of the senior technical women I know and admire.
First of all: I don’t think there should be anything controversial at all about this advice. It’s good advice, if a bit bluntly put. Pick your battles, show strategic impact, leverage your influence into power and use that power to fuck shit up in the manner of your choosing.
The fact is, we are far too chickenshit about telling young women straight up how to succeed at work. We praise them for all kinds of dumb shit and second shift work and emotional labor that has little if any strategic impact to the bottom line, and wonder why they’re burned out and resentful.
We live in a fallen world. I didn’t make it this way, I just want to help you level up to be a powerful destroyer being so you can make it better.
So I hit “publish”.
Around 9:30 am, Camille Fournier gave me a bunch of unsolicited criticism. Unfortunately, due to some sour personal history with Camille I was extremely not disposed to receive this from her. I can be a resentful little shit: as soon as she told me to change it in certain ways, it was the last fucking thing in the world I was going to do.
For a few hours, all the feedback was good. People liked my advice to stay technical (“god I wish someone had told me that 15 years ago”) and my pointing out the loophole that lets women advocate for each other without being penalized.
A few people nailed what I was trying to say even better than I did:
IME the optimal strategy at an individual/micro level is wildly different or even opposite from optimal strategy at a group/macro level — the politics of the individual being different in kind to the politics of the group
This has been my biggest struggle with some of the talking points - it sounds as if I’m completely deprived of any agency in how my life pans out. Which I find very disturbing - since it then makes me feel like I have no control over anything. That’s very disempowering.
But by the end of the day I was receiving a steady stream of angry tweets from people I had never heard of, with objections that seemed puzzling and ridiculous to me.
They were acting as though the sum total of my advice had been ordering bullied and abused people to just shut up and tough it out. Soon I was getting tweets accusing me of trashing all diversity work, trashing all women, only being out for myself and my own career, erasing sexual assault, being insensitive and destructive to people of color, and on and on.
People were subtweeting me like crazy, or DM’ing me telling me how much they liked my piece but were afraid to say so in public. Others were harassing my engineering managers and people who follow me.
I have never received textual scrutiny of this type before, where every single word was turned over and macerated and peered at for evidence of traitorous views. It sucks. (And it’s pretty hypocritical, to say the least … some of these same women who were gleefully bashing me for clumsy words remain good friends with men who are actual known harassers and abusers of women.)
Lots of people wanted me to take the post down immediately, or publish a retraction or correction immediately. Some prominent feminists publicly chided me and refused to talk to me until I repented of my sins. 🙄
Let’s be clear. I have no problem admitting my errors and making amends. I do it all the fucking time. But I am disinclined to grovel before a howling mob. It wasn’t even clear to me what I had done wrong, given all the contradictory noises.
So I decided to wait a week before responding, so I could talk to people and figure out what to take away from the mess.
(Also last week: traveled to multiple continents, flew a few dozen hours, wrote multiple talks, delivered presentations at various conferences and meetups, visited and pitched to potential customers, managed a handful of teams, fit 1x1s in between hops and time zones and you know just tried to do my fucking job while dealing with crazed nuts screaming abuse at me online.)
I had a couple of hard but helpful conversations with people like Alice Goldfuss and Courtney Nash, who took the time to walk me through ways that what I wrote may be misinterpreted or wrongly received. This feedback can mostly be bucketed into the following categories:
“Assume the reader knows nothing about you and considers you hostile until proven otherwise.” Well shit, I am not used to writing defensively. I live my life in high trust, high transparency environments and prefer it that way.
“Your advice doesn’t apply to $x.” True! I didn’t bracket it in layers of padding — “this is just what worked for me”, “may not apply to every situation” — because I thought that was freaking obvious.
“It sounds like you are shit talking all diversity efforts.” No, but I was waving vaguely in the direction of some very cynical and tired feelings on the subject. I’m pretty over corporate diversity issues and pinkwashing that doesn’t expand opportunity or share power.
“It sounds like you are shitting on all women.” Oof. This is the one that is really painful, because this is the one I have been working hard on for close to 20 years… and should have seen coming. I did intend to put some space between myself and women in tech, because I don’t exactly identify as a woman.. exactly. I grew up fundamentalist and misogynist af, and have been working hard to recover from that ever since I left home at 15.
“Maybe you shouldn’t give advice to women at all.” Courtney challenged me on whether I should speak to women, given my ambivalence wrt my own gender identification. Which is an interesting question that I have pondered a lot.
This was all desperately inevitable and predictable, however, and I made some unforced errors. So let’s talk about what I do and don’t regret about all this, and what I would or would not do differently.
NO REGRETS: giving the advice. It’s good advice, it needed to be said. I’m tired of seeing women burn themselves out on shitty corporate diversity work that only diverts their energy from amassing real power and strategic impact. Not sorry.
REGRETS: I was sloppy about waving in the direction of my gender issues. I intended to put some space between myself and “women’s issues”, because I don’t exactly identify as a woman, exactly, and I have always felt uncomfortable in women’s spaces. Given the historic devaluation of women’s spaces and issues, I should have been clearer. I am sorry.
NO REGRETS: I think it’s fine for me to give advice to women if they ask, which they do. After all I was raised as a woman, have always been read and treated as one, and assumed there was no other option for 30+ years. I get to speak. Not sorry.
SEMI-REGRETS: I still can’t figure how anyone managed to project into my piece that I was slamming all diversity work. I said somewhat colorfully that a lot of the advice didn’t work for me and wasn’t my favorite thing to dwell on, in the same grouchy grumbly tone that I use when bitching about query planners and terraform variable interpolation. I don’t think this would have been a big deal if the frenzy hadn’t gotten whipped up, but if anyone genuinely felt hurt or dismissed by it, I can be sorry for that.
REGRETS: the impact it had on my poor engineering managers and other people who work with me. They are still being asked to denounce me or defend me and their decision to work with me. So deeply not ok. I am sorry — but that’s really on you, internet assholes.
BIGGEST REGRETS: any accidental cover given to misogynists. By far the most annoying thing about the brouhaha has been when men with toxic views compliment me because they think I’m agreeing with them. I am NOT, so get off me. Sorry not sorry.
SADDEST REGRETS: my plummeting opinion of the feminist internet trash mob. I am a feminist and damn proud of it, but I am also disgusted by the hyperperformative boundary policing of certain self-proclaimed “tech feminists”. If your great joy in life is roving the interwebs looking for any toes pressing a line so you can rapturously castigate them and shun them until they have licked your boots and begged for forgiveness … if you love performing elaborate outrage rituals and whipping up a frenzy of whispers or a witch hunt… then:
Fuck. The Fuck. Off. You are an embarrassment. This is about your ego, and your manufactured grievance machines are Not Helping.
I honestly thought these feminist pile-on mobs were a right-wing fantasy, and I’m sad that I was wrong. I’m also pretty sad about all the folks who know me and have every reason to know better. In my world you check in with your friends before leaping to judgment, and you help teach each other when you’re being stupid. A pretty dismal number of people I would have called friends just leapt excitedly into the fray passing judgment.
So now I know more about who my friends are.
Why even stick my neck out? I guessed something might go wrong, I just didn’t know what. So why?
Because I want to help, dammit. The farther I get in my career the more time I spend pondering how to bring others along with me, how to open the gates a little wider.
I’ve gotten to do a few things. I have tried to create an equitable, respectful working environment where everyone can do their best work, with managers who are passionate about diversity and strong where I am weak.
But … I have felt very often alienated by the messaging and attempts to help women. I can’t be the only one who responds more to a strategic message than an empathetic one, who feels condescended to and patronized by the mainstream corporate efforts.
I can’t be the only one who feels simmering resentment every time I get held up as a successful “woman in tech” (the world’s worst participation trophy). I don’t want a fucking consolation prize. I want to sweep the competition, I want to change the world. I can’t be the only one who hungers for power, money and credibility.
I know I’m not, actually. I know because they are telling me. The response has been at least 100-1 positive in private — from junior women especially — thanking me for being brutally honest and treating them like adults, like equals. (I’ve been told there are armies of women who feel dreadfully hurt but too afraid to say so. Pity if true, as they say.)
There has always been tension between the people who see the world as it is and fight to succeed in it, and the people who opt out and refuse to participate because it’s compromised. The world needs us both. So shut the fuck up and let the kids pick for themselves.
And maybe stop persecuting the people who stand with you.
I don’t really do “women stuff”. I don’t really identify with any gender and I find a lot of the advice to be condescending and overly delicate, and it’s just a really boring thing to think and talk about. For me.
But I’m feeling guilty after turning down a bunch of requests to do shit for International Women’s Day next week. So I’m gonna do a thing I’ve been avoiding doing for years, and write down my (deeply problematic but practical) advice.
Toughen up. For your first 10 years or 3 jobs in the industry, you’re a junior contributor. You need them way more than they need you, so suck it up. Try not to dwell on the bullshit. Work hard and level up and always angle for more money and power when you can.
Stay technical. There are a thousand paved ramps out of engineering roles and only a few hard paths back in. Technical excellence is currency in this industry, even more so if your credibility is gonna get challenged again and again. So don’t stop engineering til you’re great at it.
Use your power for good. Once you become a senior contributor — and i’m not talking bullshit titles but real seniority, when people are coming to you for help far more than you go to them — then you can afford to get sensitive. …On behalf of others. The research convincingly shows that women get punished for advocating for themselves, but not for advocating for others. It’s a sweet loophole, use it.
If you feel like table flipping out of tech, just remember the rest of the world is at LEAST as sexist as tech is, but without the money and power and ridiculous life-coddling. Where exactly do you think you’re going to go?
Don’t quit tech: quit your job. There are LOTS of tolerable-to-great companies out there. If you stay and suffer, you’re just rewarding the shitholes with your presence. Don’t reward the shitholes any more than you can help it.
Learn shit, save your money, amass great power. Then use it to fuck shit up.