AWS Networking, Environments and You

Last week we hit an important milestone in the life of our baby startup: a functional production environment, with real data flowing from ingestion to storage to serving queries out!  (Of course everything promptly exploded, but whatever.)

This got me started thinking about how to safely stage and promote terraform changes, in order to isolate the blast radius of certain destructive changes.  I tweeted some random thing about it,

… and two surprisingly fun things happened.  I learned some new things about AWS that I didn’t know (after YEARS of using it).  And hilariously, I learned that I and lots of other people still believe things about AWS that aren’t actually true anymore.

That’s why I decided to write this.  I’m not gonna get super religious on you, but I want to recap the current set of common and/or best practices for managing environments, network segmentation, and security automation in AWS.  And I also want to remind you that this post will be out of date as soon as I publish it, because things are changing crazy fast.

Multiple Environments: Why Should I Care?

An environment exists to provide resource isolation.  You might want this for security reasons, or testability, or performance, or just so you can give your developers a place to fuck around and not hurt anything.

Maybe you run a bunch of similar production environments so you can give your customers security guarantees or avoid messy performance cotenancy stuff.  Then you need a template for stamping out lots of these environments, and you need to be *really* sure they can’t leak into each other.

Or maybe you are more concerned about the vast oceans of things you need to care about beyond whatever unit tests or functional tests are running on your laptop.  Like: capacity planning, load testing, validating storage changes against production workloads, exercising failovers, etc.  For any scary change, you need a production-like env to practice in.

Bottom line: If you can’t spin up a full copy of your infra and test it, you don’t actually have “infrastructure as code”.  You just have … some code, and duct tape.

The basics are simple:

  • Non-production environments must be walled off from production as strongly as possible.  You should NEVER be able to accidentally connect to a prod db from staging (or from one prod env to another).
  • Production and non-production environments (or all your prod envs) should share as much of the same tooling and code paths as possible.  Like, some amount asymptotically approaching 100%.  Any gaps there will inevitably, eventually bite you in the ass.

Managing Multiple Environments in AWS

There are baaaasically three patterns that people use to manage multiple environments in AWS these days:

  1. One AWS billing account and one flat network (VPC or Classic), with isolation performed by routes or security groups.
  2. Many AWS accounts with consolidated billing.  Each environment is a separate account (often maps to one acct per customer).
  3. One AWS billing account and many VPCs, where each environment ~= its own VPC.

Let’s start old school with a flat network.

1:  One Account, One Flattish Network

This is what basically everyone did before VPC.  (And ummm let’s be honest, lots of us kept it up for a while because GOD networking is such a pain.)

In EC2 Classic the best you got was security groups.  And — unlike VPC security groups — you couldn’t stack them, or change the security groups of a running instance without destroying it, and there was a crazy low hard cap on (# of secgroup rules * # of secgroups).  You could kind of gently “suggest” environments with things like DNS subdomains and config management, and sometimes you would see people literally just roll over and resort to $ENV variables.

Most people either a) gave up and just had a flat network, or b) this happened.

At Parse we did a bunch of complicated security groups plus chef environments that let us spin up staging clusters and the occasional horrible silo’d production stack for exceptional customer requirements.  Awful.

VPC has made this better, even if you’re still using a flat network model.  You can now manage your route tables, stack security groups and IAM rules, and reapply them to existing nodes without destroying the node or dropping your connections.  You can define private network subnets with NAT gateways, etc.
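
For flavor, here is roughly what the private-subnet-plus-NAT-gateway setup looks like in Terraform-style HCL.  This is a hand-wavy sketch: the names and CIDRs are invented, and the exact resource arguments vary by provider version.

```hcl
# hypothetical sketch -- CIDRs and names are made up for illustration
resource "aws_vpc" "main" {
  cidr_block = ""
}

resource "aws_subnet" "public" {
  vpc_id     = "${}"
  cidr_block = ""
}

resource "aws_subnet" "private" {
  vpc_id     = "${}"
  cidr_block = ""
}

# the NAT gateway lives in the public subnet and needs an EIP
resource "aws_eip" "nat" {
  vpc = true
}

resource "aws_nat_gateway" "nat" {
  allocation_id = "${}"
  subnet_id     = "${}"
}

# the private subnet routes its outbound traffic through the NAT gateway
resource "aws_route_table" "private" {
  vpc_id = "${}"

  route {
    cidr_block     = ""
    nat_gateway_id = "${}"
  }
}
```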

Some people tell me they are using a single VPC with network ACLs to separate environments for ease of use, because you can reuse security groups across environments.  Honestly this seems a bit more like a bug than a feature to me, because a) you give up isolation and b) I don’t see how that helps you have a versioned, tested stack.

Ok, moving on to the opposite end of the spectrum: the crazy kids who are doing tens (or TENS OF THOUSANDS) of linked accounts.

2:  One AWS Account Per Environment

A surprising number of people adopted this model in the bad old EC2 Classic days, not because they necessarily wanted it, but because they needed a stronger security model and looser resource caps.  This is why AWS released Consolidated Billing way back in 2010.

I actually learned a lot this week about the multi-account model!  Like that you can create IAM roles that span accounts, or that you can share AMIs between accounts.  This model is complicated, but there are some real benefits.  Like real, actual, hardcore security/perf isolation.  And you will run into fewer resource limits than if you jam everything into a single account, and revoking/managing IAM creds is clearer.

Security nerds love this model, but it’s not clear that …. literally anyone else does.

Some things that make me despise it without even trying it:

  • AWS billing is ALREADY a horrendous hairball coughed up by the world’s ugliest cat, I can’t even imagine trying to wrangle multiple accounts.
  • It’s more expensive: you incur extra billing overhead for every account.
  • Having to explicitly list any resource you want to share between accounts just makes me want to tear my hair out strand by strand while screaming on a street corner.
  • Account creation API still has manual steps, like getting certs/keypairs.
  • You cannot make bulk changes to accounts, and AWS doesn’t like you having thousands of linked accounts.  Also limits your flexibility with Reserved Instances.

Here is a pretty reasonable blog post laying out some of the benefits though, and as you can see, there are plenty of other crazy people who like it.  Mostly security nerds.

3:  One AWS Account, One VPC Per Environment

I have saved the best for last.  I think this is the best model, and the one I am adopting.  You spin up a production VPC, a staging VPC, dev VPC, Travis-CI VPC.  EVERYBODY GETS A VPC!#@!

One of those things that everybody seems to “know” but isn’t true is that you can’t have lots of VPCs.  Yes, it is capped at 5 by default, and many people have stories about how they couldn’t get it raised, and that used to be true.  But the hard cap is now 200, not 10, so VPC awayyyyyyyy my pretties!

Here’s another reason to love VPC <-> env mapping: orchestration is finally coming around to the party.  Even recently people were still trying to make chef-metal a thing, or developing their own coordination software from scratch with Boto, or just using the console and diffing and committing to git.

Dude, stop.  We are past the point where you should default to using Terraform or CloudFormation for the bones of your infrastructure, for the things that rarely change.  And once you’ve done that you’re most of the way to a reusable, testable stack.

Most of the cotenancy problems that account-per-env solved are a lot less compelling to me now that VPCs exist.

VPCs are here to help you think about infrastructure like regular old code.  Lots of VPCs are approximately as easy to manage as one VPC.  Unlike lots of accounts, which are there to give you headaches and one-offs and custom scripts and pointy-clicky shit and complicated horrible things to work around.

VPCs have some caveats of their own.  Like, a VPC can be at most a /16.  If you’re using 4 availability zones with public subnets + NATted private subnets, that’s only ~8k IPs per subnet/AZ pair.  Shrug.

You can peer VPCs (and reference security groups across the peering), but not across regions (yet).  Also, if you’re a terraform user, be aware that it handles VPC peering fine but doesn’t handle multiple accounts very well.

Lots of people seem to have had issues with security group limits per VPC, even though the limit is 500 and the docs say it can be raised on request.  I’m …. not sure what to think of that.  I *feel* like if you’re building a thing with > 500 security group rules on a single VPC, you’re probably doing something wrong.

Test my code you piece of shit I dare you

Here’s the thing that got me excited about this from the start though, which is having the ability to do things like test terraform modules on a staging VPC from a branch before promoting the clean changes to master.  If you plan on doing things like running bleeding-edge software in production *cough* you need allllll the guard rails and test coverage you can possibly scare up.  VPCs help you get this in a massive way.

Super quick example: say you’re adding a NAT gateway to your staging cluster.  You would point the module at the remote git source with your changes:

// staging
module "aws_vpc" {
  source = "git::ssh://"
  env    = "${var.env}"
}

And then once you’ve validated the change, you simply merge your branch to master and run terraform plan/apply against production.

// production
module "aws_vpc" {
  source = ""
  env    = "${var.env}"
}

And for GOD’S SAKE USE DIFFERENT STATE FILES FOR EACH VPC / ENVIRONMENT okayyyyyy but that is a different rant, not an AWS rant, so let’s move along.
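
Concretely, “different state files per environment” can be as dumb and simple as this (the -state and -var flags are real Terraform flags of this era; the file layout is just my convention):

```shell
# staging: plan/apply from your branch, against staging's own state file
terraform plan  -state=states/staging.tfstate    -var 'env=staging'
terraform apply -state=states/staging.tfstate    -var 'env=staging'

# production: after merging to master, same commands, different state file
terraform plan  -state=states/production.tfstate -var 'env=production'
terraform apply -state=states/production.tfstate -var 'env=production'
```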

In Conclusion

There are legit reasons to use all three of these models, and infinite variations upon them, and your use case is not my use case, blah blah.

But moving from one VPC to multiple VPCs is really not a big deal.  I know a lot of us bear scars, but it is nothing like the horror that was trying to move from EC2 Classic to VPC.

VPC has a steeper learning curve than Classic, but it is sooooo worth it.  Every day I’m rollin up on some new toy you get with VPC (ICMP for ELBs! composable IAM host roles! new VPC NAT gateway!!!).  The rate at which they’re releasing new shit for VPC is staggering, I can barely keep up.

Alright, said I wasn’t gonna get religious but I totally lied.

VPC is the future and it is awesome, and unless you have some VERY SPECIFIC AND CONVINCING reasons to do otherwise, you should be spinning up a VPC per environment with orchestration and prob doing it from CI on every code commit, almost like it’s just like, you know, code.

That you need to test.

Cause it is.

(thank you to everybody who chatted with me and taught me things and gave awesome feedback!!  knuckle tatts credit to @mosheroperandi)



Two weeks with Terraform

I’ve been using terraform regularly for 2-3 weeks now.  I have terraformed in rage, I have terraformed in delight.  I thought it might be helpful to share some of my notes and lessons learned.

Why Terraform?

Because I am fucking sick and tired of not having versioned infrastructure.  Jesus christ, the ways my teams have bent over backwards to fake infra versioning after the fact (nagios checks running ec2 diffs, anyone?).

Because I am starting from scratch on a green field project, so I have the luxury of experimenting without screwing over existing customers.  Because I generally respect Hashicorp and think they’re on the right path more often than not.

If you want versioned infra, you basically get to choose between 1) AWS CloudFormation and its wrappers (sparkleformation, troposphere), 2) chef-provisioner, and 3) Terraform.

The orchestration space is very green, but I think Terraform is the standout option.  (More about why later.)  There is precious little evidence that TF was developed by or for anyone with experience running production systems at scale, but it’s … definitely not as actively hostile as CloudFormation, so it’s got that going for it.

First impressions

Stage one: my terraform experiment started out great.  I read a bunch of stuff and quickly spun up a VPC with public/private subnets, NAT, routes, IAM roles etc in < 2 days.  This would be nontrivial to do in two days *without* learning a new tool, so TOTAL JOY.

Stage two: spinning up services.  This is where I started being like … “huh.  Has anyone ever actually used this thing?  For a real thing?  In production?”  Many of the patterns that seemed obvious and correct to me about how to build robust AWS services were completely absent, like any concept of a subnet tier spanning availability zones.  I did some inexcusably horrible things with variables to get the behavior I wanted.

Stage three: … modules.  Yo, all I wanted to do was refactor a perfectly good working config into modules for VPC, security groups, IAM roles/policies/users/groups/profiles, S3 buckets/configs/policies, autoscaling groups, policies, etc., and my entire fucking world just took a dump for a week.  SURE, I was a TF noob making noob mistakes, but I could not believe how hard it was to debug literally anything.

This is when I started tweeting sad things.

The best (only) way of debugging terraform was just reading really, really carefully, copy-pasting back and forth between multiple files for hours to get all the variables/outputs/interpolation correct.  Many of the error messages lack any context or line numbers to help you track down the problem.  Take this prime specimen:

Error downloading modules: module aws_vpc: Error loading .terraform/modules/77a846c64ead69ab51558f8c5be2cc44/: Error reading config for aws_route_table[private]: parse error: syntax error

Any guesses?  Turned out to be a stray ‘}’ on line 105 in a different file, which HCL vim syntax highlighting thought was A-OK.  That one took me a couple hours to track down.

Or this:

aws_security_group.zookeeper_sg: cannot parse '' as int: 
strconv.ParseInt: parsing "": invalid syntax

Which *obviously* means you didn’t explicitly define some inherited port as an int, so there’s a string somewhere there lurking in your tf tree.  (*Obviously* in retrospect, I mean, after quite a long time poking haplessly about.)

Later on I developed more sophisticated patterns for debugging terraform.  Like, uhhh, bisecting my diffs by commenting out half of the lines I had just added, then gradually re-adding or re-commenting out more lines until the error went away.

Security groups are the worst for this.  SO MANY TIMES I had security group diffs run cleanly with “tf apply”, but then claim to be modifying themselves over and over on every subsequent run.  Sometimes I would track this down to having passed in a variable for a port number or CIDR range, e.g. cidr_blocks = [“${var.ip_range}”].  Hard-coding the value or setting the type explicitly would resolve the problem.  Or sometimes I had accidentally entered a CIDR range that AWS didn’t like: the change would apply and usually it would work, but TF didn’t think it had worked.  It wasn’t aware there was a problem with the run, so it would just keep “successfully” reapplying the diff every time it ran.
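
For the record, the shape of the fix looked something like this.  The port and CIDR values here are made up for illustration:

```hcl
# illustrative only -- port and CIDR are invented examples
resource "aws_security_group" "zookeeper_sg" {
  name   = "zookeeper"
  vpc_id = "${var.vpc_id}"

  ingress {
    from_port   = 2181             # explicit int, not a string smuggled in via a variable
    to_port     = 2181
    protocol    = "tcp"
    cidr_blocks = [""]  # a network address AWS will accept, not a host address
  }
}
```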

Some advice for TF noobs

  • As @phinze told me, “modules are basically like functions — a variable is an argument, output is a return value”.  This was helpful, because that was completely unintuitive to me when I started refactoring.  It took a few days of wrestling with profoundly inscrutable error messages before modules really clicked for me.
  • Strings.  Lists.  You can only pass variables around as strings.  Split() and join() are your friends.  Oh my god I would sell so many innocent children for the ability to pass maps back and forth between modules.
  • No interpolation for resource names makes me so sad.  Basically you can either use local variable maps, or multiple lists and just … run those index counters like a boss, I guess.
  • Use AWS termination protection for stateful services or anything risky once you’re in production.  Use create_before_destroy on resources like ASG launch configs.  Use prevent_destroy where you must — but as sparingly as possible, because that basically breaks the entire TF model.
  • If you change the launch config for an ASG, like replacing the AMI for example, you might expect TF to kick off an instance recycle.  It will not.  You must manually terminate the instances to pick up the new config.
  • If you’re collaborating with a team — ok, even if you’re not — find a remote place to store the tfstate files.  Try S3 or github, or shell out for Atlas.  Local state on laptops is for losers.
  • TF_LOG=DEBUG has never once been helpful to me.  I can only assume it was written for the Hashicorp developers, not for those of us using the product.
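
To sketch a couple of the points above (resource names are invented; syntax per the Terraform of this era, where module variables can only be strings):

```hcl
# pass a list between modules by joining it into a string...
output "private_subnet_ids" {
  value = "${join(",", aws_subnet.private.*.id)}"
}

# ...and splitting it back apart on the consuming side
resource "aws_instance" "app" {
  count     = 3
  subnet_id = "${element(split(",", var.private_subnet_ids), count.index)}"
}

# create_before_destroy so replacing a launch config doesn't strand your ASG
resource "aws_launch_configuration" "app" {
  name_prefix   = "app-"
  image_id      = "${var.ami_id}"
  instance_type = "m3.medium"

  lifecycle {
    create_before_destroy = true
  }
}
```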

Errors returned by AWS are completely opaque.  Like “You were not allowed to apply this update”.  Huh?  Ok well if it fails on “tf plan”, it’s probably a bad terraform config.  If it successfully plans but fails on “tf apply”, your AWS logic is probably at fault.

Terraform does not do a great job of surfacing AWS errors.

For example, here is some terraform output:

tf output: "* aws_route_table.private: InvalidNatGatewayID.NotFound: The natGateway ID 'nat-0e5f4ea507113b423' does not exist"

Oh!  Okay, I go to the AWS console and track down that NAT gateway object and find this:

"Elastic IP address [eipalloc-8583b7e1] is already associated"

Hey, that seems useful!  Seems like TF just timed out bringing up one of the route tables, so it tried assigning the same EIP twice.  It would be nice to surface more of this detail into the terraform output, I hate having to resort to a web console.

Last but not least: one time I changed the comment string on a security group, and “tf plan” went into an infinite dependency loop.  I had to roll back the change, run terraform destroy against all the resources in a bash for loop, and create a new security group with all new instances/ASGs just to change the comment string.  You cannot change comment strings or descriptions for resources without the resources being destroyed.  This seems PROFOUNDLY weird to me.

Wrapper scripts

Lots of people seem to eventually end up wrapping terraform with a script.  Why?

  • There is no concept of a $TF_ROOT.  If you run tf from the wrong directory, it will do some seriously confusing and screwed up shit (like duping your config, but only some of it).
  • If you’re running in production, you prob do not want people to be able to accidentally “terraform destroy” the world with the wrong environment.
  • You want to enforce test/staging environments, and promotion of changes to production after they are proven good.
  • You want to automatically re-run “tf plan” after “tf apply” and make sure your resources have converged cleanly.
  • So you can add slack hooks, or hipchat hooks, or github hooks.
  • Ummm, have I mentioned that TF can feel somewhat undebuggable?  Several people have told me they create rake tasks or YML templates that they then generate .tf files from so they can debug those when things break.  (Erf …)
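
Here is a skeletal sketch of such a wrapper.  Everything about it (the function name, the state file layout, the production guard) is my assumption, not any kind of standard:

```shell
# skeletal wrapper sketch: refuse obviously dangerous combinations,
# always run from a known root, and keep one state file per environment
tf_wrap() {
  env="$1"; action="$2"

  # hard stop: no accidental "terraform destroy" against production
  if [ "$action" = "destroy" ] && [ "$env" = "production" ]; then
    echo "refusing to destroy production" >&2
    return 1
  fi

  # always run from the repo's terraform root, never a random subdirectory
  cd "${TF_ROOT:?set TF_ROOT to your terraform directory}" || return 1

  terraform "$action" -state="states/${env}.tfstate" -var "env=${env}"
}
```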

Okay, so …..

God, it feels like I’ve barely gotten started, but I should probably wrap it up.[*]  Like I said, I think terraform is best in class for infra orchestration.  And orchestration is a thing that I desperately want.  Orchestration and composability are the future of infrastructure.

But also terraform is green as fuck and I would not recommend it to anyone who needs a 4-nines platform.

Simply put, there is a lot of shit I don’t want terraform touching.  I want terraform doing as little as possible.  I have already put a bunch of things into terraform that I plan on taking right back out again.  Like, you should never be running a script after TF has bootstrapped a node.  Yuck.  That is a job for your cfg management, or possibly a job for packer or a custom firstboot script, but never your orchestration tool!  I have already stuffed a bunch of Route53 DNS into TF and I will be ripping that right back out soon.  Terraform should not be managing any kind of dynamic data.  Or service registry, or configs, or ….

Terraform is fantastic for defining the bones of your infrastructure.  Your networking, your NAT, autoscaling groups, the bits that are robust and rarely change.  Or spinning up replicas of production on every changeset via Travis-CI or Jenkins — yay!  Do that!

But I would not feel safe making TF changes to production every day.  And you should delegate any kind of reactive scaling to ASGs or containers+scheduler or whatever.  I would never want terraform to interfere with those decisions on some arbitrary future run.

Which is why it is important to note that terraform does not play nicely with others.  It wants to own the whole thing.  Monkeypatching TF onto an existing infra is kind of horrendous.  It would be nice if you could tag certain resources or products as “this is managed by some other system, thx”.

So: why terraform?

Well, it is fairly opinionated.  It’s actively developed by some really smart people.  It’s moving fast and has most of the momentum in the space.  It’s composable and interacts well with other players iff you make some good life choices.  (Packer, for example, is amazing, by far the most unixy utility of the Hashicorp library.)

Just look at the rate of bug fixes and releases for Terraform vs CloudFormation.  Set aside cross-platform compatibility etc., and just look at the energy of the respective communities.  Not even a fair fight.

Want more?  Ok, well I would rather adopt one opinionated philosophy for my infrastructure, supplementing where necessary, than duct tape together fifty different half baked philosophies about how software and infrastructure should work and spend all my time mediating their conflicts.  (This is one of my beefs with CloudFormation: AWS has no opinions, only slobbering, squidlike, directionless-flopping optionalities.  And while we’re on the topic it also has nothing like “tf plan” for previewing changes, so THAT’S PRETTY STUPID TOO.)

I do have some concerns about Hashicorp spreading themselves too thin on too many products.  Some of those products probably shouldn’t exist.  Meh.

Terraform has a ways to go before it feels predictable and debuggable, but I think it’s heading in the right direction.  It’s been a fun couple weeks and I’m excited to start contributing to the ecosystem and integrating with other components, like chef-solo & consul.


[*] OMGGGGGGG, I never even got to the glorious horrors of the terraforming gem and how you are most definitely going to end up manually editing your *.tfstate files.  Ahahahahaa.

[**] Major thanks to @phinze, @solarce, @ascendantlogic, @lusis, @progrium and others who helped me limp through my first few weeks.


How to survive an acquisition

Last week Facebook announced that they will be shutting down Parse.


I have so, so many feelings.  But this isn’t about my feelings now.  This is about what I wish I had known when we got acquired.

When they told us we were getting acquired, it felt like a wrecking ball to the gut.  Nothing against Facebook, it just wasn’t what I signed up for.  Big company, bureaucracy, 3-4 hour daily commute, and goddammit Parse was my *baby*.  What the fuck was this big company going to do to my baby?

Here’s the first thing I wish I had known: this is normal.  Trauma is the norm.  Acquisitions are *always* massively disruptive and upsetting, even the good ones.

This was all compounded by the jarring disconnect between internal and external perceptions of the acquisition.  The surreality of being written up in the press and having everyone act like this was the greatest thing that ever happened to us, versus the shock and dismay and confusion that many of us were actually experiencing.

Here are a few more things I wish I had understood.

You don’t own your product anymore.  Your product is now “strategic alignment”.  Look at yourself in the mirror and repeat that five times every morning.

Your customers are not your customers anymore.  Your customer is now your corporate overlord.  You can resist this and try to serve your old customers first, but it will wear your engineering team out and eventually drive you mad.

Cultures will clash.  Yours will lose.  You *must* learn the tribal and hierarchical games of the new org if you want to succeed.  They don’t make sense to you, and you didn’t choose them, but you must learn them anyway.

Assume good intent.  Aggressively assume good intent.  If a big company buys up your tiny startup, lots of people are going to condescend to you and assume they know how to solve your own problems better than you do.  In retrospect, I realize that a lot of tiny acquired startups *don’t* know what the fuck they’re doing and so the behemoth just gets used to assuming that everyone is like that.  Just grit your teeth and take it.

And if you honestly can’t see yourself ever embracing the new parent company, whether for cultural or ethical or technical reasons, you should leave sooner rather than later.  (UNLESS it’s for the golden handcuffs in which case for god’s sake try to emotionally disengage and do not be a people manager.)

One of Facebook’s internal slogans is something like “always do what’s best for Facebook”.  I never gave a shit about Facebook’s mission; I cared about Parse’s.  I remain incredibly proud of everything we accomplished for Parse, MongoDB, and RocksDB.  If only our strategic alignments had, well, aligned.



2015 in recaps and life changes

2015 is dead, long live 2016

A pretty serious list of ridiculous/amazing shit happened in 2015.  My face was on a fucking pillar.  I got to meet KEITH BOSTIC.  A bunch of my favorite engineers got dressed up in cheetah kigurumi pajamas and ran around playing booth babes.  And that’s just a three day sampling of mischief.

THIS IS MY FACE ON A PILLAR.  you are allowed to find this incredibly creepy.

There are at least three key things about 2015 that I feel like I am always going to associate with this year, so that’s what I want to write about.  (I’ll do a boring second post later with links to all the talks, articles etc for the year.)

The three things that stood out for me this year were:

  1. Conference Overload
  2. MongoDB + RocksDB (aka “mongo grows up”)
  3. Leaving Parse.

The Year of Conference Insanity

In 2015 I gave talks at 21 conferences, which was … excessive.

This wasn’t really intentional.  But I was bored and restless and unhappy a lot, and had many erratically available spare cycles.  Being a manager at a big company makes depressing Swiss cheese of your calendar.  You can’t really carve out the heads-down time you would need in order to do challenging technical work, but it’s easy to go on lots of short trips to conferences.  And travel is nice.  So I just kind of said “yes” to everything.

I really appreciate all the brilliant people and organizers who welcomed me to their events this year, and all the many random, unexpectedly lovely connections I made there.

Next year will be very different.

that one day i packed lighter fluid in to work to burn things because reasons

MongoDB + RocksDB

When I started running MongoDB in 2012, we were on version 2.0 and it had a single lock.  Yep, just the one!  This was a SUPER fun way to try and build a sophisticated multi-tenant platform.

Mongo was rolling out performance improvements with every release.  But shit actually got real when they put the storage engine API on the roadmap and acquired WiredTiger.  The Facebook RocksDB team enthusiastically pitched in too, providing feedback on the API design and ultimately delivering a full-fledged implementation of MongoDB with RocksDB storage engine.

the RocksDB “marketing” team, with bonus Peter Zaitsev

Parse spent the past 1.5 years being the alpha customer for MongoRocks.  We developed a load testing framework for replaying production workloads, evaluated TokuMX and WiredTiger as well as Rocks, worked with the RocksDB team (aka Igor Canadi) to iron the kinks out of the MongoRocks implementation and figure out how to run and monitor the damn thing, and then got it rolled out to 100% of our production replica sets in under a year.  (Worth it?  With Rocks we used 1/10th the storage space, and writes were 50-200x faster compared to mmap.)

Asya, Domas and Mark Callaghan at the inaugural MongoDB Storage Engine Summit

I’ve said this before but I really believe it: 2015 is the year that MongoDB grew up and became a “real database”.  This was Mongo’s InnoDB moment.  Remember how much shit everyone used to give MySQL back in the MyISAM days?  Hey guess what, software actually can get better!  This is one reason it has been a special delight getting to work with Mark Callaghan.  Mark was one of those engineers who turned MySQL from a punch line into a “real database” which powers much of the internet.  So it was kind of like having this guy around.

I loved having a front row seat to mongo’s awkward adolescent phase.  I loved getting to play a small role in promoting storage engine diversity.  With Percona now offering enterprise support for both MyRocks and MongoRocks, I think the project is in good hands, and I’m really really fucking proud of this.

Leaving Parse

I said goodbye to Parse in early September.  This was one of those emotionally overwhelming life changes that somehow manages to feel both too soon and way, way overdue.  I spent 3.5 years working on Parse — one year pre-acquisition, 2.5 years at Facebook.

my going-away cake.  yes, it is a unicorn shitting rainbows with a purple tutu and sparkly silver slippers.  thank you nancy!!

I like to think of myself as this crusty backend engineer who sighs wearily and doesn’t give a shit about product, but the fact is I fell in love with Parse on the very first day.  I am so, so proud of what we built.  We built a thing that people really cared about.  We empowered so many developers to build and create and make whole new businesses on our platform.

There is nothing as heady as getting to pour yourself into a product that people are passionate about, tackle problems that are hard and might actually be impossible, with a team that blows your socks off with their brilliance and joy and hilariousness on a daily basis.

the amazing parse production engineering team.  my boys 4 lyfe

So yeah, it was hard to let go.  But it was time.  The 3-4 hour commute was fucking killing me, and it had been a long, *long* time since I felt like I was learning new things or pushing my boundaries as an engineer or a leader.  I need a certain amount of chaos and panic in my life.  Being comfortable for too long just makes me miserable.

Parse is going to be a hard act to follow.  Thanks for raising the bar, motherfuckers.

quitting day

In summary

2015 was ridiculous and awesome, and I want 2016 to be ridiculous and awesome too, but in some radical and fantastically new ways.  I have many hopes and intentions for 2016, but I’ll save that for another time.

Oh, but my single greatest achievement of 2015?  Is *clearly* this tattoo.


12 months, 11 appts, about 30 hours of inking.  thank you micah riot.


Hello world

“Start a blog” has been on my todo list for so long, it’s going to leave a gaping void in my life when I check it off.

There are some things I have been wanting to talk about for a while now, in a way that’s less ephemeral than twitter.  Like how the tech industry is terrible at interviewing and hiring, and some ideas for making it slightly less terrible.  Or the appalling proliferation of new ways to bootstrap your infrastructure over the past couple years.  Tips for how to survive when your startup gets acquired.  Stuff like that.

The first thing on the list tho will be a wrap-up of all the talks and articles and shit that I did in 2015, because Caitie did it and it was amazing and inspiring.  ❤

This post is basically for me to try and figure out how to use wordpress themes.  But here, just so that it isn’t 100% devoid of content let me state for the record where “@mipsytipsy” comes from.  When I was in school we had an assembly language class where we had to use SPIM to emulate the MIPS R3000 processor.  The class was wretched, but “spimmy” became my irc nick.  During my Everquest years I played an enchanter named Mipsy Tipsy, and now I’m just too lazy to come up with anything new.

A surprising number of people still call me “Spimmy” in real life.  Heh.


photo credit: i don’t remember whose hands these were.  maybe @mosheroperandi?