Portland, Oregon
2017-05-22 to 2017-05-24
Description
Living #opslife makes us keenly aware of the cavernous gap between lofty ideals and 3am reality. In a perfect world, everyone would be devopsing sans effort. In the real world, sharing oncall is not as easy as giving devs prod AWS creds, adding them to the rotation, and saying "good luck! have fun!"
Multi-team oncall means caring enough to let go, while not letting your beloved co-workers fall (and fail) unsupported. Distributing understanding requires building trust. Start by letting go of control, acknowledging the tribal knowledge you need to externalize, and untangling the tightly coupled bits. Feature teams have their own parts to play, such as writing health checks which aren't aspirational. (Don’t return 200 OK when broken, unless your site is “This is fine” dog as a service.)
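As a minimal sketch of a non-aspirational health check (not code from the talk): the status code is derived from real dependency probes, so a broken service answers 503 instead of 200. The names `health_status`, `database_reachable`, and `queue_reachable` are hypothetical, and the probes are stand-ins for whatever your service actually depends on.

```python
from typing import Callable, Dict, Tuple

# A probe returns True when its dependency is usable (hypothetical convention).
Check = Callable[[], bool]


def health_status(checks: Dict[str, Check]) -> Tuple[int, Dict[str, str]]:
    """Run every probe and return (HTTP status code, per-check report)."""
    report: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            report[name] = "ok" if check() else "failing"
        except Exception as exc:
            # A probe that raises counts as a failure, not as "probably fine".
            report[name] = f"error: {exc}"
    healthy = all(state == "ok" for state in report.values())
    # 503 Service Unavailable tells load balancers and monitors the truth.
    return (200 if healthy else 503), report


def database_reachable() -> bool:
    # Stand-in probe; a real one might run a trivial query with a short timeout.
    return True


def queue_reachable() -> bool:
    # Stand-in probe that simulates a dependency outage.
    raise TimeoutError("queue did not answer")


if __name__ == "__main__":
    status, report = health_status(
        {"database": database_reachable, "queue": queue_reachable}
    )
    print(status, report)
```

Returning the per-check report alongside the status code gives whoever is holding the pager a starting point beyond "it's down".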
Architectural considerations such as microservices can really save your hipster soy-based breakfast strips, ensuring oncall doesn’t have to be an all-or-nothing scenario across your entire org. Containers and functions as a service are implementation details, as are features of your particular cloud, but in general, go for self-healing and redundancy. Treating infra as an armada, not a yacht, will keep you shipping.
From tightly-guarded fiefdoms to “of course all the devs are on call” to carefully negotiated compromises, I’ve lived this movie enough times to see what works (and what definitely doesn’t). I spent 1999 to 2015 on call for production infrastructure and made mistakes so you don’t have to! Spoiler alert: instead of volunteering as tribute to the vagaries of the pager, volunteer to invest in your architecture and your co-workers; you’ll sleep better at night.
Slides
Video
Monitorama PDX 2017 - Bridget Kromhout from Monitorama on Vimeo.
Tweets
How many people here have a visceral dread of a ringing phone? omg yes so much this @bridgetkromhout #monitorama
— Mercedes Coyle (@benzobot) May 23, 2017
Last talk of the day is the unstoppable awesomeness that is @bridgetkromhout #Monitorama pic.twitter.com/mQ60rzRhm0
— Matt Broberg 📊 CHAOSScon / Fosdem (@mbbroberg) May 23, 2017
I volunteer as tribute, the future of On-Call - @bridgetkromhout #monitorama #monitorama2017 pic.twitter.com/p7aWROQtlh
— Andrew "Medium Data" Rodgers (@acedrew) May 23, 2017
Being on call from 1999-2015 broke me so deeply that I report up to Marketing now @bridgetkromhout #monitorama
— Mercedes Coyle (@benzobot) May 23, 2017
"I was oncall from 1999 to 2015, and it broke me because I now report into marketing" - @bridgetkromhout #monitorama pic.twitter.com/G9rrRMU5bo
— bletchley punk (@alicegoldfuss) May 23, 2017
Ditto
— Jérôme Petazzoni (@jpetazzo) May 23, 2017
It was a more innocent time, all we had to worry about was Mongo's global write lock - @bridgetkromhout on #monitorama 2014 pic.twitter.com/jN51ttHP6G
— Andrew "Medium Data" Rodgers (@acedrew) May 23, 2017
Watching @bridgetkromhout map out the future of Oncall. #monitorama and making things contextual...
— pteralix - the birdywhirl (@CadsOakley) May 23, 2017
Let's celebrate people who settle, productionize, and maintain, in addition to those who build shiny and new -@bridgetkromhout #monitorama pic.twitter.com/3LU31Iy3yT
— Ryn Daniels (@rynchantress) May 23, 2017
Oh golly @bridgetkromhout has a 3cute5me bunbun in her slide at #Monitorama! Thinking on her Laura Bell quote: "... be the settlers instead" pic.twitter.com/NbXid6dSe3
— Avi 🐰🏳️‍🌈🏳️‍⚧️ (@_llzes) May 23, 2017
"...not saying you should just YOLO stuff into production, because someone's gotta monitor that stuff, but..." @bridgetkromhout #monitorama pic.twitter.com/9TVbxcwwW5
— Some Guy (@guycirino) May 23, 2017
"Sometimes YOLO works out."
— Matt Broberg 📊 CHAOSScon / Fosdem (@mbbroberg) May 23, 2017
Past company success is not recommended by @bridgetkromhout 😂 #Monitorama
This slide is so perfect. @bridgetkromhout #monitorama pic.twitter.com/mbjO82lZZW
— raine, spider goddess (@queerops) May 23, 2017
I’m always saying this > “but #opslife means I’m a cynical realist” @bridgetkromhout #monitorama
— Sarah Z (she/her) (@szelechoski) May 23, 2017
Blaming people is not going to solve the problem faster. #monitorama @bridgetkromhout
— Dawn Parzych (@dparzych) May 23, 2017
Even if you don't ssh into the servers, the servers still exist and you still have to care about architecture @bridgetkromhout #monitorama pic.twitter.com/7nyqCjHJUF
— Ryn Daniels (@rynchantress) May 23, 2017
"Blaming people is not going to solve the problem faster." - @bridgetkromhout , #monitorama2017
— Jam Leomi 🏳️‍🌈 🎶 (@jamfish728) May 23, 2017
"It's easy to get caught up in hype driven development..but what problem r you trying to solve and how does that fit in your organization?" pic.twitter.com/UAQPVokgEv
— Some Guy (@guycirino) May 23, 2017
love the shoutout to this tweet from @bridgetkromhout #monitorama https://t.co/PVZAlKgPIw
— brandon burton (afk) (@solarce) May 23, 2017
FLATTERED
— mark mcbride (@mccv) May 23, 2017
"Your org structure is not what your org chart says. People talk to each other." @bridgetkromhout #monitorama pic.twitter.com/gTb4xVEK6d
— bletchley punk (@alicegoldfuss) May 23, 2017
You have to align incentives to have support across the organization. @bridgetkromhout #monitorama
— Dawn Parzych (@dparzych) May 23, 2017
“[dev] team is partying bc they shipped, [ops] team is preparing for a week of hell” - @bridgetkromhout on aligning incentives #monitorama
— Sam Stokes (@samstokes) May 23, 2017
@bridgetkromhout on need to build trust across #developers and #DevOps @Monitorama #monitoringlove pic.twitter.com/Sk0tz65VK7
— JP Marcos (@SignifAICEO) May 23, 2017
"We're all on team rabbitduck" - @bridgetkromhout #monitorama
— Philip J. Hollenback (@philiph) May 23, 2017
"Put why in your commit messages." @bridgetkromhout #Monitorama
— Rich Burroughs (@richburroughs) May 23, 2017
Put the *relevant* people on call - don't have your ops team as very angry named pipes that just page the devs @bridgetkromhout #monitorama
— Ryn Daniels (@rynchantress) May 23, 2017
"angry named pipes" - that's brilliant
— Jeff Sussna (@jeffsussna) May 24, 2017
"We can't solve these problems by throwing human misery at them". - @bridgetkromhout #monitorama
— Aditya Mukerjee, the 🦦-ific 🏳️‍🌈 @ GoDays (@chimeracoder) May 23, 2017
Bridget Kromhout - I volunteer as tribute: the future of oncall. @bridgetkromhout #monitorama pic.twitter.com/T27dMqEC0K
— Sarah Huffman (@dangerpudding) May 23, 2017
Building on top of human misery doesn't scale. @bridgetkromhout #monitorama pic.twitter.com/zdQU8Y1Kx2
— kitchens (@this_hits_home) May 23, 2017
We can't keep asking people to pull bunnies out of a hat. One day there will be no bunny. There will only be lots of overtime and burnout.
— kitchens (@this_hits_home) May 23, 2017
also it's disingenuous and insulting to people who build infrastructure and have made this abomination possible because we automate so much.
— Levi DeHaan (@levidehaan) May 23, 2017
Apropos https://t.co/WgMOWHNcYY
— Jackson Faddis (@betabit) May 23, 2017
"Being detailed when you commit is a gift to future you. Put 'why' in your commit messages." - @bridgetkromhout at #Monitorama
— Frank Mitchell (@onefrankguy) May 24, 2017
Love these notes on my #monitorama talk by @dangerpudding! 💖✨👌 https://t.co/cGCZgcWlgx
— Bridget Kromhout (@bridgetkromhout) May 24, 2017