On Call Duty
March 11, 2018
[sysadm] [devops] [management]
Recently, there was a flurry of tweets about on-call and SRE duty schedules. It triggered a memory of a conversation at a manager meeting about a new product offering:
Manager 1: On call duty? We can’t ask our developers! They work 9 to 6, Monday through Friday. That’s it. Remember when it was 9-5, hmm?
Me: Our servers run 24x7x365.
Manager 2: The developers are our most precious resource! We can’t piss them off. They’ll quit!
Me: You seem to misunderstand the role of a senior developer and yours as manager.
A feature (or any other chunk of code) is never really “done” until the code is erased from production. All of us must support it, somehow. Bugs happen. Customers call. We are paid for working systems. Clicking “Deliver” and “Accept” is not the end of our responsibility.
As a manager, it is your job to express the importance of production code to your staff. They are not royalty, nor privileged-spoiled-generation whatever. In fact, when I’ve had conversations with my directs, most have been remarkably pragmatic and supportive once they were informed about the value of on-call duty.
More importantly, if you give your staff the trust and respect that comes with, “there is no dev/ops divide, there is only the product and the success of our organization,” then they will come up with their own solutions to the problem: What to do when things go pear-shaped.
- Support tiers
- Run books
- Reasonable response time windows
- QC tests
- Production smoke tests
- Production health checks
Let’s face it, most applications are not mission- or life-critical, and those that are require live monitoring and response teams, right? We must also recognize that off-hours duty can be a burden. Young children, elder care, and other commitments compete for our time. We don’t own a staffer’s entire 24-hour day, so make accommodations! On-call is not the same as watch-standing. You don’t have to be awake for duty time, just reachable on short notice.
- Flex time
- Easily traded slots
- Bonuses or extra vacation
- Customized, pre-configured laptops with a mondo cell data plan & up-to-date dev environment
- A phone
- An on-duty bag
- Trick up the laptop with games, maybe
There are lots of creative ways to make it work. Run retrospectives to make it better. Finally, if a staff member absolutely can’t be present off-hours because life happens and the original job description did not include off-hours duty, don’t penalize them. It’s not their fault that your business plan failed to predict one of the most obvious needs of a 24x7 operation.
I’ve been on-call 24x7x365 minus vacation, in daily, weekly, and monthly rotations, and not at all. I’ve been so tired that I missed the call and the boss had to call my wife to wake me up. It doesn’t happen too often, and I’ve accepted the responsibility as part of the job. I mostly enjoy it because I get to talk to real customers and hear about their pain points. I get a better sense of what’s important to them than any 3-hour death-by-powerpoint could ever give me. The customers also see that there are real, compassionate human beings supporting them and their work. Because that’s the real thing: We build stuff to help other people solve problems. Think about that the next time you drag out your luser hashtag.