[Image: Jon]

Building a scheduler is not as easy as it sounds. Oh wait, it doesn’t sound easy? Good, because it’s not, says Jon (pictured above), the developer who has been tackling the challenges of building the entire VictorOps product around a calendar. Since we’re all about making on-call scheduling easier for teams than ever before, the calendar is an obvious and essential part of the solution.

He was kind enough to tell me about a few of the challenges that have surfaced since he started working on it:

  • Date math – Quick question, hotshot. When is a month after March 7th? Is it April 7th (a month later in theory) or is it April 4th (which is exactly 4 weeks later)? Also, how many minutes are there in a month? Are you taking into account Daylight Saving Time, or whether it’s a leap year? Leap seconds? Not easy. A calendar is not logical, and therefore doing math with it is difficult. And Jon was honest when he said, “I knew enough to be afraid of date math.” It’s not just about making a calendar work; it’s about making the entire rest of the product work with a calendar.
  • Ability to customize – As Jon was soon to learn, “no two companies want to do things the same way” when it comes to on-call scheduling. Therefore, the scheduler had to be something that worked for everybody. One of the ways Jon has made this work is by building in the ability to create on-call schedules based on escalation policies instead of rotation policies. By pulling escalation up a level, so that it sits above rotations rather than inside them, we’re able to layer rotations to create arbitrary schedules. It’s not super clear right now, but we’re working on user interface designs that will make even the most complicated on-call schedules understandable.
  • Bit masking – In order to handle arbitrary windows of “scheduled” time, and to make the layering we just talked about work, Jon’s scheduler needed to be able to define the blocks of time during which a rotation is valid. Rather than listing the days and time blocks in the database directly, Jon chose to use bit masks. This way he can mask off when a rotation is valid (for example, day shifts versus night shifts) and easily query the rotations for validity at a given moment in time.
  • Starting rotation – When creating a rotation, you have to keep in mind certain variables that will affect the schedule, such as what happens when you add people or take them away. The VictorOps scheduler calculates the rotation based on now, not based on when you started. This means that we can’t show you the past (once a rotation is edited, the past is no longer calculable), but we won’t change your future. So, if you’ve planned a vacation based on when you’re next scheduled, that won’t change just because you hired another person. (See diagram below.)
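
To make the date math questions above concrete, here is a minimal Python sketch of the "month after March 7th" puzzle. It is only an illustration, not how the VictorOps scheduler does it: the function names are made up for this post, the year 2014 is an arbitrary choice, and the minute count deliberately ignores Daylight Saving Time and leap seconds.

```python
import calendar
from datetime import date, timedelta


def add_calendar_month(d: date) -> date:
    """Same day number in the next month, clamped to that month's last day."""
    year = d.year + (d.month // 12)   # roll over to next year after December
    month = d.month % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def add_four_weeks(d: date) -> date:
    """Exactly 28 days later, regardless of month length."""
    return d + timedelta(weeks=4)


def minutes_in_month(year: int, month: int) -> int:
    """Minutes in a month, ignoring DST shifts and leap seconds (assumption)."""
    return calendar.monthrange(year, month)[1] * 24 * 60


start = date(2014, 3, 7)                       # arbitrary example year
print(add_calendar_month(start))               # 2014-04-07
print(add_four_weeks(start))                   # 2014-04-04
print(add_calendar_month(date(2014, 1, 31)))   # 2014-02-28 (clamped)
print(minutes_in_month(2014, 2))               # 40320
print(minutes_in_month(2016, 2))               # 41760 (leap year)
```

Both answers for "a month later" are defensible, which is exactly the problem: the scheduler has to pick one meaning and apply it everywhere.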

[Diagram: on-call schedule]
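
The bit masking idea mentioned earlier can also be sketched in a few lines of Python. This is a guess at the general technique, not Jon's actual data model: the one-bit-per-hour-of-the-week layout, the helper names, and the shift hours are all assumptions made for illustration.

```python
from datetime import datetime

HOURS_PER_DAY = 24


def build_mask(days, start_hour, end_hour) -> int:
    """Set one bit per hour in [start_hour, end_hour) for each weekday in days.

    Assumed layout: bit 0 is Monday 00:00-01:00, bit 167 is Sunday 23:00-24:00.
    """
    mask = 0
    for day in days:
        for hour in range(start_hour, end_hour):
            mask |= 1 << (day * HOURS_PER_DAY + hour)
    return mask


def is_valid_at(mask: int, moment: datetime) -> bool:
    """Check whether the bit for this weekday and hour is set in the mask."""
    bit = moment.weekday() * HOURS_PER_DAY + moment.hour
    return bool((mask >> bit) & 1)


weekdays = range(5)                         # Monday through Friday
day_shift = build_mask(weekdays, 8, 16)     # hypothetical 8am to 4pm shift
night_shift = build_mask(weekdays, 16, 24)  # hypothetical 4pm to midnight shift

friday_morning = datetime(2014, 3, 7, 9, 30)
print(is_valid_at(day_shift, friday_morning))    # True
print(is_valid_at(night_shift, friday_morning))  # False
print(day_shift & night_shift == 0)              # True: the shifts never overlap
```

With a layout like this, checking whether a rotation applies at a given moment is a single shift and AND, and checking whether two layered rotations overlap is a single AND of their masks.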

  • What happens at midnight? – VictorOps doesn’t currently support crossing days, which means that you can’t schedule a rotation that lasts from 4pm to 4am. How to support this still needs to be figured out internally, but it largely becomes a UI problem, which Jon thinks will be fun to work on when the time comes.
  • Scheduled overrides – A question that we still need to answer when it comes to the scheduler is what to do when you’re creating a schedule override in advance. This might happen based on employee vacation calendars or a popular time when people aren’t around, like the holidays. The VictorOps scheduler doesn’t allow for this yet, but Jon is pretty sure that he’ll be tackling this one very soon.

As VictorOps went through Alpha and Beta, changes had to be made to the scheduler based on user feedback and product features. Jon is at two rewrites so far and knows that there will be more in his future. Although he freely admits that this was a problem he “didn’t really enjoy on a daily basis,” he takes solace in knowing that he’s not the only one who thinks that it’s a hard nut to crack.