I attended the complimentary PagerDuty Summit on Sept. 13 on Market Street in SF.
The conference was well organized: two tracks downstairs, with breaks and a small expo area upstairs.
Andrew Fong of Dropbox gave a very good talk on their struggle to go from four 9s (“can use tactics”) to five 9s (“has to be strategic”). Their solution was a working group open to anybody who wanted to contribute, across departments, rather than dedicated HA staff.
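For a sense of scale, here is my own back-of-the-envelope arithmetic (not from the talk) on what each extra 9 means as a yearly downtime budget. It's a minimal sketch that assumes a 365-day year; the function name `downtime_budget_minutes` is just something I made up for illustration.

```python
# Rough downtime budget per year for an availability target of N nines.
# Assumption: a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_budget_minutes(nines: int) -> float:
    """Return the allowed downtime per year, in minutes, for N nines."""
    unavailability = 10 ** -nines  # e.g. 4 nines -> 0.0001 (99.99% available)
    return MINUTES_PER_YEAR * unavailability


for n in (4, 5):
    print(f"{n} nines: {downtime_budget_minutes(n):.2f} minutes/year")
```

That works out to roughly 52.6 minutes a year at four 9s and about 5.3 minutes a year at five 9s, which helps explain why tactical fixes stop being enough.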
Andre Kelly of Google talked about putting well-defined post-mortem processes in place now, so that outages are captured in an organized manner and the results can be data-mined over time.
Apparently there are some popular open-source post-mortem systems for that. Please leave a comment if you have any experience with those.
Sean Reilley of IBM discussed the people issues in communicating agile across a large company with pockets of staff who were used to waiting for permission (i.e., not inherently agile).
Upstairs, the mini-expo had a couple of booths for security-related start-up cloud products, Datadog, and a booth for PagerDuty itself to do customer demos and get beta feedback.
Sketch of New PagerDuty Incident Timeline Visualization Tool
The highlight was seeing their new beta graphical incident timeline, to be released in November, which made the trip worthwhile. Until then, you can enable HTML emails for a slightly richer experience.