By now, the news has spread rapidly in security circles and on mailing lists about an exploit to the WordPress software less than or equal to version 2.1.3. To give you some background, we had held off on upgrading to version 2.2 that came out last week due to bugs in the software that we felt were unacceptable to our company. Nothing critical, but as we are nearly finished rolling out new themes that are all widgetized to the network, I felt that lingering widget bugs were pretty critical to our platform. The decision was made on release day that we would not upgrade until WP 2.2.1 was released next month sometime.
That was the plan as recently as Monday evening. But something changed quickly and I want to give you a window into how the team worked together to avert a crisis. As this timeline is fairly raw, I hope it gives some perspective on how we are able to react and triage situations quickly and put issues to rest all the time. We don’t always have critical security flaws, but we do work together to problem solve on a daily basis. This is how we roll.
Monday, May 21 – 10:52PM EDT
An email is sent to the WordPress hackers mailing list alerting the community of a posted exploit to all versions of WordPress under version 2.1.3.
Monday, May 21 – 11:17PM EDT
Exercising caution as with all security alerts, I carefully setup a test and run proof of concept script against one of our blogs. Threat confirmed.
Tuesday, May 22 – 12:07AM EDT
I forward the notice to the tech team for them to digest in the morning.
Tuesday, May 22 – 8:21AM EDT
Brian Layman confirms threat and indicates that our upgrade timeframe decision has been made. I agree.
Tuesday, May 22 – 9:22AM EDT
Sean Walberg, our systems administrator, suggests we delay upgrade until peak traffic time is passed. Already, we were under a Digg storm and we did not need to exacerbate issues with an upgrade.
Tuesday, May 22 – 9:40AM EDT
Channel Editors notified of the problem and the impending upgrades and are given instructions to change passwords after the upgrade.
Tuesday, May 22 – 2:30PM EDT
Brian Layman and I work up more verification of the exploit by analyzing and executing the code against further targets on our next work. Re-confirmed.
Tuesday, May 22 – 4:30PM EDT
Upgrade script and subversion repositories prepped for switch to WordPress 2.2. We chose revision 5505 as most of the widget issues we were initially concerned with were addressed prior to this revision. Core plugin set re-evaluated by team. Eliminated one plugin due to security.
Tuesday, May 22 – 6:00PM EDT
Upgraded Tech channel and verified functionality of widgets, in particular.
Tuesday, May 22 – 9:00PM EDT
Upgraded entire network to r5505.
Tuesday, May 22 – 9:30PM EDT
Support, support, support. Reports roll in regarding broken this and that – mostly having to do with plugins and widgets. Solve almost all except a weird database error on one blog.
Tuesday, May 22 – 10:40PM EDT
Major bug discovered – well, not major for WordPress, but certainly for us from a user experience perspective.
Wednesday, May 23 – 12:35AM EDT
Reupgraded network to r5520 which included further fixes for widgets.
All in all, because we have created tools and standardized everything we do, we are able to avert problems before they become problems. We do it all the time for big problems and small. Folks who run networks, whether blog networks like b5media or simply groups of blogs that are maintained by the same person or group can choose to upgrade blogs by hand, one by one, or sit on the problem hoping to not be attacked “until the weekend”, or they can take attacks seriously, use tools that assist in upgrading (Brian’s upgrade script is very good too) and be done very quickly and efficiently.
Our upgrade of over 200 blogs was completed in 30 minutes and 6 seconds – a slowdown from earlier reported times based on instituting a pause between each upgrade. Our time of execution from problem introduction to problem solution? Less than 24 hours.