This is an overview of the work that went into producing WSN's live Quidditch World Cup coverage on November 13 and 14, 2010. The coverage is available at nyunews.com/quidditch.
One of my favorite interactives that I've seen lately is undoubtedly the World Cup live game tracker from The New York Times this summer. When I heard that the Quidditch World Cup was coming to New York City this year, I wanted to do something similar. This was only eight or nine days before the event, so I didn't have a lot of time. I had also never seen a Muggle Quidditch match played; in fact, the first game I ever saw was the first game we covered live.
Because of this, I wanted to build in the ability to track as much as possible, leaving open the possibility of ignoring certain types of events if the games became too fast-paced:
Data Entry
Knowing we'd be on a field without wifi, updating via laptop was out. So was phoning data home to HQ, since the physical location of events on the field couldn't be conveyed accurately by voice. Updates by iPhone seemed the best option: a touch screen to record locations, plus a more reliable internet connection than spotty wifi. And given the choice between learning Objective-C for the event and making an iPhone-optimized website, I went with the latter. (Also essential was the ability to push upgrades out to reporters immediately during the event, and to get the software onto any iPhone at a moment's notice, both of which a website was better suited to.)
To keep feedback quick, data was loaded from the server once at the start to set up which players were available, who had possession of the ball, and so on. After that, the only interaction with the server was to POST updates and receive a status reporting whether each update succeeded or failed. Everything else was pure jQuery.
After reporters selected a game to cover, they could do a number of things:
During our first game, we realized that unless we had four reporters covering each game, we wouldn't be able to keep track of all the data points originally planned for (Quidditch can end up moving a lot faster than I anticipated). The first day we often had two reporters covering each match – one to keep track of possession and the other to handle shots / goals. (By the second day, reporters could handle games solo.)
Client Side
Major thanks to The New York Times' project for some of the visual metaphors I ended up using. I'd done a fair amount of custom charting with jQuery in the past, so in the interest of speed and device compatibility, I decided to go with HTML and JavaScript over Flash, or even canvas. The data was refreshed from the server every four seconds.
The refreshed data had to land in several types of pages with different HTML structures (the game tracker on the homepage showed different data than the game detail page). To deal with this, I gave each piece of data a unique class based on its position in the JSON. So the piece of data at data.teams[1].possession.percentage would be assigned the "data__teams__1__possession__percentage" class. Every time the data was refreshed, the JavaScript used each element's class name to look up the matching value in the returned JSON.
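The class-name convention can be sketched in a few lines of plain JavaScript. This is a minimal illustration, not the original code: the function names pathToClassName and valueForClassName and the sample possession numbers are assumptions made up for the example.

```javascript
// Hypothetical helpers; names and sample data are illustrative only.

// Turn a path such as ["teams", 1, "possession", "percentage"]
// into the class name "data__teams__1__possession__percentage".
function pathToClassName(path) {
  return ["data", ...path].join("__");
}

// Walk the refreshed JSON using the path segments encoded in a class name.
// Returns undefined if the class is not a data class or the path is missing.
function valueForClassName(data, className) {
  const segments = className.split("__");
  if (segments[0] !== "data") return undefined;
  let node = data;
  for (const key of segments.slice(1)) {
    if (node === null || typeof node !== "object") return undefined;
    node = node[key]; // array indices work as string keys too
  }
  return node;
}

// Made-up sample payload in the shape the article describes.
const data = {
  teams: [
    { possession: { percentage: 58 } },
    { possession: { percentage: 42 } },
  ],
};

const cls = pathToClassName(["teams", 1, "possession", "percentage"]);
console.log(cls); // data__teams__1__possession__percentage
console.log(valueForClassName(data, cls)); // 42
```

On a real page, each refresh would presumably loop over the elements carrying a data__ class and write the looked-up value into each one. One limitation of this sketch: a JSON key that itself contains a double underscore would break the lookup.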
Because the data would need to be shown differently depending on the length of the game (the charts start off assuming a 30-minute game, but games would often run shorter or longer), all events were placed on a percentage-through-the-game scale. A master start and end timecode were refreshed with the data – as soon as one changed, the percentages would change accordingly.
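As a rough sketch of that scale, assuming timecodes are plain numbers in seconds (the original format isn't specified), the calculation might look like the following. The function name percentThroughGame and the clamping to 0–100 are choices made for this example.

```javascript
// Hypothetical helper: where an event falls, as a percentage of the game,
// given the master start and end timecodes (all in seconds here).
function percentThroughGame(eventTime, startTime, endTime) {
  const duration = endTime - startTime;
  if (duration <= 0) return 0; // game has no length yet
  const pct = ((eventTime - startTime) * 100) / duration;
  return Math.min(100, Math.max(0, pct)); // keep the marker on the chart
}

const MINUTE = 60;
const gameStart = 0;
const goalAt = 12 * MINUTE; // an event 12 minutes in

// Chart initially assumes a 30-minute game.
console.log(percentThroughGame(goalAt, gameStart, 30 * MINUTE)); // 40

// The end timecode moves out to 40 minutes; the same event shifts left.
console.log(percentThroughGame(goalAt, gameStart, 40 * MINUTE)); // 30
```

Because positions are recomputed from the start and end timecodes on every refresh, nothing else needs to change when a game runs long: every event simply lands at a new percentage.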
Issues
Up to four games could theoretically be running concurrently, and the data for each game was updated roughly every three to four seconds. The computation needed for each refresh alone could have tied up our original server, so this data was written out to flat JSON files that were served to clients directly, bypassing our main Django server as much as possible. If I'd had more time, I would have integrated it with S3, but thankfully we never quite reached a bottleneck.
Our real problems on the server side came in bursts. Reporters could email photos during the games to quidditch@nyunews.com and have them show up live on the individual game pages – large numbers were occasionally uploaded all at once and needed to be thumbnailed, invariably coinciding with the game data refresh cycle. Or, we'd change the image size on the front page slideshow, and suddenly 100 2MB images needed to be thumbnailed. These issues can all be avoided next year by sharding off the tasks to separate, smaller servers.
Going forward
I'd love to turn this into a more generic app that could handle any sport - though if I were to redo it, I'd probably choose Redis or MongoDB over Postgres for the backend. I'm currently talking with our Sports desk to see whether we want to do this type of coverage for more events throughout the year.
More importantly, this was our first case of interactive coverage in which we didn't rely on third-party data. Despite the relatively recent push for institutions and public offices to release more data to the media and public, good data sets that are both readily available and clean are few and far between. This project got us thinking about the situations in which we don't have to wait for data to be made available to us - where, with a little bit of technology to make the data gathering easier, we can collect that data ourselves, or with the help of others.
The Saturday and Sunday of the World Cup were our two most-trafficked days of the previous year.