Amid all the great successes of London 2012, both inside and outside the venues, this summer’s Olympic Games are also being hailed as the first to embrace big data.
Nearly 60GB of data per second is expected to travel across British Telecom’s networks during the Olympic period – the equivalent of 800 DVDs per minute.
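As a rough sanity check on that comparison, the arithmetic can be sketched in a few lines of Python. The 60GB-per-second figure comes from the estimate above; the 4.7GB capacity of a single-layer DVD is an assumption made here, and the function name is purely illustrative.

```python
GB_PER_SECOND = 60      # figure quoted in the article
DVD_CAPACITY_GB = 4.7   # assumption: one single-layer DVD


def dvds_per_minute(gb_per_second, dvd_capacity_gb=DVD_CAPACITY_GB):
    """Convert a data rate in GB per second into DVDs filled per minute."""
    return gb_per_second * 60 / dvd_capacity_gb


print(round(dvds_per_minute(GB_PER_SECOND)))
```

Under that assumption the result lands within about five per cent of the quoted 800 DVDs per minute, so the two figures are broadly consistent.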
Great Britain, which has made its digital switchover this year, has increased digital broadcasting for the Games. More than 2,000 hours of live coverage of Olympic events will be broadcast, and the London Olympics is expected to generate 30 per cent more data than the 2008 Games in Beijing.
Social media has also played a critical role in the Games. Twitter is expecting more than 13,000 tweets per second about the Olympics. Indeed, the official @London2012 account has 1,568,366 followers (as of 10am this morning). Moreover, an estimated 845 million people share Olympic news daily on Facebook, which has fewer than a billion users. Facebook users are expected to be responsible for more than 15 terabytes of data each day.
The official London Olympic website has generated masses of data too, with the number of views expected to top a billion before the Games are over. The world is expected to connect to the Internet via 8.5 billion computers, tablet PCs and smartphones over the course of the Games.
In other words, London 2012 has been the most digitally connected Olympics to date, resulting in more data being generated, stored and analysed than ever before. And helping the BBC and the Press Association (PA) – the two leading organisations covering the Games – cope with and make sense of this mass of data has been software provider MarkLogic, a sponsor at Big Data Insight Group Forums.
By creating a dynamic semantic publishing (DSP) architecture, MarkLogic has helped the two organisations connect huge data sets, including statistics gathered during the events, link articles with athlete profiles, handle huge surges in web traffic and provide disaster recovery to ensure no data or online services are lost. This has been delivered through an operational database that simplifies the challenge of linking each of these things across the BBC’s and PA’s websites.
John Pomeroy, MarkLogic’s European vice president, said: “There were two issues with the existing infrastructure. One, they were uncertain about how it would scale and the physical ability of it to manage the volume of data they have to deal with.
“The second issue was that it made it difficult to put together their websites in a dynamic way that would provide a simple interface for site contributors (journalists and photographers) and end users to navigate.”
Jem Rayfield, the BBC’s lead technical architect for the news and knowledge core engineering division, added that the new system has delivered a “greatly improved user experience” for the “high levels of user engagement”.
“DSP uses linked data technology to automate the aggregation, publishing and re-purposing of interrelated content objects. The approach enables the BBC to support greater breadth and scale, which was previously impossible using a static CMS and associated static publishing chain.”
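To give a feel for what automating aggregation with linked data can mean in practice, here is a minimal Python sketch. It is not the BBC’s or MarkLogic’s implementation: the triples, identifiers, predicate names and function names are all hypothetical, chosen only to show how content tagged with relationships can be assembled into a page without manual curation.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples; every identifier
# below is invented for illustration only.
TRIPLES = [
    ("article:1", "mentions", "athlete:a"),
    ("article:2", "mentions", "athlete:a"),
    ("article:2", "mentions", "athlete:b"),
    ("article:3", "mentions", "athlete:c"),
    ("athlete:a", "competesIn", "event:100m"),
    ("athlete:b", "competesIn", "event:100m"),
    ("athlete:c", "competesIn", "event:rowing"),
]


def build_index(triples):
    """Index subjects by (predicate, object) for quick lookups."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        index[(predicate, obj)].add(subject)
    return index


def articles_for_event(index, event):
    """Collect every article mentioning an athlete who competes in the event."""
    articles = set()
    for athlete in index.get(("competesIn", event), set()):
        articles |= index.get(("mentions", athlete), set())
    return sorted(articles)


index = build_index(TRIPLES)
print(articles_for_event(index, "event:100m"))
```

In this sketch, a journalist only tags an article with the athletes it mentions; the event page is then derived by following the links, rather than being maintained by hand as it would be in a static publishing chain.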