How CERN uses Big Data in practice
The LHC’s sensors record hundreds of millions of collisions between particles, some of which achieve speeds of 99.9% of the speed of light as they are accelerated around the collider. Clearly, this generates a huge amount of data – the LHC alone generates around 30 petabytes of information a year.
This data is analysed by algorithms that are programmed to detect the energy signatures left behind by the appearance and disappearance of the elusive particles CERN are looking for. The algorithms compare the results with theoretical data on how we believe such particles act. A match provides evidence that the sensors have found the target particles.
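To make the idea of comparing results against theoretical data concrete, here is a minimal sketch in Python. It is not CERN's actual analysis code: the bin counts, the two hypothesis names, and the use of a simple chi-square score are all assumptions chosen for illustration.

```python
# Toy comparison of an observed energy histogram against theoretical templates.
# All numbers and names below are made up for illustration.

def chi_square(observed, expected):
    """Sum of squared differences, scaled by the expected count in each bin."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

def best_match(observed, templates):
    """Return the template name with the lowest chi-square score, plus all scores."""
    scores = {name: chi_square(observed, exp) for name, exp in templates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical predictions: event counts per energy bin under each hypothesis.
templates = {
    "background_only": [100, 80, 60, 40, 20],
    "background_plus_signal": [100, 80, 90, 40, 20],
}

# Hypothetical counts recorded by the sensors.
observed = [98, 83, 88, 41, 19]

name, scores = best_match(observed, templates)
print(name, scores)
```

The lower the score, the closer the recorded data sits to that prediction. In this toy case the excess in the third bin makes the signal hypothesis the better fit.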
In July 2012, scientists at CERN announced that they had found a particle consistent with the Higgs boson, and in 2013 they confirmed the identification – a huge step for science. The existence of the particle had been theorised for decades but could not be proven until technology on this scale was developed. The discovery has given scientists unprecedented insight into the structure of the universe and the complex relationships between particles.
The technical details
The LHC collects data using light sensors to record the collisions of protons accelerated around the collider, and the fallout from those collisions. Sensors inside the collider pick up light energy emitted during the collisions and from the decay of the resulting particles, and convert it into data that can be analysed by computer algorithms. Much of this data, being essentially photographs, is unstructured.
The Worldwide LHC Computing Grid is the world’s largest distributed computing network, spanning 170 computing centres in 35 different countries. The 300 gigabytes per second of data provided by the seven CERN sensors is eventually whittled down to 300 megabytes per second of “useful” data. This data is made available as a real-time stream to academic institutions partnered with CERN.
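The scale of that whittling down is easy to check with a few lines of Python. This is only arithmetic on the two rates quoted above, assuming decimal units (1 gigabyte = 1,000 megabytes); the variable and function names are illustrative.

```python
# Arithmetic on the two data rates quoted in the text, using decimal units.
GB = 10**9  # bytes in a gigabyte
MB = 10**6  # bytes in a megabyte

def reduction_factor(raw_bytes_per_s, kept_bytes_per_s):
    """How many times smaller the kept stream is than the raw stream."""
    return raw_bytes_per_s / kept_bytes_per_s

raw_rate = 300 * GB   # data coming off the sensors each second
kept_rate = 300 * MB  # "useful" data kept each second

factor = reduction_factor(raw_rate, kept_rate)
kept_fraction = kept_rate / raw_rate

print(f"Reduction factor: {factor:.0f}x")
print(f"Share of raw data kept: {kept_fraction:.1%}")
```

In other words, roughly one byte in every thousand survives the filtering.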
Ideas and insights you can steal
CERN’s computing grid shows how distributed computing makes it possible to carry out tasks that are far beyond the capabilities of any one organisation to complete alone. Thanks to distributed systems, we can store data anywhere, across a number of different locations, and still find and access it quickly. This has had a dramatic impact on how much data companies can work with.
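As a rough illustration of storing data across several locations and still being able to find it, the sketch below spreads datasets over a handful of sites and records where each copy lives. It is a toy, not how CERN's grid actually places data: the site names, dataset names, and hash-based placement rule are all hypothetical.

```python
import hashlib

# Hypothetical storage sites; a real grid has many more.
SITES = ["site-a", "site-b", "site-c", "site-d"]

def place(dataset, sites=SITES, replicas=2):
    """Pick `replicas` distinct sites for a dataset, based on a hash of its name."""
    digest = hashlib.sha256(dataset.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(sites)
    return [sites[(start + i) % len(sites)] for i in range(replicas)]

# A simple catalogue: dataset name -> sites holding a copy.
catalogue = {name: place(name) for name in ["run-001", "run-002", "run-003"]}

# Finding a dataset is then a single lookup, wherever the copies live.
locations = catalogue["run-002"]
print(locations)
```

Because the placement depends only on the dataset's name, any partner can work out where to look without asking a central server, and keeping two copies means one site going offline does not make the data unreachable.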
You can read more about how CERN is using Big Data to drive success in Big Data in Practice: How 45 Successful Companies Used Big Data Analytics to Deliver Extraordinary Results.
Bernard Marr is a bestselling author, keynote speaker, and advisor to companies and governments. He has worked with and advised many of the world's best-known organisations. LinkedIn has recently ranked Bernard as one of the top 10 Business Influencers in the world (in fact, No 5 - just behind Bill Gates and Richard Branson). He writes on the topic of intelligent business performance for various publications including Forbes, HuffPost, and LinkedIn Pulse. His blogs and SlideShare presentations have millions of readers.