In past issues of Lake Time Magazine, we’ve taken high-level looks at the Internet of Things, rural healthcare technologies, and the evolving electric grid. These are very different topics, but each has the potential to produce a massive amount of data. Storing this data has become an industry of its own, but unlocking the value in the data can be trickier. Introducing “Big Data.”

We’ve actually been collecting data for many years. However, the pace and volume of that data have been increasing exponentially in recent years as high-speed communications and networked devices become ingrained in our society. Big Data refers to the more recent trend of leveraging advancements in computational analytics, machine learning, and artificial intelligence to pore over data and make real-world use of this powerful resource.

A small-scale example may be found right on the side of your house. Many electric utilities are currently investing in advanced electric meter systems capable of providing instantaneous data to the utility on a regular schedule, when requested, or even when the meter detects sub-optimal situations. While some see this only as an opportunity to save money on in-person monthly meter readings, that rationale alone doesn’t pass a basic cost-benefit analysis. The investment, which can reach into the millions of dollars, only pays for itself when real-time data is put to use automating the substation and electric distribution systems and reducing the cost of delivering power to the end user. With such insight into the local grid, utilities can now better manage outages, line losses, power theft, and power quality issues, driving a healthier bottom line and providing a payback on the investment.

The electric utility example is one where data volumes are relatively easy to manage, analyze, and automate. Big Data has its most impressive impact when enormous amounts of data are applied to incredibly complex problems, such as the fight against cancer.

IBM’s Watson Health and the Watson Health Cloud put this theory into practice. Together they provide a computing resource capable of consuming and analyzing 10,000 scientific articles and 100 clinical trials in any given month, a pace previously unattainable for oncology professionals. In 2016 IBM announced a partnership with Illumina, a producer of genome data sequencing technologies. Together, Watson and Illumina will soon be able to match a patient’s specific genomic information with the treatment regimens most likely to be effective for that patient. While not a replacement for experienced medical professionals, it does give doctors a powerful new set of tools and has even given rise to a newly booming profession: the data scientist.

Data scientists are tasked with using these tools and various analysis techniques to study data and provide information-driven answers and conclusions. The role requires a broad knowledge of statistical and probabilistic analyses, an innate problem-solving skill set, and a natural curiosity about the answers often hidden behind the evidence.

It is estimated that by 2020 there will be more than 30 billion devices collecting data streams. Being able to interpret and take advantage of all this data will lead to great economic and societal benefits, enabling us to lead healthier and more productive lives.

While fossil fuels drove the Industrial Revolution, information is driving the Big Data Revolution. Fittingly, some even refer to Big Data as the new oil: a powerful and valuable commodity capable of transforming our way of life.