The top three reasons to use big data

David Black addressing the audience of his Big Data session during the High School Research Teachers Conference. SOCIETY FOR SCIENCE & THE PUBLIC

Big data is a hot topic of discussion in the world today. Tech companies, government organizations and leading academic institutions continue to compile massive amounts of information that could help solve global challenges. As big data has given rise to new fields, from data analytics to data science, the ability to interpret and understand it is becoming an increasingly valuable skill.

Erik Mohlhenrich, a biology teacher at Princeton International School of Mathematics and Science in Princeton, New Jersey, and David Black, a science teacher at New Haven School in Spanish Fork, Utah, led a breakout session about big data in the classroom at the 2019 High School Research Teachers Conference. Erik and David shared how they apply big data in their own specialties, bioinformatics and earth sciences, and also touched on its wider applications.

There are many benefits to using big data. Here are a few Erik and David shared:

  • Accessibility: Big data offers the convenience of conducting scientific research wherever you are: the only tool needed is a computer. For instance, David uses maps and graphs provided by geological associations to teach his students the fundamentals of recognizing patterns and trends. Because geological survey data is so accessible, he can more easily teach his students how to read charts and topographic maps for any set of coordinates. With huge amounts of information just a click away, big data can help students conduct their own research projects too.
  • Breadth of data: Big data spans a large number of fields, from astronomy to molecular biology. Because these data sets are so vast, David explained, the great majority of them are collected using government instruments. Big data can include anything from DNA sequences to maps or even audio files of bird calls collected in the Amazon rainforest. NASA, the National Oceanic and Atmospheric Administration (NOAA) and the National Geographic Survey are a few potential sources of data that can be used in the classroom and beyond. In many cases, the data spans years or even decades, and some organizations have more data than can be sifted through in a lifetime. This vast amount of information can lead to a wider range of conclusions than students could reach by collecting their own data in a limited amount of time.
A STRING visualization of protein networks associated with a rare congenital brain disorder.
  • Creativity: Big data also allows students to get more creative, in both the questions they ask and the conclusions they draw. Erik shared an example of one student who was researching a possible connection between sleep deprivation and Alzheimer’s disease. Using STRING, a free database of protein-protein interactions, the student looked for gene combinations that could be contributing to the disease and examined their connections to sleep deprivation. Data can be interdisciplinary in this way. The only limit in finding patterns and correlations is the researcher’s imagination.
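
For readers curious what a search like the STRING example might look like in practice, here is a minimal Python sketch. It is not the student's method and it does not call STRING itself. It assumes an interaction list has already been downloaded, and every protein name in it is a made-up placeholder rather than real data. The helper names build_neighbors and shared_partners are hypothetical, chosen only for this illustration. The idea is simply to find proteins that interact with at least one member of each of two groups.

```python
# Toy interaction list: each pair means "these two proteins interact".
# All names are made-up placeholders, not real STRING records.
interactions = [
    ("SLEEP_A", "LINK_1"),
    ("SLEEP_B", "LINK_1"),
    ("SLEEP_B", "LINK_2"),
    ("DISEASE_X", "LINK_1"),
    ("DISEASE_Y", "LINK_3"),
    ("DISEASE_Y", "LINK_2"),
]


def build_neighbors(pairs):
    """Map each protein to the set of proteins it interacts with."""
    neighbors = {}
    for a, b in pairs:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors


def shared_partners(pairs, group_one, group_two):
    """Return proteins that interact with at least one member of each group."""
    neighbors = build_neighbors(pairs)
    near_one = set().union(*(neighbors.get(p, set()) for p in group_one))
    near_two = set().union(*(neighbors.get(p, set()) for p in group_two))
    # Exclude the starting proteins so only the "bridge" proteins remain.
    return (near_one & near_two) - set(group_one) - set(group_two)


sleep_related = ["SLEEP_A", "SLEEP_B"]        # hypothetical group
disease_related = ["DISEASE_X", "DISEASE_Y"]  # hypothetical group
print(sorted(shared_partners(interactions, sleep_related, disease_related)))
```

A shared partner found this way is only a starting point for a research question. It suggests where to look, not that one condition causes the other.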

One drawback the speakers mentioned was that big data is not always available immediately. Sometimes surveys are commissioned by specific individuals who get free rein over the data before it is released to the public.

Many Society alumni have used big data in their research. For instance, Broadcom MASTERS 2018 winner Georgia Hutchinson used NOAA data to design her data-driven dual-axis solar tracker. Similarly, Brian Wu (ISEF 2018-2019) used data collected by the MARVELS Radial Velocity instrument to search for undiscovered exoplanets. Using a computer program that compiles data collected by the instrument, Brian determined whether a signal detected from outer space was coming from a star or a planet.

Ultimately, there is more than one way to conduct scientific research. Big data is one tool to consider. While it can be empowering to march out into the field to collect your own water samples, it can be equally satisfying and productive to utilize the valuable data that’s already waiting at your fingertips.