At a conference hosted by Harvard and MIT, schools using the open-source edX platform agreed on a common data structure for their online courses, with the goal of facilitating research on how students learn.

With online courses now part of the mainstream, colleges and universities are collecting terabytes of data on how students interact with their systems and content. But most schools gather this data according to their own specs, which makes comparisons difficult for researchers trying to identify broader trends.

However, this may all change in the wake of a conference hosted by Harvard and MIT this August that saw a dozen schools implement a standardized data structure for MOOCs and other online courses using the Open edX platform. The goal: Create a better understanding of how students learn online and improve instructional approaches accordingly.

“The biggest issue we’re trying to address right now is the feedback loop in online learning—going from content back to better instruction and better material,” said Daniel Seaton, a research scientist at Harvard’s Office of the Vice Provost for Advances in Learning. “We’re helping our partner schools build infrastructure that is capable of analyzing large data sets, which can be quite detailed and messy.”

A New Infrastructure

At the heart of this infrastructure are data standards intended to give researchers a common foundation on which to build. “A big part of the conference involved helping individual institutions set up a workflow that will allow them to extract data about how students interact in MOOCs and online courses, and put them in a usable format,” added Dustin Tingley, a professor of government at Harvard. “An important part of any future collaborative process is a common standard for how certain things are calculated and what specific types of data sets are produced.”
