It turns out there is a kind of data that, like black holes or evil wizards of Middle-earth, only becomes more powerful the larger it grows.
What’s more, according to researchers Enric Junqué de Fortuny, David Martens and Foster Provost, if you’re not gathering this kind of data at present, the new results suggest you may lose out to a competitor who is.
To understand what’s going on (stay with us, it’s worth it), you have to know whether your data is dense or sparse. Most businesses gather data about their customers and clients in a way that tells them a great deal about any one person. For example, you might question a handful of customers in depth, using a survey made up of dozens of questions.
This is “dense” data, with lots of information on every person, object or event you’re cataloging.
But the most useful kind of data, in part because it’s still hard to get your hands on or evaluate properly, is “sparse” data. This is the kind of data the web’s giants, like Google and Facebook, gather all the time. It’s “sparse” because all you’re getting is a few data points from any one person, out of the thousands or even millions you could in principle collect.
Take Netflix’s movie rating database, for example—if a person could rate all of the movies in Netflix’s database, Netflix would have perfect knowledge about that person’s tastes. But we can only watch and evaluate so many movies, so for most of the films in the database, Netflix knows zip about our tastes. Hence, sparse data.
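For readers who think in code, here is one rough way to picture the difference. It is only a sketch: the names, ratings and catalogue size below are invented for illustration, and nothing here reflects how Netflix or the researchers actually store or analyze their data. The idea is simply to count what fraction of the possible cells in each table is actually filled in.

```python
# Toy illustration only: every name and number below is made up.

# Dense data: few people, but an answer to every question for each of them.
SURVEY_QUESTIONS = 4
survey = {
    "alice": [5, 3, 4, 2],
    "bob": [1, 4, 4, 5],
}

# Sparse data: a huge catalogue, but each person has rated only a few titles.
# Only the ratings that exist are stored, keyed by a hypothetical movie id.
CATALOGUE_SIZE = 10_000
ratings = {
    "alice": {17: 5, 4203: 2},
    "bob": {17: 4, 88: 3, 9001: 1},
    "carol": {4203: 5},
}


def density(filled_cells, rows, columns):
    """Fraction of the rows-by-columns table that actually holds a value."""
    return filled_cells / (rows * columns)


survey_density = density(
    sum(len(answers) for answers in survey.values()),
    len(survey),
    SURVEY_QUESTIONS,
)
ratings_density = density(
    sum(len(rated) for rated in ratings.values()),
    len(ratings),
    CATALOGUE_SIZE,
)

print(f"survey table filled:  {survey_density:.2%}")
print(f"ratings table filled: {ratings_density:.2%}")
```

In this toy setup the survey table is completely filled in, while the ratings table is almost entirely empty, which is what "sparse" means in practice.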