A new breed of technologies is taking shape that will extend the reach of search engines into the web’s hidden corners, such as internet-connected databases they can’t penetrate today, reports The New York Times. Search engines rely on programs, known as crawlers, that gather information by following the trails of hyperlinks that tie the web together. While that approach works well for the pages that make up the surface web, these programs have a harder time penetrating databases that are set up to respond to typed queries. "The crawlable web is the tip of the iceberg," says Anand Rajaraman, co-founder of Kosmix (www.kosmix.com), a so-called "deep web" search start-up whose investors include Jeffrey P. Bezos, chief executive of Amazon.com. Kosmix has developed software that matches searches with the databases most likely to yield relevant information, then returns an overview of the topic drawn from multiple sources.
"Most search engines try to help you find a needle in a haystack," Rajaraman said, "but what we’re trying to do is help you explore the haystack." Another new approach comes from Prof. Juliana Freire at the University of Utah, who’s working on an ambitious project called DeepPeep (www.deeppeep.org) that eventually aims to crawl and index every database on the public web…


About the Author:

eSchool News

