Though Google Book Search has the largest collection of digitally scanned books of its kind, and the initiative has captured a great deal of attention, researchers should keep in mind there are other book-search sites that can help them find rare works as well, a university librarian says.
And while the scanning of traditional paper books to digital format is making it easier for researchers to locate hard-to-find information, the available services have drawbacks, including copyright issues and problems with optical character recognition (OCR) technology.
Books are available digitally in multiple locations and formats, but with limited overlap, and Google Book Search dominates the field, said Greg R. Notess, reference team leader at the Montana State University library.
“Four years ago, Amazon had more material, but now Google Books dominates,” he said Sept. 14 at the WebSearch University conference in Washington, D.C., adding that a settlement between Google and book publishers is still pending. Google’s 10-month-old settlement with groups representing U.S. authors and publishers would allow the company to act as its partners’ sales agent.
More than 10 million books already have been scanned into Google’s electronic index since 2004. The settlement would clear the legal hurdles that have been preventing Google from stockpiling millions of copyrighted books that are out of print. Because those books are scattered among libraries across the nation, they’re inaccessible to most people.
“There’s a lot of publicity and press about how wonderful it is to have all of these old books up and available. I agree; I love more data and more stuff I can play with and search out there. It doesn’t mean that they’re necessarily easy to use, especially if you go back in time,” Notess said.
Other noted book-search sites include WorldCat, Hathi Trust, Open Library, and Internet Archive Texts, as well as some commercial publisher sites that offer access for a fee.
But even though Google Book Search has improved availability, Notess said problems with scanning and OCR technology make searching for information harder on all sites.
“When searching for something … I can fully expect that [words] would be transliterated and OCR’d in multiple ways,” Notess said. “Bear in mind when you’re searching, especially for older data, that you’re not getting everything unless you think creatively.”
Other issues with online book-search services include access that in some cases is limited to subscribers, overly general subject headings, and catalogs that haven’t been kept up to date.
Notess offered strategies to help researchers locate the information they are looking for. For example, he suggested that researchers check multiple sources while searching in multiple ways, such as by title, author, publisher, publication date, keyword, and phrase.
Internet Archive Texts