Finding a needle of data in an expanding haystack

Tips on searching

Posted: Saturday, June 03, 2000

-- Go to where you think the answer will be. Your bookmark list may still be the fastest way to get an answer to your question.

-- Take a few minutes to read the help screens and look past the front door of a search site -- you might find some shortcuts and helpful tools you'd otherwise overlook.

-- Use the most specific search term you can think of.

-- Or try searching on more than one word at once. A study by market researchers the NPD Group showed that 45 percent of searchers use multiple-keyword searches, but 29 percent still do just one-word searches.

-- If you're looking for something really obscure, use a meta-search engine that makes a wide, clumsy sweep across a selection of different engines. The professional searchers surveyed chose Dogpile (www.dogpile.com) as their favorite, although Metacrawler (www.metacrawler.com) and ProFusion (www.profusion.com) also work.

By MARGOT WILLIAMS

Washington Post

WASHINGTON -- Searching the Web is like (a) opening a phone book with half the pages ripped out; (b) using a library with all the books strewn across the floor; (c) looking for a needle in a haystack; (d) all of the above.

''All of the above'' is correct for a lot of frustrated Web searchers, and the situation is only getting worse.

The problem is that the Web is just getting too vast -- it now consists of well over a billion pages and is growing, according to the NEC Research Institute's C. Lee Giles, author of studies on the accessibility and distribution of information on the Web.

''That's the publicly indexable Web,'' said Giles, speaking of the segment that the search engines ''crawl'' and index. ''There must be at least five times as much as what's indexed by the search engines.''

Consider what's happened to one of the first attempts to keep track of what's where on the Web: ''Jerry's Guide to the World Wide Web,'' a simple list of bookmarks that Stanford engineering student Jerry Yang started in 1994 and that grew into Yahoo, the most visited site on the Web today.

Now a portal with all kinds of bells and whistles, Yahoo has at its heart a human-made categorized directory of more than a million sites on all subjects. The Open Directory Project (www.dmoz.org) is another guide to all topics, compiled by a volunteer army attempting to shelve all the books on the library floor.

But humans just couldn't keep up on their own. Yahoo's concept of one big, categorized list of bookmarks needed to be backed up with searchable, computer-generated indexes of all the words on all the pages collected by spider programs crawling the Web. Such indexes include AltaVista, Go, Northern Light and the Inktomi index fueling HotBot, MSN and other search services.
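To make the idea of a computer-generated word index concrete, here is a minimal sketch in Python. It is an illustration only, not how any engine named above is actually built: the page names and text are made up, and the functions build_index and search are hypothetical helpers. The sketch records which pages contain each word, then answers a query by returning only the pages that contain every word in it.

```python
from collections import defaultdict
import re


def build_index(pages):
    """Map each lowercase word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the pages containing every word in the query, sorted by name."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


# Hypothetical pages standing in for what a spider program might collect.
PAGES = {
    "golf.example/tips": "Golf tips for beginners",
    "choc.example/recipes": "Chocolate recipes for beginners",
    "choc.example/history": "The history of chocolate",
}

INDEX = build_index(PAGES)
```

In this toy collection, search(INDEX, "chocolate") matches two pages, while search(INDEX, "chocolate beginners") narrows the result to one. That is the same effect the tip about searching on more than one word relies on.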

But as these computer-driven indexes grew from more than 1 million pages to 100 million pages and then to current claims of up to 500 million pages, the results of simple word searches -- what many people do -- became unwieldy and frustrating. And few users wanted to read the instructions for advanced search features that make retrieval more precise.

So the search scientists brought the people's choices back into the programs. Google, the favorite computer-made search engine in an informal survey of more than 40 professional news researchers, uses the Web community's expertise in its page ranking.

Just as you trust the links on a really good site to get you to other good pages, Google crawls the Web scooping up hyperlinks and uses them to figure out how important a page is by what's pointing to it.
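The idea of judging a page by what points to it can be sketched in a few lines of Python. This is a simplified link-analysis loop written for illustration, not Google's actual ranking. The function rank_pages, the damping value of 0.85, the number of rounds and the four-page link graph are all assumptions. Each page repeatedly passes a share of its score to the pages it links to, so a link from a highly scored page counts for more than a link from an obscure one.

```python
def rank_pages(links, damping=0.85, rounds=50):
    """Score pages by their incoming links with a simplified iterative scheme.

    links maps each page to the list of pages it points to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    # Start with every page equally important.
    scores = {page: 1.0 / n for page in pages}
    for _ in range(rounds):
        # Every page keeps a small baseline score regardless of links.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical link graph: three pages point to "hub", which points to "a".
LINKS = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "hub": ["a"],
}

SCORES = rank_pages(LINKS)
```

In this toy graph, "hub" ends up with the highest score because three pages point to it. Page "a" outranks "b" and "c" even though it has only one incoming link, because that link comes from the highly scored "hub".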

''I swear it's psychic -- I put in a few search terms and it almost inevitably finds me just what I need,'' said Susan Beachy, an information specialist at Fox News Channel.

The competition among search engines is making searches better.

''One of the best things that's been happening is the integration of diverse resources and techniques to present the user with more valuable results,'' said Randolph Hock, author of ''The Extreme Searcher's Guide to Web Search Engines.''

Subject specialization is another trend. There are searchable directories that only categorize the Web content in a specific area, such as law (findlaw.com), golf (golfsearch.com) or chocolate (the Chocolate Lovers' page, at chocolate.scream.org). There are limited-area, robot-made search engines that index only the sites relating to a specific topic or coming from a particular region or type of institution. Searchmil.com pinpoints pages on U.S. military sites; Researchindex.com indexes just computer-science papers.

''Specialized search engines are much more likely to be up-to-date and have things better indexed,'' said Giles.

What's next?

''We're moving away from the Web as a collection of documents,'' said Chris Sherman, About.com's Web searching guide. ''There's much more multimedia. And delivery to wireless devices is a natural progression.''

In April, Google announced a version of its full index optimized for the wireless application protocol (WAP) used by cell phones and Palm handhelds.

More personalized searching and intelligent agents that cater to you are coming, in the form of software companions such as GuruNet.com and Flyswat.com. Both let you click on a word in a Web page or other documents to find links to related information, such as definitions or news.


