Computer Search

Develop Your Computer Search Abilities

Finding information is much easier when you have strong Internet search skills.

INTRODUCTION

According to the results of a study published by Cyveillance in July 2000, the World Wide Web is estimated to contain more than two billion pages of publicly accessible information.(1) As if the Web's immense size weren't enough to strike fear in the hearts of all but the most intrepid surfers, consider that the Web continues to grow at an exponential rate: tripling in size over the past two years, according to one estimate.(2)

Add to this the fact that the Web lacks the bibliographic control standards we take for granted in the print world: there is no equivalent to the ISBN to uniquely identify a document; no standard system of cataloguing or classification, analogous to those developed by the Library of Congress; no central catalogue of the Web's holdings. In fact, many, if not most, Web documents lack even the name of the author and the date of publication.

Imagine you are searching for information in the world's largest library, where the books and journals (stripped of their covers and title pages) are shelved in no particular order, and without reference to a central catalogue. A researcher's nightmare? Without question. The World Wide Web defined? Not exactly. Instead of a central catalogue, the Web offers the choice of dozens of different search tools, each with its own database, command language, search capabilities, and method of displaying results.

Given the above, you clearly need to familiarize yourself with a variety of search tools and to develop effective search techniques if you hope to take advantage of the resources offered by the Web without spending many fruitless hours flailing about, and eventually drowning, in a sea of irrelevant information.


SEARCH ENGINES AND SUBJECT DIRECTORIES

The two basic approaches to searching the Web are search engines and subject directories.

Search engines allow the user to enter keywords that are run against a database (most often created automatically, by "spiders" or "robots"). Based on a combination of criteria (established by the user and/or the search engine), the search engine retrieves Web documents from its database that match the keywords entered by the searcher. It is important to note that when you are using a search engine you are not searching the Internet "live", as it exists at this very moment. Rather, you are searching a fixed database that was compiled some time before your search.
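The idea of searching a fixed, previously compiled database can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular search engine is built: the page URLs, the page text and the function names `build_index` and `search` are all invented for the example, and the matching rule (every keyword must be present) is an assumption.

```python
from collections import defaultdict

# Hypothetical pages a spider might have collected at crawl time.
CRAWLED_PAGES = {
    "http://example.org/a": "Okanagan valley wine tours",
    "http://example.org/b": "University library search workshop",
    "http://example.org/c": "Okanagan University College library",
}

def build_index(pages):
    """Map each lower-cased word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the indexed URLs that contain every keyword in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# The index is a snapshot: pages created after this line are invisible to search().
INDEX = build_index(CRAWLED_PAGES)
```

A page added to `CRAWLED_PAGES` after `INDEX` has been built will not be found until the index is rebuilt, which mirrors the delay between a page appearing on the Web and its appearing in a search engine's results.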

While all search engines are intended to perform the same task, each goes about this task in a different way, which leads to sometimes amazingly different results. Factors that influence results include the size of the database, the frequency of updating, and the search capabilities. Search engines also differ in their search speed, the design of the search interface, the way in which they display results, and the amount of help they offer.

In most cases, search engines are best used to locate a specific piece of information, such as a known document, an image, or a computer program, rather than a general subject.

Examples of search engines include:

AltaVista (http://www.altavista.com)

Excite (http://www.excite.com/search)

alltheweb (http://www.alltheweb.com)

Google (http://www.google.com)

HotBot (http://hotbot.lycos.com)


The growth in the number of search engines has led to the creation of "meta" search tools, often referred to as multi-threaded search engines. These search engines allow the user to search multiple databases simultaneously, via a single interface. While they do not offer the same level of control over the search interface and search logic as do individual search engines, most of the multi-threaded engines are very fast. Recently, the capabilities of meta-tools have been improved to include such useful features as the ability to sort results by site, by type of resource, or by domain, the ability to select which search engines to include, and the ability to modify results. These modifications have greatly increased the effectiveness and utility of the meta-tools.
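As a rough sketch of what "multi-threaded" means here, the following Python example sends one query to several engines at the same time and merges the answers. The three engine functions are hypothetical stand-ins that return fixed lists; a real meta-tool would contact the engines over the network, and nothing below reflects the internals of any tool named in this workshop.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real engines: each returns a ranked list of URLs.
def engine_a(query):
    return ["http://example.org/1", "http://example.org/2"]

def engine_b(query):
    return ["http://example.org/2", "http://example.org/3"]

def engine_c(query):
    return ["http://example.org/1", "http://example.org/3"]

ENGINES = {"A": engine_a, "B": engine_b, "C": engine_c}

def metasearch(query, engines=ENGINES):
    """Query all engines in parallel; return {url: [names of engines that found it]}."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        # Submit every search first so that they run concurrently.
        futures = {name: pool.submit(fn, query) for name, fn in engines.items()}
        found = {}
        for name, future in futures.items():
            for url in future.result():
                found.setdefault(url, []).append(name)
    return found
```

Passing a smaller dictionary as `engines`, for example `metasearch("library", {"A": engine_a})`, corresponds to the option of selecting which search engines to include.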

Popular multi-threaded search engines include:

Metacrawler (http://www.metacrawler.com)

Ixquick (http://www.ixquick.com)

SurfWax (http://www.surfwax.com)

Dogpile (http://www.dogpile.com)

ProFusion (http://www.profusion.com)


Subject-specific search engines do not attempt to index the entire Web. Instead, they focus on searching for Web sites or pages within a defined subject area, geographical area, or type of resource. Because these specialized search engines aim for depth of coverage within a single area, rather than breadth of coverage across subjects, they are often able to index documents that are not included even in the largest search engine databases. For this reason, they offer a useful starting point for certain searches. The table below lists some of the subject-specific search engines by category. For a more comprehensive list of subject-specific search engines, see one of the following directories of search tools:

Beaucoup! (http://www.beaucoup.com)

Search Engine Colossus (http://www.searchenginecolossus.com)

Searchengines.com (http://www.searchengines.com/)

Table of selected subject-specific search engines
 

Regional (Canada)

AltaVista Canada (http://www.altavistacanada.com)

Excite Canada (http://www.excite.ca/)

Yahoo! Canada (http://ca.yahoo.com/)

Regional (Other)

Geographically specific search engines (Beaucoup):

Americas

Asia/Australia/Middle East/Africa

Europe

Companies

Advice for investors (http://www.adviceforinvestors.com)

Hoover's Online (http://www.hoovers.com/)

InfoSpace Canada (http://www.infospace.com/canada/index_ylw_ca.htm)

Wall Street Research Net (http://www.wsrn.com/)

WorldPages (http://www.worldpages.com/)

People (E-mail addresses)

Bigfoot (http://bigfoot.com/)

InfoSpace Canada Email Search (http://www.infospace.com/canada/email1.htm)

WhoWhere? (http://www.whowhere.lycos.com)

Yahoo! People Search (http://people.yahoo.com/)

People (Postal addresses & telephone numbers)

Bigfoot (http://bigfoot.com/)

Canada 411 (http://canada411.sympatico.ca)

InfoSpace Canada People Finder (http://www.infospace.com/canada/index_ppl_ca.htm)

Switchboard.Com (http://www.switchboard.com)

Images

The Amazing Picture Machine (http://www.ncrtec.org/picture.htm)

Lycos Multimedia (http://multimedia.lycos.com/)

WebSEEk (http://www.ctr.columbia.edu/webseek/)

Yahoo! Picture Gallery  (http://gallery.yahoo.com/)

Jobs

+Jobs Canada

Monster.ca (http://www.monster.ca/)

Monster.com (http://www.monster.com/)

Canada Job Bank (http://www.jobbank.gc.ca/)

The Riley Guide (http://www.dbm.com/jobguide/)

Games

Games Domain (http://www.gamesdomain.com/)

GameSpot (http://gamespot.com/gamespot/)

Software

Jumbo (http://www.jumbo.com)

Shareware.com (http://shareware.cnet.com/)

ZDNet Downloads (http://www.zdnet.com/downloads/)

Health/Medicine

Achoo (http://www.achoo.com/)

BioMedNet (http://www.bmn.com/)

Combined Health Information Database (http://chid.nih.gov/)

MayoClinic.com (http://www.mayohealth.org/)

MEDLINEplus (http://www.nlm.nih.gov/medlineplus/)

Education/Children's Sites

AOL NetFind Kids Only (http://www.aol.com/netfind/kids/)

Blue Web'n (http://www.kn.pacbell.com/wired/bluewebn/)

Education World (http://www.education-world.com/)

Kids Domain (http://www.kidsdomain.com)

KidsClick! (http://www.kidsclick.org/)

Yahooligans! (http://www.yahooligans.com)


Subject directories are hierarchically organized indexes of subject categories that allow the Web searcher to browse through lists of Web sites by subject in search of relevant information. They are compiled and maintained by humans and many include a search engine for searching their own database.

Subject directory databases tend to be smaller than those of the search engines, which means that result lists tend to be smaller as well. However, there are other differences between search engines and subject directories that can lead to the latter producing more relevant results. For example, while a search engine typically indexes every page of a given Web site, a subject directory is more likely to provide a link only to the site's home page. Furthermore, because their maintenance includes human intervention, subject directories greatly reduce the probability of retrieving results out of context.

Because subject directories are arranged by category and because they usually return links to the top level of a web site rather than to individual pages, they lend themselves best to searching for information about a general subject, rather than for a specific piece of information.

Examples of subject directories include:

LookSmart (http://www.looksmart.com)

Open Directory (http://dmoz.org)

Yahoo (http://www.yahoo.com)


Specialized subject directories
Due to the Web's immense size and constant transformation, keeping up with important sites in all subject areas is humanly impossible. Therefore, a guide compiled by a subject specialist to important resources in his or her area of expertise is more likely than a general subject directory to produce relevant information and is usually more comprehensive than a general guide. Such guides exist for virtually every topic. For example, Voice of the Shuttle (http://vos.ucsb.edu) provides an excellent starting point for humanities research. Film buffs should consider starting their search with the Internet Movie Database (http://us.imdb.com).

Just as multi-threaded search engines attempt to provide simultaneous access to a number of different search engines, some web sites act as collections or clearinghouses of specialized subject directories. Many of these sites offer reviews and annotations of the subject directories included and most work on the principle of allowing subject experts to maintain the individual subject directories. Some clearinghouses maintain the specialized guides on their own web site while others link to guides located at various remote sites.

Examples of clearinghouses include:

Argus Clearinghouse (http://www.clearinghouse.net)

About.com (http://about.com)

WWW Virtual Library (http://www.vlib.org)


SEARCH STRATEGY

Regardless of the search tool being used, the development of an effective search strategy is essential if you hope to obtain satisfactory results. A simplified, generic search strategy might consist of the following steps:

  1. Formulate the research question and its scope

  2. Identify the important concepts within the question

  3. Identify search terms to describe those concepts

  4. Consider synonyms and variations of those terms

  5. Prepare your search logic
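Steps 3 to 5 above can be sketched as a small Python helper that turns a list of concepts, each with its synonyms, into a single Boolean search statement. The research question, the terms and the function name `build_query` are invented for illustration, and the AND/OR syntax produced is a generic one that a given search engine may or may not accept.

```python
def build_query(concepts):
    """Combine synonyms with OR inside each concept, then join concepts with AND."""
    groups = []
    for synonyms in concepts:
        # Multi-word terms are quoted so that they are searched as phrases.
        terms = [f'"{t}"' if " " in t else t for t in synonyms]
        group = " OR ".join(terms)
        groups.append(f"({group})" if len(terms) > 1 else group)
    return " AND ".join(groups)

# Hypothetical research question: the effects of acid rain on forests.
concepts = [
    ["acid rain", "acid precipitation"],   # concept 1 and a synonym
    ["forest", "forests", "woodland"],     # concept 2 and its variations
]
query = build_query(concepts)
```

Here `query` holds the string `("acid rain" OR "acid precipitation") AND (forest OR forests OR woodland)`.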

This strategy should be applied to a search of any electronic information tool, including library catalogues and CD-ROM databases. However, a well-planned search strategy is especially important when the database under consideration is one as large, amorphous and evolving as the World Wide Web. Along with the characteristics already mentioned in the Introduction, another factor that underscores the need for an effective Web search strategy is the fact that most search engines index every word of a document. This method of indexing tends to greatly increase the number of results retrieved, while decreasing the relevance of those results, because of the increased likelihood of words being found in an inappropriate context. When selecting a search engine, one factor to consider is whether it allows the searcher to specify which part(s) of the document to search (e.g. URL, title, first heading) or whether it simply defaults to searching the entire document.
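The difference between searching the entire document and searching one part of it can be illustrated with a short Python sketch. The two page records, the field names and the function `field_search` are hypothetical, and the `field:"phrase"` form is used here only as an example of the kind of syntax a search engine might offer.

```python
# Hypothetical index records that keep each part of a document separately.
PAGES = [
    {"url": "http://example.org/swim", "title": "Sink or Swim", "text": "Searching the Web"},
    {"url": "http://example.org/pool", "title": "Pool Rules", "text": "You will sink or swim here"},
]

def field_search(pages, query):
    """Handle field:"phrase" or field:word; any other query searches the body text."""
    field, sep, value = query.partition(":")
    if not sep or field not in ("title", "url", "text"):
        field, value = "text", query
    phrase = value.strip('"').lower()
    # Naive substring match, enough to show the effect of restricting the field.
    return [p["url"] for p in pages if phrase in p[field].lower()]
```

With these records, `field_search(PAGES, 'title:"sink or swim"')` returns only the page whose title contains the phrase, while the unrestricted query `'"sink or swim"'` returns the page that merely mentions it in passing.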


Search logic refers to the way in which you, and the search engine you are using, combine your search terms. For example, the search Okanagan University College could be interpreted as a search for any of the three search terms, all of the search terms, or the exact phrase. Depending on the logic applied, the results of each of the three searches would differ greatly. All search engines have some default method of combining terms, but their documentation does not always make it easy to ascertain which method is in use. Reading online Help and experimenting with different combinations of words can both help in this regard.  Most search engines also allow the searcher to modify the default search logic, either with the use of pull-down menus or special operators, such as the + sign to require that a search term be present and the - sign to exclude a term from a search.
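The three interpretations of a search such as Okanagan University College, and the effect of the + and - signs, can be sketched as follows. This is a simplified illustration in Python: the function names, the sample text and the word-splitting rule are assumptions, and real search engines apply their own rules for punctuation and ranking.

```python
def matches(text, query, mode="any"):
    """Test a document against a query under one of three search logics."""
    words = text.lower().split()
    terms = query.lower().split()
    if mode == "any":      # at least one search term is present
        return any(t in words for t in terms)
    if mode == "all":      # every search term is present, in any order
        return all(t in words for t in terms)
    if mode == "phrase":   # the terms appear together, in the order given
        n = len(terms)
        return any(words[i:i + n] == terms for i in range(len(words) - n + 1))
    raise ValueError(f"unknown mode: {mode}")

def matches_plus_minus(text, query):
    """Apply + (required) and - (excluded) prefixes; unprefixed terms are optional."""
    words = text.lower().split()
    for term in query.lower().split():
        if term.startswith("+") and term[1:] not in words:
            return False
        if term.startswith("-") and term[1:] in words:
            return False
    return True

# Hypothetical document that contains all three words but not the exact phrase.
doc = "College of the Okanagan and a university nearby"
```

The sample `doc` is retrieved by the "any" and "all" searches for Okanagan University College but not by the "phrase" search, which is why the three result lists can differ so greatly.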

Boolean logic is the term used to describe certain logical operations that are used to combine search terms in many databases. The basic Boolean operators are represented by the words AND, OR and NOT. Variations on these operators, sometimes called proximity operators, that are supported by some search engines include ADJACENT, NEAR and FOLLOWED BY. Whether or not a search engine supports Boolean logic, and the way in which it implements it, is another important consideration when selecting a search tool. The following diagrams illustrate the basic Boolean operations.

[Diagrams illustrating the AND, OR and NOT operations]

Boolean operators are most useful for complex searches, while the + and - operators are often adequate for simple searches.
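The basic Boolean operations correspond directly to set operations, which Python can demonstrate in a few lines. The document numbers and the search terms below are made up for the example.

```python
# Hypothetical result sets: the documents (by number) that contain each term.
cats = {1, 2, 3, 4}
dogs = {3, 4, 5}

both = cats & dogs        # cats AND dogs: documents containing both terms
either = cats | dogs      # cats OR dogs: documents containing at least one term
cats_only = cats - dogs   # cats NOT dogs: documents containing cats but not dogs
```

AND narrows a search (two documents here), OR broadens it (five documents), and NOT removes unwanted documents from a result set.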


SEARCH TIPS

In most cases, an effective search strategy, the correct use of Boolean logic, and familiarity with the features of each of the search engines will lead to satisfactory results. However, there are additional techniques that may further improve your results in particular circumstances. The following search tips apply to one or more of the search engines discussed in this workshop.
 

Ctrl-F: After following a link to a document retrieved with a search engine, it is sometimes not immediately apparent why the document has been retrieved. This may be because the words for which you searched appear near the bottom of the document. A quick method of finding the relevant words is to type Ctrl-F to search for the text in the current document.

Bookmark your results: If you are likely to want to repeat a search at a later date, add a bookmark (or favorite) to your current search results.

Right truncation of URLs: Often, a search will retrieve links to many documents at one site. For example, searching for "Okanagan University College Library" will retrieve not only the OUC Library home page (http://www.ouc.bc.ca/libr), but also any pages that contain the phrase "Okanagan University College Library", whether or not they are linked to the home page (e.g. this page - http://www.ouc.bc.ca/libr/connect96/search.htm). Rather than clicking on each URL in succession to find the desired document, truncate the URL at the point at which it appears most likely to represent the document you are seeking and type this URL in the Location box of your web browser.
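Right truncation of a URL can be shown step by step with Python's standard `urllib.parse` module. The helper name `truncations` is invented for this sketch, and the decision to end each shortened URL with a slash is an assumption; whether a truncated address actually exists depends on the web site.

```python
from urllib.parse import urlsplit, urlunsplit

def truncations(url):
    """Return the URL followed by each shorter version, ending at the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    results = [url]
    while segments:
        segments.pop()  # drop the rightmost file or directory name
        path = "/" + "/".join(segments) + ("/" if segments else "")
        results.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return results
```

For http://www.ouc.bc.ca/libr/connect96/search.htm this yields, in order, the page itself, http://www.ouc.bc.ca/libr/connect96/, http://www.ouc.bc.ca/libr/ and http://www.ouc.bc.ca/.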

Guessing URLs: Basic knowledge of the way in which URLs are constructed will help you to guess the correct URL for a given web site. For example, most large American companies will have registered a domain name in the format www.company_name.com (e.g. Microsoft - www.microsoft.com); American universities are almost always in the .edu domain (e.g. Cornell - www.cornell.edu or UCLA - www.ucla.edu); and Canadian universities follow the format www.university_name.ca (e.g. Simon Fraser University - www.sfu.ca or the University of Toronto - www.utoronto.ca).
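The naming patterns above amount to a simple rule of thumb, sketched here in Python. The function name `guess_url` and the three category labels are invented for the example, and the result is only a guess: nothing guarantees that an organization has registered the address produced.

```python
def guess_url(name, kind):
    """Guess a home-page address from an organization's short name and type."""
    suffix = {
        "us_company": "com",      # e.g. www.microsoft.com
        "us_university": "edu",   # e.g. www.cornell.edu
        "ca_university": "ca",    # e.g. www.sfu.ca
    }[kind]
    label = name.lower().replace(" ", "")
    return f"http://www.{label}.{suffix}"
```

For example, `guess_url("Cornell", "us_university")` produces http://www.cornell.edu, which can then be tried in the browser.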

Wildcards: Some search engines allow the use of "wildcard" characters in search statements. Wildcards are useful for retrieving variant spellings (e.g. color, colour) and words with a common root (e.g. psychology, psychological, psychologist, psychologists, etc.). Wildcard characters vary from one search engine to another, the most common ones being *, #, and ?. Some search engines permit only right truncation (e.g. psycholog*), while others also support middle truncation (e.g. colo*r).
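Wildcard matching can be tried out with Python's standard `fnmatch` module, which uses * for any run of characters (including none) and ? for exactly one character. This is only an analogy: the word list and the helper name `expand` are invented, and an individual search engine may define its wildcard characters differently.

```python
from fnmatch import fnmatchcase

# Hypothetical vocabulary standing in for the words in a search engine's index.
WORDS = ["color", "colour", "colony", "psychology", "psychologist", "psychic"]

def expand(pattern, vocabulary=WORDS):
    """Return the vocabulary words matched by a pattern containing * or ?."""
    return [w for w in vocabulary if fnmatchcase(w, pattern)]
```

With this list, `expand("psycholog*")` finds psychology and psychologist (right truncation), `expand("colo*r")` finds both color and colour (middle truncation), and `expand("colo?r")` finds only colour, because ? must stand for exactly one character.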

Relevance ranking: All of the search engines covered in this workshop use an algorithm to rank retrieved documents in order of decreasing relevance. (3) Consequently, it is often not necessary to browse through more than the first few pages of results, even when the total results number in the thousands. Furthermore, some search engines (e.g. AltaVista) allow the searcher to determine which terms are the most "important", while others have a "more like this" feature that permits the searcher to generate new queries based on relevant documents retrieved by the initial search. These features are discussed in more detail in the following section of this document.


SEARCH ENGINE COMPARISONS

This section compares some of the major Web search engines, based on the following features:

  1. Size of the database (4)

  2. Search interface

  3. Search features

  4. Results list display features

  5. Other features of note

 

AltaVista
 

URL: http://www.altavista.com

Size: 550 million pages

Interface: AltaVista includes simple and advanced interfaces. They are less intuitive than most other search engine interfaces but they are well documented and allow some of the most powerful searching on the Web if you are willing to learn how to use them. Both interfaces allow the use of Boolean logic, though different syntax is used in the two interfaces. The simple interface includes a single search box and a pull-down menu that allows you to limit your search to one of 25 languages. The advanced interface includes a search box, limit by language, and options to limit a search by date, to rank results according to keywords of your choice, and to restrict results to one per site.

Search Features:

Search logic and syntax: AltaVista defaults to Boolean OR, that is, it will retrieve results containing any of the search words; however, the more search terms a document contains, the more highly it is ranked. In its simple interface, AltaVista supports the use of + and - to require and exclude terms. In its advanced interface, it supports all Boolean operators - AND, OR and AND NOT, plus the proximity operator NEAR (terms within 10 words of each other). In both interfaces, enclosing search terms in quotation marks searches for an exact phrase.
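The behaviour described above (retrieve documents containing any term, rank those containing more terms higher, and test whether two terms are NEAR each other) can be approximated in Python. This is a sketch under stated assumptions, not AltaVista's actual algorithm: the documents and function names are invented, the score is simply the count of distinct query terms found, and "within 10 words" is interpreted as a difference of at most 10 word positions.

```python
def rank_any(docs, query):
    """Boolean OR retrieval, ranked by how many distinct query terms each document contains."""
    terms = set(query.lower().split())
    scored = []
    for name, text in docs.items():
        hits = len(terms & set(text.lower().split()))
        if hits:
            scored.append((hits, name))
    # Most matching terms first; ties are broken by name for a repeatable order.
    return [name for hits, name in sorted(scored, key=lambda s: (-s[0], s[1]))]

def near(text, a, b, window=10):
    """True if words a and b occur within `window` word positions of each other."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

# Hypothetical documents.
DOCS = {
    "d1": "sink or swim lessons",
    "d2": "kitchen sink repair",
    "d3": "garden tools",
}
```

A search for "sink swim" retrieves d1 and d2 but lists d1 first because it contains both terms; d3 contains neither term and is not retrieved.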

Limit options: Searches may be limited by date (Advanced only) and language (Simple and Advanced). AltaVista also allows you to restrict a search to certain fields (or sections) within a document, and by type of document, e.g. Title, URL, Image, Java applets, and Links to a specified page. For example, the search title:"sink or swim" will retrieve only those pages that include the phrase "sink or swim" in their title. The search link:www.ouc.bc.ca/libr will retrieve all pages in AltaVista's database that include links to the specified URL. 

Truncation: AltaVista uses the * character to support both right (e.g. psycholog*) and middle (e.g. colo*r) truncation. 

Case sensitivity: If you begin a word with an upper case letter, AltaVista searches only for that word with an upper case letter. If you use lower case, AltaVista retrieves upper and lower case. For example, dodge retrieves dodge and Dodge; Dodge retrieves Dodge only.
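The case rule just described can be written as a one-word comparison in Python. The function name `case_match` is invented, and the sketch covers only the two situations mentioned above (a search term that begins with an upper case letter, and one that does not).

```python
def case_match(term, word):
    """Compare a search term with a word found in a document.

    A term that begins with an upper case letter must match exactly;
    any other term is compared without regard to case.
    """
    if term[:1].isupper():
        return term == word
    return term.lower() == word.lower()
```

So the term dodge matches both dodge and Dodge, while the term Dodge matches Dodge only.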

Results:

What is displayed: The result display includes the document title, URL, and first two lines of the document text. Each entry is followed by links to translate the document, find more pages from the same site, and find related pages.

Order of results: Results are displayed in order of decreasing relevance. In Advanced Search, you may specify "ranking keywords" that force documents that contain these words to appear near the top of the result list. You may also limit your results to one per site.

Refining results: A "search within these results" checkbox allows you to narrow a search.

Other features:

Browsable subject categories (from LookSmart)

Specialty searches for images, audio and video

Translate documents from and into the major European languages

Family Filter blocks certain Web sites based on their content

Free Web-based e-mail

"Customize Settings" to make AltaVista "remember" your search preferences. This feature uses cookies so you have to customize settings for each computer from which you use AltaVista.

Numerous links to commercial services, e.g. eBay.


Excite
 

URL: http://www.excite.com/search/

Size: 250 million pages

Interface: Excite offers two interfaces: simple and Advanced Web Search. The simple interface consists of a single search box with options to limit by media type or site type, e.g. News, Products, Photos, Audio, Video. Advanced Web Search presents the searcher with a series of search boxes that allow you to perform either word or phrase searching and to instruct Excite which words and/or phrases the document CAN contain, MUST contain, and MUST NOT contain. It also allows limiting by language, country, and domain (.com, .edu, .org, etc.).

Search Features:

Search logic and syntax: Excite defaults to Boolean OR, that is, it will retrieve results containing any of the search words; however, the more search terms a document contains, the more highly it is ranked. Excite supports the use of the Boolean operators AND, OR and NOT (all of which must be in capital letters), the + and - signs to require and exclude words from your search, and phrase searching using quotation marks. The "Zoom In" feature allows you to refine your subject before you search. To use this feature, enter your search string in the search box, then click "Zoom In" rather than "Search". Excite presents you with a list of terms related to your search string (broader, narrower and related terms) from which you can choose only one term (it might be more useful if you could choose more than one).

Limit options: Simple Search offers limits by media type. These limits are not available in Advanced Web Search, which allows limiting by language, country and domain.

Truncation: None 

Case sensitivity: None

Results:

What is displayed: For each document, Excite displays title, URL, and first two lines of document text. Links at the top of the result list allow you to "show titles only" or "view by URL". The latter feature sorts the results by URL and displays only the document title beneath the Web site to which it belongs.

Order of results: Results are displayed in order of decreasing relevance. The "view by URL" link allows you to sort by Web site.

Refining results: There is no way to refine search results.

Other features:

Browsable subject directory

Online shopping

Specialty searches for stocks, companies, people, maps, weather, travel, etc.

Free Web-based e-mail

Customizable portal site "My Excite"



alltheweb
 

URL: http://www.alltheweb.com

Size: 575 million pages

Interface: Simple and Advanced interfaces. The simple search provides a single search box with a pull-down menu to select "All the words", "Any of the words" or "Exact phrase". The advanced interface includes all of the features in the simple interface, plus the ability to limit by language, domain, location of search terms in the document (text, title, link name, URL and link to the URL), and "word filters" (must include, should include, or must not include).

Search Features:

Search logic and syntax: Select "All of the words" (default), "Any of the words" or "Exact phrase", use "Word Filters" (Advanced Search), or + to include, - to exclude, and double quotes to search for a phrase.

Limit options: Language, domain, location of search terms in document (Advanced Search only).

Truncation: None

Case sensitivity: None

Results:

What is displayed: Alltheweb displays title, brief summary and URL for each document found. 

Order of results: Results are displayed ten per page, ranked by relevance. 

Refining results: Not available

Other features:

MP3 search, FTP search, Multimedia search, Mobile search (for wireless devices)



Google
 

URL: http://www.google.com

Size: 705 million pages (5)

Interface: Simple and Advanced. The simple interface is a single search box with two search buttons: "Google Search" and "I'm Feeling Lucky". The latter automatically displays the page deemed most relevant rather than displaying a list of results. The advanced interface provides boxes for "all the words", "exact phrase", "any of the words", and "without the words", pull-down menus to limit by location on the page (anywhere, title or URL), language and domain, radio buttons to filter results using "SafeSearch", and search boxes that allow you to search for pages that are similar to or link to a given URL.

Search Features:

Search logic and syntax: Google defaults to Boolean AND. Enclose phrases in quotation marks. Use + and - to require and exclude search terms. Google also supports use of the Boolean OR.

Limit options: Limit by language, domain, Web site (Advanced Search only) 

Truncation: None 

Case sensitivity: None

Results:

What is displayed: Results include document title, first few words of text, URL, size (in bytes) and a link to a previously cached version of the page. 

Order of results: Google's PageRank algorithm ranks pages based on the number of pages that link to a given document. That is, the more frequently a document is linked to, the "better" it is. It also analyzes the frequency of search terms and their proximity to each other in the document, like other search engines. When there is more than one result from a site, Google groups the results by site, displaying only the first two results from that site followed by a link to find more results from that site. 
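The two ideas in this paragraph, ranking by the number of pages that link to a document and showing only the first two results per site, can be sketched in Python. This follows the simplified description given here rather than Google's actual algorithm: the published PageRank method also weighs each link by the rank of the page it comes from, which this sketch does not attempt. The link graph, URLs and function names are all invented.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical link graph: each page mapped to the pages it links to.
LINKS = {
    "http://a.example/1": ["http://b.example/home", "http://c.example/x"],
    "http://a.example/2": ["http://b.example/home"],
    "http://c.example/x": ["http://b.example/home", "http://b.example/faq"],
}

def inbound_counts(links):
    """Count how many distinct pages link to each target page."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:  # ignore a page linking to itself
                counts[target] += 1
    return counts

def rank_by_links(urls, links):
    """Order URLs by descending inbound-link count (ties broken alphabetically)."""
    counts = inbound_counts(links)
    return sorted(urls, key=lambda u: (-counts[u], u))

def group_by_site(ranked_urls, per_site=2):
    """Keep at most `per_site` results per host, preserving rank order."""
    shown = Counter()
    kept = []
    for url in ranked_urls:
        host = urlsplit(url).netloc
        if shown[host] < per_site:
            kept.append(url)
            shown[host] += 1
    return kept
```

In this graph http://b.example/home is linked to by three pages and therefore ranks first among the candidates.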

Refining results: Clicking the "Similar Pages" link retrieves pages that are "related" to the current result. 

Other features:

Indexes Adobe Acrobat (.pdf) files

Google Web Directory

Google Toolbar (IE 5+ only) and browser buttons allow access to Google whenever your Web browser is running.



HotBot
 

URL: http://hotbot.lycos.com

Size: 500 million pages

Interface: HotBot offers two interfaces: a default (not to say simple, as it offers more options than most search engines' advanced interfaces) and an Advanced Search. Both interfaces feature pull-down menus for modifying search criteria (for example, to switch between word and phrase searching), and for restricting searches by date, geographical location, language and domain name. These pull-down menus make available advanced search features to users who otherwise might be intimidated by the complex Boolean logic needed to perform a similar search in AltaVista or another search engine. Check boxes are used to limit searches to particular types of media.

Search features:

Search logic and syntax: Pull-down menus in both the default and advanced interfaces allow you to select between "all of the words", "any of the words", "exact phrase", "the person" (HotBot automatically rotates search terms, so a search for "Bill Gates" will look for "Bill Gates" and "Gates, Bill"), "links to this URL", and "Boolean phrase". HotBot supports the Boolean operators AND, OR and NOT, the + and - signs to require or exclude search terms, and quotation marks to specify phrase searching. 
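The rotation of search terms performed by "the person" option can be imitated with a short Python function. The name `person_variants` and the rule of treating the last word as the surname are assumptions made for this sketch; how HotBot itself handles middle names or initials is not documented here.

```python
def person_variants(name):
    """Return the name as typed plus the rotated 'Surname, Forenames' form."""
    parts = name.split()
    if len(parts) < 2:
        return [name]  # nothing to rotate in a single-word name
    first, last = " ".join(parts[:-1]), parts[-1]
    return [f"{first} {last}", f"{last}, {first}"]
```

For the example above, `person_variants("Bill Gates")` gives the two forms Bill Gates and Gates, Bill.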

Limit options: The default search screen allows limiting by date, language and media type (e.g. Javascript, Image, Video). Advanced Search provides the same options plus an increased number of media types, and limiting by Internet domain (e.g. .edu or www.ouc.bc.ca), geographical region, and page depth. "Word Stemming" (Advanced Search only) searches for grammatical variations of your search terms (e.g. searches for "thought" will also find "think" and "thinking"). 

Truncation: HotBot offers the most sophisticated truncation features of any search engine. The * character may be used to replace any number of characters, while the ? character replaces a single character. Both of these symbols may be used at the end, in the middle, or at the beginning of a word. 

Case sensitivity: Searches with all lower-case letters are not case-sensitive. The use of an upper-case letter anywhere in a search string limits search results to documents that match the exact search, including the case.

Results:

What is displayed: HotBot offers three options: full descriptions, brief descriptions, and URLs only. The full display includes the document title, the first few lines of text, URL and date. The brief display includes the title and first few lines of text. 

Order of results: Matching directory categories, if any, are followed by matching Web pages, in order of decreasing relevance. By default, HotBot displays only one result per site. Click "See results from this site only" to see all the results from a given site.

Refining results: A check box allows you to search within the results of your current search, using whichever words you specify.

Other features:

Browsable subject directory

Search for companies, people, software, stocks, maps, etc.

Free Web-based e-mail


Dogpile

URL: http://www.dogpile.com

Interface: A single search box in a very busy interface that is very commercial in its orientation. You may restrict your search to one of the following categories: The Web, Images, Audio/MP3/ Action/ News, FTP, Discussion, Small Biz, and Streaming Media.

Engines Searched: Dogpile searches the following search engines: AltaVista, Bay9, Direct Hit, Dogpile Web Catalog, FindWhat, Google, GoTo.com, Kanoodle, LookSmart, Lycos, Open Directory, RealNames, Sprinks by About and Yahoo, but it searches only three at a time.

Search Features:

Search logic and syntax: Dogpile defaults to a Boolean AND search. Enclose phrases in quotation marks. Use + and - to require and exclude search terms.

Limit options: The only mechanism for limiting a search is to choose one of the following categories: The Web, Images, Audio/MP3/ Action/ News, FTP, Discussion, Small Biz, and Streaming Media.

Dogpile also lets you choose which search engines it uses. You may select up to five search engines per column; selecting column 1 and its five (or fewer) search engines limits that page of results to those engines alone. This feature writes a cookie to your hard drive to retain your choices for future searches.

Truncation: None

Case sensitivity: None

Results:

What is displayed: The search engine that was used and the number of hits that it returned. Results include the document title, first few words of text, and URL.

Order of results: Results are ordered according to the engines searched and each engine's own ranking method.

Other features: A variety of commercial (shopping) services


Ixquick

URL: http://www.ixquick.com

Interface: Single search box. Ixquick lets the user search one of the following categories: “Web”, “News”, “MP3”, and “Pictures”.

Engines Searched: Ixquick searches the following search engines: AOL, AltaVista, Euro Seek, Excite, Alltheweb, GoTo.com, HotBot, LookSmart, MSN, NBCi,  Webcrawler, Yahoo.

Search Features:

Search logic and syntax: Ixquick is one of the few metasearch engines that allow natural language, simple, and Boolean searches. Ixquick defaults to a Boolean AND search. Enclose phrases in quotation marks. Use + and - to require and exclude search terms.

Limit options: The user may choose whether to search “The Web”, “MP3”, “News”, or “Pictures”. The user may also select from the following languages: Deutsch, English, English UK, Español, Français, Italiano, Nederlands, Português.

On the results page, you may select which search engines to use for a subsequent search.

Truncation:  Ixquick determines which search engines support wildcards, translates wildcard queries for them, and sends those queries only to the engines able to respond to them.

Case sensitivity: Yes. Capitalized search terms retrieve only the pages where the search term is also capitalized; lower-case search terms disregard case and retrieve everything.
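The capitalization rule can be written as a one-term check. The Python sketch below is only an illustration of the rule as stated; the function name term_matches is hypothetical, and substring matching is a simplification.

```python
def term_matches(term, text):
    """Apply the capitalization rule described above to a single term."""
    if term != term.lower():
        # The term contains a capital letter: require an exact-case match.
        return term in text
    # All-lower-case term: ignore case entirely.
    return term in text.lower()
```

Under this rule, searching for "Java" skips a page that mentions only "java", while searching for "java" finds both spellings.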

Results:

What is displayed: Results include document title, first few words of text, URL, which search engines found the site, and the number of hits from each engine.  Clicking on the search engine will take you to that search engine's list of related topic pages.  All search results open in a new page so that your Ixquick search page is always available.

Order of results: Ixquick provides one star to a result list item for each search engine that ranks that site in its top ten results. That is, a five-star result is ranked in the top ten by five search engines.
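The star scheme amounts to counting, for each page, how many engines place it in their top ten. A minimal Python sketch follows, assuming each engine's results arrive as a ranked list of URLs; the engine names, the example URLs, and the alphabetical tie-break are made up for illustration.

```python
from collections import Counter

def star_ratings(engine_results, top_n=10):
    """Return {url: stars}, one star per engine ranking the URL in its top_n."""
    stars = Counter()
    for ranked_urls in engine_results.values():
        # set() so a URL listed twice by one engine still earns one star.
        for url in set(ranked_urls[:top_n]):
            stars[url] += 1
    return stars

results = {
    "EngineA": ["http://a.example/", "http://b.example/"],
    "EngineB": ["http://b.example/", "http://c.example/"],
    "EngineC": ["http://b.example/"],
}
stars = star_ratings(results)
# Most stars first; ties broken alphabetically (an assumption of this sketch).
ordered = sorted(stars, key=lambda url: (-stars[url], url))
```

Here http://b.example/ earns three stars because all three engines list it, so it is displayed first.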

Refining results: Clicking the "More Like This" link retrieves pages that are "related" to the current result.

Other features: 

 Search for pages that link to a given page (e.g. link:http://www.ouc.bc.ca)


Metacrawler

URL: http://www.metacrawler.com

Interface:  

Normal Search: Single search box.  Metacrawler allows the user the choice to search for Any word, All words, or Phrase contained within the search box.  The interface is very busy and commercial in orientation.  

Power Search: In power search mode, Metacrawler lets you customize how it performs its search. The user can select which search engines to use, the order in which results are displayed, and the domain (by country or by type of institution, such as government or education).  The user can also limit the total number of results and the number of results per page. The layout is very user friendly. Metacrawler uses cookies to save customizations to your hard drive.

Engines Searched:  Metacrawler searches the following search engines: AltaVista, DirectHit, Excite, FindWhat.com, Google, GoTo.com, Internet Keywords, Kanoodle, LookSmart, Lycos, MetaCatalog, Sprinks by About, Webcrawler.

Search Features:

Search logic and syntax: Metacrawler defaults to Boolean AND. Enclose phrases in quotation marks. Use + and - to require and exclude search terms.

Limit options: The user can choose whether to search “The Web”, “Directory”, “Audio/MP3”, “Image”, “News Groups”, “Action”, or “Streaming Media”.  Power search lets the user select which engines to search and the domain (Educational, Government, or a country of choice).  You may specify how long you are willing to wait, as well as the number of results on each results page.  Search by Country lets the user refine a search by selecting a country.
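Limiting by domain comes down to checking the end of each result's host name (for example .edu, .gov, or a country code such as .ca). The Python sketch below shows the idea; the function name limit_to_domain is hypothetical, and this is not a description of how Metacrawler itself applies the limit.

```python
from urllib.parse import urlparse

def limit_to_domain(urls, suffix):
    """Keep only URLs whose host name ends with the given domain suffix."""
    suffix = "." + suffix.lstrip(".").lower()
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.endswith(suffix):
            kept.append(url)
    return kept
```

Called with the suffix "edu", the function keeps http://library.albany.edu/internet/search.html and drops http://www.ouc.bc.ca.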

Metacrawler also lets you customize your search options.  You can set the colour scheme and the default interface (Normal search, Slide Show, Power search, Low bandwidth).  You can also select which engines are searched, keyword defaults, domains, the time-out length, the number of results and how they will be viewed, whether the cursor appears in the search box each time, and whether search parameters are saved automatically every time you change an option and issue a query.  All of this is accomplished by writing a cookie to your hard drive to retain the information.

Truncation: None

Case sensitivity: None

Results:

What is displayed: Results include document title, first few words of text, URL, which search engines it used to find the site, and a link to more sites similar to this title.

Order of results: Metacrawler’s PageRank algorithm ranks pages based on how frequently the document is accessed; i.e., a document accessed 100 times will rank higher than a document accessed 50 times.
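Ranking by access frequency is simply a sort on access counts. A minimal Python sketch, assuming the counts are available as a mapping from URL to number of accesses (the function name and the example URLs are hypothetical):

```python
def rank_by_access(access_counts):
    """Order document URLs from most to least frequently accessed."""
    return sorted(access_counts, key=access_counts.get, reverse=True)

counts = {"http://x.example/": 50, "http://y.example/": 100}
ranked = rank_by_access(counts)
```

The document accessed 100 times is placed ahead of the one accessed 50 times.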

Refining results: Clicking the "More Like This" link retrieves pages that are "related" to the current result.

Other features:  

View the most popular searches

A variety of commercial services


ProFusion

URL: http://www.profusion.com

Interface: Single search box.  Search options appear below the search text box, allowing you to customize your search.

Engines Searched: ProFusion searches the following search engines: AltaVista, Yahoo!, GO, LookSmart, Britannica, Lycos, About, Excite, DirectHit.

Search Features:

Search logic and syntax: ProFusion defaults to a Boolean AND search, but lets you change the search logic from a pull-down menu. Enclose phrases in quotation marks. Use + and - to require and exclude search terms.

Limit options:  ProFusion lets you limit your search to one of the following categories: Web Search, Business, Computing, Discussion, Entertainment, Health, Investment, MP3, Sports, Kids Fun, Genealogy, People.  You can also choose the search sources: Best 3, Fastest 3, All, You Choose.  These options let you pick the search engines yourself or let ProFusion choose them for you. You may also set in advance the number of results that will be presented.

Truncation: None

Case sensitivity: None.

Results:

What is displayed:  The search engine that was used and the number of hits that it returned. Results include document title, first few words of text, and URL. The search engine that found the document is shown as an active link (but it will only take you to the search page).  A "track it" feature requires your email address, because Intelliseek's Web Tracker emails you when the page changes or is updated.

Order of results: Results are displayed in order of decreasing relevance.

Other features: 

 Search for pages that link to a given page (e.g. link:http://www.ouc.bc.ca)


SurfWax

URL: http://www.surfwax.com

Interface: Single search box.  Very simple layout, with the only option being a choice for the number of results that you wish to be returned.

Engines Searched: SurfWax searches the following search engines: All The Web, AltaVista, Excite, HotBot, InfoSeek, OpenDirectory, SearchEdu, Webcrawler, Yahoo, Yahoo News, Ditto.

Search Features:

Search logic and syntax: SurfWax defaults to a Boolean AND search. Enclose phrases in quotation marks. Use + and - to require and exclude search terms.

Limit options: Uses a series of "dimensions".  The 1st dimension helps the user focus the search.  The 2nd dimension provides what the creators call a SearchBlanket™, which lets the user generate a concise list of results.  The 3rd dimension lets the user review a site with more relevant information pertaining to the search.  The 4th dimension lets you personalize SurfWax to your preferred style of searching.

Truncation: None

Case sensitivity: None.

Results:

What is displayed: The display page is divided into two halves.  The right side displays the results of your search.  The left side initially lists the search engines that were used to generate the results.  Clicking on the green arrowhead in front of a result opens a description page in the frame on the left side.  This “Snapshot” shows the criteria that were used to find the page, as well as an abstract of the site and some key points that can be found in the site or page. Also provided is a series of site focus words that let you further refine your search if you so desire.  When a link is chosen, it opens on a separate page, so you always have access to the search results page.

Order of results: Unknown.

Refining results: Clicking the focus button lets you refine your search.  SurfWax then provides a series of terms related to the word or words you are searching for.  For example, "Hurricane" brought up the following choices: a severe tropical cyclone usually with heavy rains and winds moving a [definition truncated at this point]

Broader << cyclone  

 tropical storm marked by extremely low pressure and circular winds [definition truncated at this point]

  Broader<< mile<< statute mile<< stat mi<< land mile<< mi<< speed<< velocity  

  Narrower << circular winds<< extremely low pressure 

Other features: 

 Search for pages that link to a given page (e.g. link:http://www.ouc.bc.ca)



EXERCISES

Exercise 1 - Search Engines: Select one topic from the list below. Use at least two different search engines to search for information about your topic. Compare the results you retrieve from each. When comparing your results, consider the following points, among others:

How easy or difficult was it to figure out how to search?

Was there adequate documentation to help you formulate your search?

How many results did you retrieve?

What proportion of the results were relevant to your perceived information requirements?

How current were the results?

Was the amount of detail displayed with the results adequate?

Was the order in which the results were displayed evident and/or logical?

What other features contribute to (or detract from) the search engine's utility?


Suggested topics:
 

Internet 2

Mad Cow Disease

The Beat Generation

Tupac Amaru

Microbreweries

Links to search engines:
 

AltaVista

Excite

alltheweb

Google

HotBot

Exercise 2 - Multi-threaded search engines: Use a multi-threaded search engine to search for information about the topic you researched in Exercise 1. Compare your results with those you retrieved with an individual search engine. Also consider the same points as you did above.

Links to multi-threaded search engines:
 

Dogpile

Ixquick

Metacrawler

ProFusion

SurfWax

Exercise 3 - Subject directories:

(a) Browse the Yahoo subject categories (do not use the search function) to find information about the topic you researched in Exercise 1. Assuming you were able to find some relevant information (it is possible that you will not), consider the following points:

Which of the two methods - search engines or subject directories - do you consider was more successful or more appropriate? Why?

What are the advantages and disadvantages of each method?

(b) Use Yahoo's search form to search for information on the same topic as you did in (a).

Which method of searching Yahoo was more successful? Why?

How does searching Yahoo differ from searching a large search engine database?

(c) Browse and/or search a clearinghouse of specialized subject directories to see if there is a subject directory relevant to your topic. If so, peruse it to see if it would be useful for finding information on your topic.

Links to clearinghouses:
 

Argus Clearinghouse

About.com

WWW Virtual Library

Exercise 4 - Test your skills: Choose a topic that interests you, or one from the list below. Use one or more tools of each type discussed above to search for information about your topic. Try to incorporate some or all of the following into your searches:

Experiment with both simple and advanced interfaces

Read the search tool's documentation for instructions on how to search

Use Boolean logic, proximity operators, wildcard characters and phrase searching

Use one or more of the search tips discussed above

Where applicable, use a search engine's "more like this" feature to generate new searches

Try different methods of displaying results, including sorting where applicable


Suggested topics:

  1. What is the relationship between Mad Cow Disease and Creutzfeldt-Jakob Syndrome?

  2. Is it possible to infect your computer's hard drive with a virus by running programs over the Internet that use the Java programming language?

  3. Is there any information on the Internet on Canadian microbreweries, brew pubs, or brew on premises shops?

  4. I am hoping to see the movie The Sweet Hereafter. I would like to find reviews of the movie, any information about the novel on which it was based, and biographical information about the movie's director, Atom Egoyan.


Links to search tools: 
 

SEARCH ENGINES

AltaVista

Excite

alltheweb

Google

HotBot

MULTI-THREADED SEARCH ENGINES

Dogpile

Ixquick

Metacrawler

ProFusion

SurfWax

SUBJECT DIRECTORIES

LookSmart

Open Directory

Yahoo

CLEARINGHOUSES

Argus Clearinghouse

About.com

WWW Virtual Library


REFERENCES

This list is selective, rather than comprehensive. It contains references to documents, both online and in print, that were either helpful in preparing this workshop or that may be useful for those who want to know more about searching the World Wide Web.

Barlow, Linda. The Spider's Apprentice: A Helpful Guide to Search Engines. March 2, 2001. http://www.monash.com/spidap.html

Cohen, Laura. Searching the Internet: Recommended Sites and Search Techniques. May 23, 2001. http://library.albany.edu/internet/search.html

________. Conducting Research on the Internet. May 2001. http://library.albany.edu/internet/research.html

________. Quick Reference Guide to Search Engine Syntax. May 2, 2001. http://library.albany.edu/internet/syntax.html

________. Second Generation Searching on the Web. June 12, 2001. http://library.albany.edu/internet/second.html

Courtois, Martin P. and Michael W. Berry. "Results Ranking in Web Search Engines." Online 23.3 (May 1999). Also available: http://www.onlineinc.com/onlinemag/OL1999/courtois5.html

Elkordy, Angela. Web Searching, Sleuthing and Sifting. February 18, 2000. http://www.thelearningsite.net/cyberlibrarian/searching/ismain.html

Elliott, Kevin. Web Search. http://websearch.about.com/internet/websearch/ (About.com's guide to Web searching)

Gresham, Keith. "Surfing with a Purpose." Educom Review 33.5 (Sept./Oct. 1998): 22-29. Also available: http://www.educause.edu/ir/library/html/erm9851.html

Hock, Randolph. "Web Search Engines: (More) Features and Commands." Online 24.3 (May/June 2000): 17-24+.

Kriesel, Ronald W. Suggested Internet Research Strategies. August 19, 2000. http://www.concentric.net/~Rkriesel/Search/Strategies.shtml

Notess, Greg R. Search Engine Showdown: The Users' Guide to Web Searching. http://www.searchengineshowdown.com (Many, many pages of information related to search engines)

Salpeter, Judy. In Search of the Perfect Search Engine. February 2001. http://techlearning.com/db_area/archives/TL/200102/picksofmonth.html

Sherman, Chris. "The Future Revisited: What's New with Web Search." Online 24.3 (May/June 2000): 27-28+. Also available: http://www.onlineinc.com/onlinemag/OL2000/sherman5.html

Snow, Bonnie. "The Internet's Hidden Content and How to Find It." Online 24.3 (May/June 2000): 61-66.

Sullivan, Danny. Search Engine Watch. http://searchenginewatch.com/ (An entire Web site devoted to Internet search engines; updated constantly.)

Wighton, David. Searching FAQs. July 27, 1999. http://www.cln.org/searching_faqs.html


 

FOOTNOTES

1. Internet Exceeds 2 Billion Pages. July 10, 2000. http://www.cyveillance.com/us/newsroom/pressr/000710.asp
2. OCLC Office of Research, "Web Statistics." Web Characterization Project. http://wcp.oclc.org/stats.htm
3. This algorithm is usually based on one or more of the following criteria: the number of times the search terms appear in the document; the location of the search terms in the document (e.g., the appearance of a given word in a document's title produces a higher ranking than the appearance of the same word in the body of another document); the proximity of search terms to one another within a document; the number of times a term appears in the search engine's database.
4. For individual search engines only, as reported by Danny Sullivan, Search Engine Sizes, http://searchenginewatch.com/reports/sizes.html, April 6, 2001.
5. As of April 9, 2001, Google reports its size as 1,346,966,000 pages. According to Sullivan, this is because Google's linking technology allows it to return pages it has never indexed.


Copyright © 1996-2001 Ross Tyner. This document may be downloaded, printed or copied for non-commercial use without further permission of the author, provided the content is not modified and this statement appears at the bottom of the page. Any use not stated above requires the written consent of the author. The distribution of a copy of this document via the Internet or other electronic medium without the written permission of the author is expressly prohibited.

 

 

 


Prepared 2005-Revised 2008
Please send comments and content to be added to dularson@bellsouth.net