Mary McKeon, MSLS |
|
Information Services Librarian & Head of
Circulation |
|
617-638-4253, mamckeon@bu.edu |
|
|
|
|
|
|
|
|
Examine the Alumni Medical Library’s
multi-functional website |
|
Improve
information retrieval skills using the WWW |
|
Practice search strategies using general search
engines, web directories, and meta-search engines |
|
Use established ‘virtual library’ collections as
an alternative strategy for locating resources on the WWW |
|
Evaluate Web sites for validity, source,
content, and currency of information |
|
|
|
|
|
|
The website serves as an electronic
representation of the physical library |
|
Expands access to library services &
resources from remote locations |
|
Supports reference activities |
|
Supports user education/curriculum support
activities |
|
Supports outreach grants |
|
Serves as a department newsletter |
|
|
|
|
|
|
Created in 1989 by Tim Berners-Lee and his
colleagues at CERN, a physics laboratory in Switzerland |
|
|
|
Their goal was to provide shared documents and
graphics more easily on the Internet |
|
|
|
|
World Wide Web Gopher, FTP, HTTP,
telnet, electronic mail, etc. |
|
Hypertext transfer protocol (HTTP) allows for
text AND hypermedia |
|
Documents are usually written in hypertext
markup language (HTML) |
|
Data is requested by a client and provided by a server --
any computer can be a client and/or server |
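
The client/server exchange described above can be sketched with Python's standard library. This is only an illustration: the handler name, the page content, and the use of a local port are assumptions made for the demo, not part of the original presentation. One process plays both roles here, which also shows that the same computer can act as client and server.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical HTML document served by the demo server.
PAGE = b"<html><head><title>Demo</title></head><body>Hello, WWW</body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET request with the same HTML page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the demo quiet


def fetch_from_local_server():
    """Start a server on a free local port, request "/" as a client,
    and return (status code, response body)."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: OS picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/")          # the client asks for a document
        response = conn.getresponse()     # the server provides it
        body = response.read()
        conn.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_from_local_server()` returns the HTTP status code together with the raw HTML bytes that a browser would go on to render.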
|
|
|
|
Browsers navigate the WWW |
|
|
|
Browsers are simple to use because one user
application communicates with many servers, and one server can support many
user interfaces |
|
|
|
Mosaic was released in 1993 by the National
Center for Supercomputing Applications (NCSA) as the first widely used
graphical WWW browser program |
|
|
|
Netscape |
|
Internet Explorer |
|
|
Finding Information on the Web: |
|
About Search Engines |
|
|
|
Acknowledgements: |
|
Search Engine Watch
http://searchenginewatch.com/ |
|
How do you find WWW pages/sites? |
|
other web pages 88% |
|
search engines 82% |
|
Internet directories (e.g., Yahoo!) 65% |
|
print media 62% |
|
friends 58% |
|
email signatures 33% |
|
TV advertising 37% |
|
|
|
|
Professional journals such as the Bulletin of
the Medical Library Association |
|
Professional news sources such as College &
Research Libraries News |
|
Specialty publications such as Medicine on the
Net |
|
Search MEDLINE for medical journal articles
identifying and evaluating topical websites |
|
|
|
|
|
|
|
|
The term "search engine" is often used
generically to describe both search engines and Web directories. |
|
|
|
They are not the same. The difference is how listings are compiled. |
|
|
“True” search engines are created by
machines. Databases are built by
“robots” or computer programs that
roam the WWW finding sites new to their home database, updating old ones
and deleting obsolete sites. |
|
|
|
A spider visits a web page, reads it, and then
follows links to other pages within the site (this process is called being
"spidered" or "crawled") |
|
|
|
The spider returns to the site on a regular
basis to look for changes and this may affect how sites are listed and
retrieved. |
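
The spider behaviour described above (visit a page, read it, follow links to other pages within the site) can be sketched as follows. This is a minimal sketch under stated assumptions: the three-page `SITE` dictionary and its URLs are hypothetical and held in memory so that the example runs without a network connection, and a real spider would fetch pages over HTTP and revisit them on a schedule.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical three-page site, stored in memory instead of on the Web.
SITE = {
    "http://example.org/": '<a href="/hours.html">Hours</a> '
                           '<a href="http://other.example/">Elsewhere</a>',
    "http://example.org/hours.html": '<a href="staff.html">Staff</a> <a href="/">Home</a>',
    "http://example.org/staff.html": "<p>No links here</p>",
}


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch=SITE.get):
    """Visit start_url, read it, then follow links to pages on the same host.
    Returns the URLs in the order they were visited."""
    host = urlparse(start_url).netloc
    seen, queue, visited = {start_url}, deque([start_url]), []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page is gone; a real spider would drop it from its database
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return visited
```

Starting from `http://example.org/`, the crawl reaches the hours and staff pages but never follows the link to `other.example`, because that link leaves the site.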
|
|
|
|
|
|
Page titles, body copy and other elements play a
role |
|
|
|
The software sifts through the millions of site
records in the index to find matches to a search request |
|
|
|
The software also ranks the retrieved sites by
relevancy |
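
The index lookup and ranking steps above can be illustrated with a very small inverted index. The page records, URLs, and the scoring rule (count of query-word occurrences) are assumptions chosen for illustration; actual search engines use their own, usually undisclosed, formulas.

```python
import re
from collections import defaultdict

# Hypothetical page records: URL -> (page title, body copy).
PAGES = {
    "lib/home": ("Alumni Medical Library", "Library hours, services and resources"),
    "lib/medline": ("Searching MEDLINE", "MEDLINE search tips from the library"),
    "cdc/hiv": ("HIV/AIDS Surveillance", "Surveillance reports and statistics"),
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the pages containing it and how often it occurs there."""
    index = defaultdict(lambda: defaultdict(int))
    for url, (title, body) in pages.items():
        for word in tokenize(title + " " + body):
            index[word][url] += 1
    return index


def search(index, query):
    """Return the pages that contain every query word, most occurrences first."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], {}))
    for word in words[1:]:
        matches &= set(index.get(word, {}))

    def score(url):
        return sum(index[word][url] for word in words)

    return sorted(matches, key=lambda url: (-score(url), url))
```

Note that `search` never touches the pages themselves; it only consults the index built earlier, which is why a page that changed after the last spider visit can be ranked on stale content.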
|
|
|
|
|
|
|
|
|
A directory such as Yahoo! depends on humans for
its listings |
|
Site creators submit a description to the
directory for the entire site, or editors write one for sites they review |
|
A search looks for matches only in the submitted
descriptions |
|
Directories usually have much smaller databases
than true or hybrid search engines |
|
|
|
|
|
(both engine and directory) |
|
These days, almost every search tool is part
engine, part directory |
|
Being included in a search engine's directory is
usually a combination of luck and quality |
|
Site producers can “submit” their sites for
review, but there is no guarantee that they will be included in
directories. |
|
|
|
|
|
|
Meta-search engines do not maintain databases of
their own; they act as “middle agents” |
|
|
|
Transmit your search query simultaneously to
multiple search engines |
|
|
|
Search results represent a compilation of
results from all engines queried |
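
The "middle agent" idea above can be sketched in a few lines. The two engine functions below are hypothetical stand-ins that return fixed result lists; a real meta-search engine would send the query to remote services and would need to translate it into each engine's own syntax.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for real engines: each takes a query and returns ranked URLs.
def engine_a(query):
    return ["http://a.example/1", "http://shared.example/"]


def engine_b(query):
    return ["http://shared.example/", "http://b.example/2"]


def meta_search(query, engines):
    """Send one query to every engine simultaneously and compile the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        # pool.map returns the result lists in the same order as `engines`.
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    merged, seen = [], set()
    for results in result_lists:
        for url in results:
            if url not in seen:  # drop duplicates reported by more than one engine
                seen.add(url)
                merged.append(url)
    return merged
```

The meta-search function holds no database of its own: everything it returns comes from the engines it queried.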
|
|
|
|
|
Useful for saving time in searching multiple
engines at once |
|
|
|
Useful for obtaining an overview of “what’s out
there” |
|
|
|
Beware:
if you enter a complex search strategy, not all of the engines
searched may be able to interpret it |
|
|
|
Try these meta-search engines: |
|
MetaCrawler |
|
Inference Find |
|
Metafind |
|
Ask Jeeves |
|
|
|
|
|
|
|
|
|
|
|
|
A search engine searches the contents of its
database -- not the World Wide Web directly |
|
|
|
None of these databases includes all the WWW
pages in existence, so results vary |
|
|
|
Each database has different features |
|
|
|
|
|
|
Some Web search databases are maintained with
little human evaluation (true search engines) |
|
|
|
In others, sites are hand-picked and evaluated
or reviewed (directories) |
|
|
|
Some search tools do both (hybrid search
engines) |
|
|
|
Search tools vary in features, size and
comprehensiveness |
|
|
|
|
|
Title words are assumed to be most relevant |
|
|
|
Keywords appearing near the top of a web page
are assumed to be relevant |
|
|
|
Assumes that any page relevant to the search
term will mention those words right from the beginning |
|
|
|
|
|
Frequency of keywords |
|
|
|
The result is that no two search engines have
exactly the same collection (database) of web pages to search |
|
|
|
Search engines may also give a web page a
relevancy boost if it has a lot of links pointing to
it or if it has been favorably reviewed |
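
The relevancy signals listed above (title words, keywords near the top of the page, keyword frequency, and a boost for pages that many others link to) can be combined into a single score. All weights, the `top_words` cutoff, and the cap on the link boost are illustrative assumptions; no actual engine's formula is implied.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevancy(term, title, body, inbound_links=0, top_words=25):
    """Score one page for one search term. Weights are illustrative only."""
    term = term.lower()
    body_words = tokenize(body)
    score = 0.0
    if term in tokenize(title):
        score += 3.0  # title words are assumed to be most relevant
    if term in body_words[:top_words]:
        score += 2.0  # keyword appears near the top of the page
    if body_words:
        score += body_words.count(term) / len(body_words)  # keyword frequency
    score += 0.1 * min(inbound_links, 10)  # capped boost for well-linked pages
    return score
```

A page that mentions the term in its title and opening words scores far higher than one that mentions it once, deep in the body, which is the assumption the slides describe.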
|
HotBot searches all the pages on a particular site
(not just the main page or the actual homepage) |
|
HotBot uses the first few words of the page for its descriptions |
|
HotBot’s database is probably compiled from the first few
words (usually the first 100) of each page’s text
(which means that if the search terms appear 150
words into the page, that particular page won’t be retrieved) |
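
The consequence of indexing only the opening words of a page can be shown with a small model. This sketch models the hypothesis stated on the slide, not HotBot's documented behaviour; the function names, the sample pages, and the 100-word limit are assumptions.

```python
def build_truncated_index(pages, limit=100):
    """Index only the first `limit` words of each page's text."""
    return {url: set(text.lower().split()[:limit]) for url, text in pages.items()}


def lookup(index, term):
    """Return the URLs whose indexed words include the term."""
    return sorted(url for url, words in index.items() if term.lower() in words)


# Hypothetical pages: the term appears first on one page, 150 words in on the other.
SAMPLE_PAGES = {
    "early": "medline " + "filler " * 200,
    "late": "filler " * 150 + "medline",
}
```

With these sample pages, `lookup(build_truncated_index(SAMPLE_PAGES), "medline")` finds only the page that mentions the term at the start; the other page is never retrieved even though it contains the word.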
|
|
|
|
|
|
HotBot does not appear to search headings or graphical text
(if it did, it would have retrieved the
library’s homepage) |
|
It’s important to understand what the search
engine is actually doing. |
|
|
|
It’s important to recognize that no two engines
work exactly alike. |
|
|
|
The more you know about how a search engine
works, the better able you will be to manipulate it to its fullest
advantage. |
|
|
|
But most search engines don’t readily explain
what and how they are searching. |
|
|
|
|
|
|
|
DO become familiar with one or two favorite
search tools and learn to use their advanced features |
|
|
|
DO enter singular terms -- many search engines
will find substrings: |
|
searching for game will usually retrieve games too |
|
|
|
DO NOT expect these features to replicate the
kind of precision you’d find within a bibliographic database |
|
|
|
DO use collections that have been organized and
quality-filtered by libraries & other organizations |
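
The tip above about entering singular terms relies on substring matching, which can be illustrated as follows. The function name and sample documents are hypothetical, and not every engine matches substrings this way.

```python
def substring_search(term, documents):
    """Return the documents in which `term` occurs anywhere,
    even inside a longer word."""
    needle = term.lower()
    return [doc for doc in documents if needle in doc.lower()]


# Hypothetical document titles used to try the function out.
DOCS = ["Board games for children", "The game of chess", "Gaming news"]
```

Searching `DOCS` for `game` retrieves both the "games" and the "game" titles, while searching for `games` retrieves only the first, which is why the singular form casts the wider net.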
|
|
|
|
|
|
|
|
|
WWW is highly unstructured and unorganized: |
|
|
|
No thesaurus or controlled vocabulary is used |
|
No indexing process occurs |
|
No standardization in the types of materials
that are mounted on the Web |
|
No quality controls or review process when WWW
sites are mounted |
|
Each search engine’s database works differently
and is developed based on different criteria -- no uniformity regarding
what parts of a Web page the engine is searching |
|
|
|
|
|
|
|
|
|
|
Instead of doing a “cold” search in a search
engine, think about the information another way: |
|
|
|
First, think about information in terms of category,
then find a site that fits that category. |
|
|
|
|
Information need:
HIV/AIDS surveillance reports |
|
|
|
Who might produce or distribute that
information? |
|
U.S. government agency |
|
Centers for Disease Control |
|
|
|
How to approach the search? |
|
Go to the CDC’s Web site and look for
“surveillance reports” |
|
|
|
|
Information need:
latest HIV/AIDS treatments |
|
Who might produce or distribute that
information? |
|
A variety of different places, depending on
your perspective! |
|
Is the information for: |
|
a researcher? |
a social worker? |
an administrator? |
a patient? |
a caregiver? |
a partner or loved one? |
|
|
|
|
|
|
|
How to approach the search engine? |
|
Instead of searching for “Ryan White” or
“RFP”, search for the “Boston Department of Public Health AIDS Information
Service” |
|
|
|
|
|
|
Instead of doing a “cold” search in a search
engine, use the virtual libraries compiled by: |
|
libraries |
|
professional organizations and associations |
|
government agencies |
|
city, county, & state agencies |
|
|
|
|
|
|
Once you find a Web site, you have to determine
whether the information is relevant. |
|
|
|
What are some of the criteria
that you could use to evaluate Internet
resources? |
|
|
|
|
Criterion #1:
Content |
|
|
|
Accuracy |
|
Disclaimer |
|
Completeness |
|
|
|
|
|
|
Criterion #2:
Credibility |
|
|
|
A site should display the name & logo of the
institution responsible for the information, as well as the names of its
authors. Disclosing sponsorship can
help users assess the motivations of information providers and potential
conflicts of interest. |
|
|
|
|
Criterion #3:
Currency |
|
|
|
The date of the original document on which the
information is based and the date of posting on the Web help users
judge timeliness. |
|
|
|
|
|
Criterion #4:
Site Evaluation |
|
|
|
Sites should indicate whether the information
provided has been subject to review |
|
|
|
Is the site fact-checked or verified in some
way? |
|
Is the information accurate and factual? |
|
Or, is the site sponsored by the agency that
produces the informational content? |
|
|
|
|
|
|
|
Criterion #5:
Design, Software requirements |
|
|
|
Do you find the perfect site only to find that
your computer doesn’t have the appropriate software to view/manipulate the
site? |
|
Does your browser alter the appearance of the
page? |
|
Can you tell whether the software has limited
the amount of information on the page? |
|
Does the site have a “text only” version for
low-level browsers? |
|
|
|
|
Criterion #6:
Purpose, Target Audience, Point-of-View |
|
The best Web sites are clearly focused on their
purpose and target audience |
|
The point-of-view or agenda should be stated or
made obvious. |
|
The purpose of the site should be clearly
stated, and the information provided should be appropriate to that purpose
or mission. |
|
|
|
|
Criterion #7:
Disclosure, Profiling, Confidentiality |
|
|
|
Web sites request and use information for
purposes of which the user may be unaware. |
|
|
|
Users must be informed if any information about
them is gathered or used by the Web site. |
|
|
|
|
Criterion #8:
Internal Search Capabilities |
|
|
|
An internal search engine with an easy user
interface is highly desirable. It
should be capable of keyword or search string searching. |
|
|
|
|
|
|
|
|
|
Criterion #9:
Evaluation of Quality of Links |
|
|
|
The person(s) responsible for link selection
should have the expertise and credentials to critically evaluate the
links’ appropriateness. |
|
The site “architecture” or design of pointers to
linked sites is important for ease of navigation. |
|
The content of links should be accurate,
current, credible, and relevant. The
content of the originating site is enhanced if it includes links to
high-quality sites. |
|
|
|
|
|
|
|
|
|
|
|
Criterion #10:
Style & functionality |
|
|
|
Is the site organized clearly and logically? |
|
Is the site well-written? |
|
Is the site easy to navigate? |
|
Do the links work? |
|
Does the site have an internal search engine? |
|
|
|
|
|
Blurred distinction between advertising and the
actual information / Infomercials |
|
|
|
Is the advertising provided by same organization
that provides informational content? |
|
Does advertising bias informational content? |
|
“Infomercial” Web sites |
|
Is informational content mixed with
entertainment or advertising? |
|
|
|
|
|
|
|
Web pages out-of-context |
|
|
|
Does a search land you in the middle of a site,
so that you don’t know its origin or intended audience? |
|
Always return to the site’s “home” to determine
its source. |
|
|
|
|
|
|
|
Instability |
|
|
|
Does a favorite site disappear or move without
notice? |
|
Try to determine the stability of a site before
linking to it or becoming reliant on it. |
|
Document the URL, producer or location of the
site so that you can locate it later. |
|
|
|
|
|
Site alterations, updates |
|
|
|
Does a site suddenly change? |
|
Is information moved around? |
|
Is the site altered without notice? |
|
Is the information archived? |
|
If this is the case, attempt to verify
information using other sources. |
|
|
|
|
|
“Teasers” & limited free-of-charge access |
|
|
|
Does a site contain only “teasers” -- leading
you to think the information is comprehensive when it actually is not? |
|
Does a formerly “free” site suddenly require a
fee-based subscription? |
|
Are certain sections or pages of a site
restricted to paying customers only? |
|
|
|
|
|
Privacy & confidentiality |
|
|
|
Is the information you input about yourself
confidential? |
|
Does a site “sell” your email address to
advertisers? |
|
Does a site require registration? If so, how do you determine what is done
with the information you’ve provided? |
|