Search Engines - Boston University


Search Engines
•Yahoo! (http://www.yahoo.com)
•AltaVista (http://www.altavista.com)
•Infoseek (http://www.infoseek.com)
•Lycos (http://www.lycos.com)
•Magellan (http://www.mckinley.com)
•Excite (http://www.excite.com)
•WebCrawler (http://www.webcrawler.com)
•HotBot (http://www.hotbot.com)
Search Engines (cont.)
•Northern Light (http://www.nlsearch.com)
•Who Where (http://www.whowhere.com)
•PlanetSearch (http://www.planetsearch.com) (Shut Down!)
•GoTo (http://www.goto.com)
•MetaCrawler (http://www.metacrawler.com)
•MultiCrawl (http://www.multicrawl.com)
•Ask (http://www.ask.com)
•Hypermart (http://www.hypermart.com)
Search Engines (cont.)
•Google (http://www.google.com)
PC Magazines
•Dr. Dobb’s Journal (http://www.ddj.com)
•PC Magazine (http://www.pcmag.com)
•PC Week (http://www.pcweek.com)
•ZD Net (http://www.zdnet.com)
•Computer Shopper (http://www.computershopper.com)
HTML File Structure
[Figure: an HTML file pictured as a man, with a head and a body]
<HTML>
<HEAD>
Title, Creation Date, Key Words, Author, etc.
</HEAD>
<BODY>
Links, Text, Lists, Tables, Scripts, Forms, Colors
</BODY>
</HTML>
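As a minimal sketch of this structure, a complete page might look like the following. The title, author, keywords, and text are placeholders chosen for illustration; they do not come from the slides.

```html
<HTML>
<HEAD>
<!-- Information about the document; not displayed in the page itself -->
<TITLE>My First Page</TITLE>
<META NAME="author" CONTENT="A. Student">
<META NAME="keywords" CONTENT="html, example">
</HEAD>
<BODY>
<!-- Everything the browser displays goes here -->
<H1>Hello</H1>
<P>This text appears in the browser window.</P>
</BODY>
</HTML>
```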
HTML Tag
General structure:
<Tag [Attributes]> Text referred to by the current tag </Tag>
Start Tag action
End Tag action
Attributes: not required for every tag
Example: attributes are used with images, tables, fonts, etc.
Look at:
•http://cs-www.bu.edu/faculty/snyder/cs101/sample.html
•http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimerAll.html (A Beginner's Guide to HTML)
Headings
The syntax of the heading element is:
<Hy>Text of heading </Hy>
where y is a number between 1 and 6 specifying the level of the heading.
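For illustration, a short sketch with three heading levels follows. The heading texts are invented for the example. Browsers generally render level 1 as the most prominent heading and level 6 as the least prominent.

```html
<H1>Computer Science 101</H1>   <!-- level 1: the most prominent heading -->
<H2>Lecture Notes</H2>          <!-- level 2: a subsection of the level 1 heading -->
<H3>HTML Basics</H3>            <!-- level 3: a subsection of the level 2 heading -->
<P>Ordinary paragraph text follows the headings.</P>
```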
Paragraphs
First example:
<P>Welcome to the world of HTML.
This is the first paragraph.
While short it is
still a paragraph!</P>
Second example:
<P>Welcome to the world of HTML. This is the
first paragraph. While short it is still a
paragraph! </P> <P>And this is the second paragraph.</P>
Note: a paragraph can be aligned with the ALIGN attribute:
<P ALIGN=CENTER>
This is a centered paragraph.
</P>
Output (centered in the browser window):
This is a centered paragraph.
Unnumbered & Numbered Lists
To create an unnumbered list:
1.start with an opening list <UL> (for unnumbered list) tag
2.enter the <LI> (list item) tag followed by the individual item; no
closing </LI> tag is needed
3.end the entire list with a closing list </UL> tag
<UL>
<LI> apples
<LI> bananas
<LI> grapefruit
</UL>
Output:
•apples
•bananas
•grapefruit
Note: A numbered list (also called an “ordered list”, from which the tag
name derives) is identical to an unnumbered list, except it uses <OL>
instead of <UL>. The items are tagged using the same <LI> tag.
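The same three items as a numbered list can be sketched as follows. Only the list tag changes; a browser would typically number the items 1, 2, 3.

```html
<OL>
<LI> apples
<LI> bananas
<LI> grapefruit
</OL>
```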
Definition List
A definition list (coded as <DL>) usually consists of alternating
definition terms (coded as <DT>) and definition descriptions (coded as
<DD>). Web browsers generally format the description on a new line and
indent it.
Example:
<DL>
<DT> NCSA
<DD> NCSA, the National Center for Supercomputing
Applications, is located on the campus of the
University of Illinois at Urbana-Champaign.
<DT> Cornell Theory Center
<DD> CTC is located on the campus of Cornell
University in Ithaca, New York.
</DL>
Output:
NCSA
    NCSA, the National Center for Supercomputing Applications, is
    located on the campus of the University of Illinois at Urbana-Champaign.
Cornell Theory Center
    CTC is located on the campus of Cornell University in Ithaca,
    New York.
Note: Lists can also be nested.
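A small sketch of nesting, with invented items: a numbered list and an unnumbered list are each placed inside an item of an outer unnumbered list. The indentation is only for readability; browsers ignore it.

```html
<UL>
<LI> Fruit
  <OL>
  <LI> apples
  <LI> bananas
  </OL>
<LI> Vegetables
  <UL>
  <LI> carrots
  </UL>
</UL>
```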
Others:
•Preformatted text: <PRE>
  •generates text in a fixed-width font
•Extended quotations: <BLOCKQUOTE>
  •includes lengthy quotations in a separate block on the screen. Most
  browsers change the margins for the quotation to separate
  it from surrounding text.
•Forced line breaks: <BR>
  •forces a line break with no extra (white) space between lines
•Horizontal rule: <HR>
  •produces a horizontal line the width of the browser window
  •You can vary a rule's size (thickness) and width (the percentage of
  the window covered by the rule). For example:
<HR SIZE=4 WIDTH="50%">
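The four tags above might be combined as in the sketch below. The names, scores, and address lines are placeholders invented for the example.

```html
<!-- PRE keeps the spacing and line breaks exactly as typed -->
<PRE>
Name      Score
Alice        90
Bob          85
</PRE>
<!-- BLOCKQUOTE sets a quotation off from the surrounding text -->
<BLOCKQUOTE>
A long quotation goes here, set off from the surrounding text.
</BLOCKQUOTE>
<!-- BR ends a line without starting a new paragraph -->
First line of an address<BR>
Second line of an address<BR>
<!-- A rule 2 pixels thick covering three quarters of the window width -->
<HR SIZE=2 WIDTH="75%">
```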
Character formatting
<B>         bold text
<I>         italic text
<TT>        typewriter text, i.e. fixed-width font
<CENTER>    centers text
<U>         underlines text
<STRIKE>    strikes through text
<BLINK>     creates blinking text
<SUB>       subscripts text
<SUP>       superscripts text
Note:
1. Each of these tags has a corresponding end tag
2. Tags can be nested: <H1> <Blink>Text </Blink> </H1>
3. Tags can take attributes:
<H1 ALIGN=CENTER>Computer textbooks</H1>
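A few of these tags in use, as a sketch with invented sentences. Note that nested tags are closed in the reverse of the order in which they were opened.

```html
<P>Water is H<SUB>2</SUB>O, and 2<SUP>3</SUP> is 8.</P>
<P>This is <B>bold</B>, this is <I>italic</I>, and this is <B><I>both</I></B>.</P>
<P>The old price was <STRIKE>$20</STRIKE>; type <TT>dir</TT> to list files.</P>
```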
Changing the Font Size
<BASEFONT SIZE=VALUE>
Change font size for the entire document
<FONT SIZE=5> text </FONT> or
<FONT SIZE=+2> text </FONT>
Change the size of a character, word or group of words
Escape sequences
Three ASCII characters--the left angle bracket (<), the right angle
bracket (>), and the ampersand (&)--have special meanings in HTML
and therefore cannot be used "as is" in text. (The angle brackets are used
to indicate the beginning and end of HTML tags, and the ampersand is
used to indicate the beginning of an escape sequence.)
To use one of the special characters in an HTML document, you must
enter its escape sequence instead:
&lt;      the escape sequence for <
&gt;      the escape sequence for >
&amp;     the escape sequence for &
&quot;    the escape sequence for "
&reg;     the escape sequence for ®
&copy;    the escape sequence for ©
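As a short sketch, the following lines use several escape sequences. The sentences and the company name are invented for the example.

```html
<!-- Displayed as: To start a paragraph, type <P> in your file. -->
<P>To start a paragraph, type &lt;P&gt; in your file.</P>
<!-- Displayed as: Smith & Jones © -->
<P>Smith &amp; Jones &copy;</P>
```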
Linking
The chief power of HTML comes from its ability to link text and/or an
image to another document or section of a document.
HTML's single hypertext-related tag is <A>, which stands for anchor.
To include an anchor in your document:
1.start the anchor with <A (include a space after the A)
2.specify the document you're linking to by entering the parameter
HREF="filename" followed by a closing right angle bracket (>)
3.enter the text that will serve as the hypertext link in the current
document
4.enter the ending anchor tag: </A> (no space is needed before the end
anchor tag)
Here is a sample hypertext reference in a file called US.html:
<A HREF="MaineStats.html">Maine</A>
Relative Pathnames vs. Absolute Pathnames (URLs)
Which one to use depends on:
•the file location (whether the files are on the same server or on different servers)
•whether the linked files are directly related
URL Definition:
Uniform Resource Locators (URLs) are used to specify the location of
files on other servers. A URL includes the type of resource being
accessed (e.g., Web, gopher, FTP), the address of the server, and the
location of the file.
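The difference can be sketched as follows. The subdirectory name "states" and the file "index.html" are hypothetical and used only for illustration; the other file name and the URL appear elsewhere in these slides.

```html
<!-- Relative pathname: a file in the same directory as the current document -->
<A HREF="MaineStats.html">Maine</A>

<!-- Relative pathname: a file in a subdirectory (hypothetical directory name) -->
<A HREF="states/MaineStats.html">Maine</A>

<!-- Relative pathname: a file one directory above the current one (hypothetical file) -->
<A HREF="../index.html">Home</A>

<!-- Absolute pathname: a full URL, needed for a file on another server -->
<A HREF="http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimer.html">NCSA's Beginner's Guide to HTML</A>
```

Relative links keep working if a whole group of related files is moved together to another directory or server, whereas absolute links do not depend on where the linking document is located.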
Types of links
1. Link to a local file (on the same server)
2. Link to a file on another server
-Use URL
-Example:
<A HREF="http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimer.html">NCSA's Beginner's Guide to HTML</A>
3. Link to a specific section
-See next slide
4. Link to an email address
-You have to use a keyword: “mailto:”
-Examples:
<A HREF="mailto:[email protected]">
NCSA Publications Group</a>
<A HREF="mailto:[email protected]">Mail to Dan Buzan</a>
Links to Specific Sections
Links Between Sections of Different Documents
To set a link from document A (documentA.html) to a specific section
in another document (MaineStats.html), insert in documentA.html:
In addition to the many state parks, Maine is also home to
<a href="MaineStats.html#ANP">Acadia National Park</a>.
Next, create the named anchor (in this example "ANP") in
MaineStats.html:
<H2><A NAME="ANP">Acadia National Park</a></H2>
Links to Specific Sections(2)
Links to Specific Sections within the Current Document
To create a link to the ANP anchor from within MaineStats, enter:
...More information about
<A HREF="#ANP">Acadia National Park</a>
is available elsewhere in this document.
Be sure to include the <A NAME=> tag at the place in your document
where you want the link to jump to (<A NAME="ANP">Acadia
National Park</a>).
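Putting the two pieces together, MaineStats.html might be sketched as below. The page title and the park text are placeholders; the anchor name "ANP" is the one used above.

```html
<HTML>
<HEAD><TITLE>Maine Statistics</TITLE></HEAD>
<BODY>
<!-- The link: "#ANP" refers to a named anchor in this same file -->
<P>More information about
<A HREF="#ANP">Acadia National Park</A>
is available elsewhere in this document.</P>

<!-- The named anchor: the place the link jumps to -->
<H2><A NAME="ANP">Acadia National Park</A></H2>
<P>Text about the park goes here.</P>
</BODY>
</HTML>
```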
Images
To include an inline image, enter:
<IMG SRC=ImageName>
where ImageName is the URL of the image file
Image size attributes: HEIGHT and WIDTH tell the browser the size of the image in pixels
Example:
<IMG SRC=SelfPortrait.gif HEIGHT=100 WIDTH=65>
Aligning text with an image
1 - <IMG SRC="BarHotlist.gif" ALT="[HOTLIST]" ALIGN=TOP>
2 - <IMG SRC="BarHotlist.gif" ALT="[HOTLIST]" ALIGN=ABSMIDDLE>
3 - <IMG SRC="BarHotlist.gif" ALT="[HOTLIST]" ALIGN=MIDDLE>
Images without text
<p ALIGN=CENTER>
<IMG SRC = "BarHotlist.gif" ALT="[HOTLIST]">
</p>
Images as Hyperlinks
<A HREF="hotlist.html"><IMG SRC="BarHotlist.gif"
ALT="[HOTLIST]"></A>
Note: To show the image link without a border:
<A HREF="hotlist.html"><IMG SRC="BarHotlist.gif" BORDER=0
ALT="[HOTLIST]"></A>
Background Graphics
If you want to use an image as background:
<BODY BACKGROUND="filename.gif"> or
<BODY BACKGROUND="filename.jpg">
If you want only simple colors:
<BODY BGCOLOR="#000000" TEXT="#FFFFFF" LINK="#9690CC">
Other attributes: VLINK (color of visited links), ALINK (color of a link while it is being selected)
Colors:
Black:      #000000
White:      #FFFFFF
Green:      #00FF00
Red:        #FF0000
Tan:        #DEB887
Magenta:    #FF00FF
Yellow:     #FFFF00
More information regarding colors:
http://www.hidaho.com/colorcenter/
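A sketch of a BODY tag that sets all five colors, using values from the list above. The particular color choices are arbitrary and only illustrate the attributes.

```html
<!-- White background, black text, red links, magenta visited links,
     green for a link while it is being selected -->
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#FF0000" VLINK="#FF00FF" ALINK="#00FF00">
<P>Black text on a white background, with a
<A HREF="hotlist.html">red link</A>.</P>
</BODY>
```

Each color is written as three two-digit hexadecimal numbers giving the amounts of red, green, and blue, so #FF0000 is full red with no green or blue.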
Tables
Elements:
•<TABLE> ... </TABLE>        defines a table in HTML
•<CAPTION> ... </CAPTION>    defines the caption for the title of the table
•<TR> ... </TR>              specifies a table row within a table
•<TH> ... </TH>              defines a table header cell
•<TD> ... </TD>              defines a table data cell
<table BORDER WIDTH="450" >
<tr>
<th COLSPAN="2">
<h4>
<a NAME="TT"></a><font face="arial,helvetica">Table Attributes</font>
</h4> </th> </tr>
<tr>
<td ALIGN=LEFT COLSPAN="2"><b>NOTE:</b> Attributes defined within
<tt>&lt;TH></tt>... <tt>&lt;/TH></tt> or <tt>&lt;TD></tt> ... <tt>
&lt;/TD></tt> cells override the default alignment set in a <tt>&lt;TR></tt> ...
<tt>&lt;/TR></tt>.</td> </tr>
<tr> <th>Attribute</th> <th>Description</th> </tr>
<tr VALIGN=TOP> <td><tt>ALIGN (LEFT, CENTER, RIGHT)</tt></td>
<td>Horizontal alignment of a cell.</td> </tr>
...
</table>
General table format
<TABLE>                                      <!-- start of table definition -->
<CAPTION> caption contents </CAPTION>        <!-- caption definition -->
<TR>                                         <!-- start of header row definition -->
<TH> first header cell contents </TH>
<TH> last header cell contents </TH>
</TR>                                        <!-- end of header row definition -->
<TR>                                         <!-- start of first row definition -->
<TD> first row, first cell contents </TD>
<TD> first row, last cell contents </TD>
</TR>                                        <!-- end of first row definition -->
<TR>                                         <!-- start of last row definition -->
<TD> last row, first cell contents </TD>
<TD> last row, last cell contents </TD>
</TR>                                        <!-- end of last row definition -->
</TABLE>                                     <!-- end of table definition -->
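Filled in with concrete contents, the general format might look like the sketch below. The caption, items, and prices are invented for the example.

```html
<TABLE BORDER>
<CAPTION>Fruit prices</CAPTION>
<TR> <TH>Fruit</TH> <TH>Price</TH> </TR>                   <!-- header row -->
<TR> <TD>apples</TD> <TD ALIGN=RIGHT>1.20</TD> </TR>       <!-- first row -->
<TR> <TD>bananas</TD> <TD ALIGN=RIGHT>0.50</TD> </TR>      <!-- last row -->
</TABLE>
```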
Frames - Why use them?
The main idea behind a framed document is that you can
split up the browser window into two or more regions called
frames. Once this is done, you can load separate HTML documents
into each frame and allow users to see different pages
simultaneously.
Each frame has its own scrollbars in case the document is
too big to fit in the allocated space.
Frame - Example
[Figure: a frame document (an HTML file) whose window is divided between
Sub Document 1 and Sub Document 2, each an ordinary HTML file]
Frames - example
My name is green.html :-) and I decide how these two
rectangles are displayed on the screen!
I want to divide the screen in two: a left part and a right part
•the left part will be the blue rectangle; it will occupy half of the
screen and its name will be blue.html
•the right part will be the red rectangle; it will occupy the remaining
part of the screen and its name will be red.html
My name is blue.html
I will be on the left part of
the screen (because this is
what green.html wants!)
My name is red.html
I will be on the right part
of the screen (because
this is what green.html wants!)
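One way green.html could be written to produce this layout is sketched below. The frame names "left" and "right" are assumptions added for illustration; the file names come from the description above.

```html
<HTML>
<HEAD>
<TITLE>green.html</TITLE>
</HEAD>
<!-- Two columns: the first takes half of the window, the second takes the rest -->
<FRAMESET COLS="50%,*">
<FRAME SRC="blue.html" NAME="left">
<FRAME SRC="red.html" NAME="right">
</FRAMESET>
</HTML>
```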
Frames - How to use them?
A frame document is a simple HTML file that can be created with a text editor.
Example:
<HTML>
<HEAD>
<TITLE>Frame Example</TITLE>
</HEAD>
<FRAMESET cols="155,*">
<FRAME SRC="btbar.html" NAME="btbar" marginwidth="0"
marginheight="0" scrolling="auto">
<FRAME SRC="page.html" NAME="main" marginwidth="0"
marginheight="0" scrolling="auto">
</FRAMESET>
</HTML>
Frames - Tags
FRAMESET
Description
The FRAMESET element is used instead of the BODY element. It
is used in an HTML document whose sole purpose is to define the
layout of the sub-HTML documents, or Frames, that will make up
the page. The ROWS and COLS values are comma-separated lists
describing the row-heights and column-widths of the Frames.
Minimum Attributes
<FRAMESET>characters... </FRAMESET>
All Possible Attributes
<FRAMESET ROWS="..." COLS="...">characters...
</FRAMESET>
Frame - Tags
FRAME
Description
The FRAME element defines a single frame in a frameset.
Minimum Attributes
<FRAME>
All Possible Attributes
<FRAME SRC="..." NAME="..." MARGINWIDTH="..."
MARGINHEIGHT="..." SCROLLING=yes|no|auto
NORESIZE>
<FRAME> TAG Attributes
•MARGINHEIGHT=n - Specifies the amount of white space to be
left at the top and bottom of the frame
•MARGINWIDTH=n - Specifies the amount of white space to be
left along the sides of the frame
•NAME="name" - Gives the frame a unique name so it can be
targeted by other documents
•NORESIZE - Disables the user's ability to resize the frame
•SCROLLING=YES|NO|AUTO - Controls the appearance of
horizontal and vertical scrollbars in the frame
•SRC="url" - Specifies the URL of the document to load into the
frame
Frame - Examples
1. Three rows
<FRAMESET ROWS="40%,15%,45%">
...
</FRAMESET>
2. Four columns
<FRAMESET COLS="150,100,3*,*">
...
</FRAMESET>
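To see how the mixed values in the second example are resolved, here is a worked sketch. The window width of 650 pixels and the four file names are assumptions chosen to make the arithmetic easy.

```html
<!-- Assume the browser window is 650 pixels wide (an illustrative figure). -->
<!-- The fixed columns take 150 + 100 = 250 pixels, leaving 400 pixels. -->
<!-- "3*" and "*" share the remaining 400 pixels in a 3:1 ratio. -->
<FRAMESET COLS="150,100,3*,*">
<FRAME SRC="col1.html">   <!-- 150 pixels -->
<FRAME SRC="col2.html">   <!-- 100 pixels -->
<FRAME SRC="col3.html">   <!-- 3/4 of the remaining width: 300 pixels -->
<FRAME SRC="col4.html">   <!-- 1/4 of the remaining width: 100 pixels -->
</FRAMESET>
```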
Frame - Examples (2)
3. 6 regions
<FRAMESET ROWS="33%,33%,33%">
<FRAMESET COLS="50%,50%">
<!-- Split Row 1 into two columns -->
...
</FRAMESET>
<FRAMESET COLS="50%,50%">
<!-- Split Row 2 into two columns -->
...
</FRAMESET>
<FRAMESET COLS="50%,50%">
<!-- Split Row 3 into two columns -->
...
</FRAMESET>
</FRAMESET>
Frame - Examples (3)
4. 3 regions
<FRAMESET ROWS="80,*"> <!-- Split screen into two rows. -->
<FRAME SRC="banner.html">
<FRAMESET COLS="175,*"> <!-- Split row 2 into two
columns. -->
<FRAME SRC="table_of_contents.html">
<FRAME SRC="changing_content.html">
</FRAMESET>
</FRAMESET>
Frame - Examples (4)
<FRAMESET ROWS="80,*"> <!-- Split screen into two rows. -->
<FRAME SRC="banner.html">
<FRAMESET COLS="175,*"> <!-- Split row 2 into two
columns. -->
<FRAME SRC="table_of_contents.html">
<FRAME SRC="changing_content.html" NAME="main">
</FRAMESET>
</FRAMESET>
With the frames defined as above, a link in the
file "table_of_contents.html" might look like:
<A HREF="software/index.html" TARGET="main">
Software Products </A>
Tip
Some browsers may not know how to display frames.
Solution?
Provide an alternative “page” between the <NOFRAMES> and </NOFRAMES>
tags.
The <NOFRAMES> and </NOFRAMES> tags must occur after the
initial <FRAMESET> tag, but before any nested <FRAMESET>
tags.
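A sketch of such an alternative page, based on the earlier "Frame Example" document. The fallback text is invented; the file names are the ones used in that example.

```html
<HTML>
<HEAD>
<TITLE>Frame Example</TITLE>
</HEAD>
<FRAMESET COLS="155,*">
<!-- Shown only by browsers that cannot display frames -->
<NOFRAMES>
<BODY>
<P>This page uses frames. The main content is also available at
<A HREF="page.html">page.html</A>.</P>
</BODY>
</NOFRAMES>
<FRAME SRC="btbar.html" NAME="btbar">
<FRAME SRC="page.html" NAME="main">
</FRAMESET>
</HTML>
```

Browsers that support frames ignore everything between the NOFRAMES tags, so both kinds of browser get a usable page from the same file.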
Bibliography:
1. Using HTML 3.2, Java 1.1, and CGI - Written by Eric Ladd and
Jim O'Donnell, McMillan Publishing
2. HTML Reference Manual (An old edition)