LIS650 lecture 4 Thomas Krichel 2003-12-06

LIS650 lecture 4
Thomas Krichel
2003-12-06
today
• CSS Properties
– Box properties
– Table properties
– (Audio properties)
– Paged properties
– List properties
– Classification properties
– Generated content properties
• Nielsen on site design
• http
• Information architecture
• Semantic web
the 'inherit' value
• Each property can have the 'inherit' value. In this
case, the value of the property for the tag is
determined by the containing tag.
• Sometimes, 'inherit' is the default value.
validating CSS
• The W3C CSS validator is at http://jigsaw.w3.org/css-validator/
• check your style sheet there when you wonder
why the damn thing does not work.
• Note that checking the style sheet will not be
part of the assessment of the web site.
the box model
• It derives from the assumption that there is a
conceptual box around the element contents.
• The total width that the box takes up is
the sum of
– the left and right margin
– the left and right border width
– the left and right padding
– the width of the box's contents
• A similar reasoning holds for the height of a box.
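The sum above can be written out as a small calculation. This is only a sketch: the function name and the sample numbers are made up for illustration, and all values are assumed to be in the same unit (say, pixels).

```python
def total_box_width(content, padding_left=0, padding_right=0,
                    border_left=0, border_right=0,
                    margin_left=0, margin_right=0):
    """Return the total horizontal space a box takes up.

    All arguments are assumed to use the same unit (for example px).
    """
    return (margin_left + border_left + padding_left
            + content
            + padding_right + border_right + margin_right)


# Hypothetical box: 100 wide contents, 10 padding, 2 border, 5 margin per side.
example_width = total_box_width(100, 10, 10, 2, 2, 5, 5)
print(example_width)
```

The same function works for the height if you pass the top and bottom values instead.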
box properties I
• {border-color: } can hold up to four colors,
separated by blanks
– one value means: all borders have the same color
– two values mean: first color for top and bottom,
second for left and right
– three values mean: first sets top, second left and
right, third bottom
– four values mean: first sets top, second sets right,
third sets bottom, fourth sets left
• {border-width: } can hold up to four widths, for
example "thin thick medium 2mm"
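The one-to-four value rule can be sketched as a small function that expands a shorthand value into the four sides. The function name is hypothetical, and the sketch assumes simple keyword or length values without spaces inside them (so not something like rgb(0, 0, 0)).

```python
def expand_box_shorthand(value):
    """Expand a 1-4 value shorthand into top/right/bottom/left."""
    parts = value.split()
    if len(parts) == 1:
        top = right = bottom = left = parts[0]
    elif len(parts) == 2:
        top = bottom = parts[0]      # first value: top and bottom
        right = left = parts[1]      # second value: left and right
    elif len(parts) == 3:
        top = parts[0]
        right = left = parts[1]
        bottom = parts[2]
    elif len(parts) == 4:
        top, right, bottom, left = parts
    else:
        raise ValueError("expected one to four values")
    return {"top": top, "right": right, "bottom": bottom, "left": left}


print(expand_box_shorthand("thin thick medium 2mm"))
```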
box border properties
• {border-style: }, {border-top-style: }, {border-right-style: }, {border-bottom-style: } and {border-left-style: } take the following values
– none: No border. {border-width:} becomes zero
– hidden: Same as 'none', except in terms of border conflict resolution
– dotted: The border is a series of dots.
– dashed: The border is a series of short line segments.
– solid: The border is a single line segment.
– double: The border is two solid lines.
– groove: The border looks as though it were carved into the canvas.
– ridge: The border looks as though it were coming out of the canvas.
– inset: The border makes the box look as though it were embedded in the canvas.
– outset: The border makes the box look as though it were coming out of the canvas.
box properties II
• {border-top-width: }, {border-bottom-width: },
{border-left-width: } and {border-right-width: }
also exist.
• the same properties exist for {margin-top: },
{margin-bottom: } etc and {padding-top: },
{padding-bottom: } etc.
• {float: } can be one of 'left', 'right' or 'none' which
is the default. If a float is set, the text near the
tag floats on the left or right side of the tag
contents. You can use this to create run-in
headers.
box properties III
• {width: } sets the width of the box's contents
• {height: } sets the height of the box's contents
• both take a dimension or the word 'auto' e.g.
img {width: 100px; height: auto}
{position:}
• 'static'
The box is a normal box, laid out according to the
normal flow.
• 'relative' The box's position is calculated according to the
normal flow. Then it is offset relative to its normal position.
The position of the following box is not affected.
• 'absolute' The box's position (and possibly size) is specified
with the {left:}, {right:}, {top:}, and {bottom:} properties that
specify offsets with respect to the box's containing tag. There
is no effect on sibling boxes.
• 'fixed'
The box's position is calculated according to the
'absolute' model, but the reference is not the containing tag
but:
• For continuous media, the box is fixed with respect to the viewport
• For paged media, the box is fixed with respect to the page
properties with {position:}
• {top:}, {right:}, {bottom:}, {left:} set offsets if
positioning is relative, absolute or fixed.
• They can take length values, percentages, and
'auto'.
• the effect of 'auto' depends on which other
properties have been set to 'auto'
box properties V
• {z-index: } lets you set an integer value for a
layer on the canvas where the tag will appear.
• Thus if tag 1 has z-index value 1 and tag 2 has
z-index value 2, tag 2 is laid on top of tag 1
where they overlap.
• (this is the same thing as the "layer" from
Photoshop)
• browser support for this property is limited.
box properties VI
• the {clear: } property tells the user agent whether
to place the current element next to a floating
element or on the next line below it.
– value 'none' tells the user agent to put contents on
either side of the floating element
– value 'left' means that the left side has to stay clear
– value 'right' means that the right side has to stay clear
– value 'both' means that both sides have to stay clear
box properties VII
• The {visibility: } property sets the visibility of a
tag. It takes values
– 'visible' The generated box is visible.
– 'hidden' The generated box is invisible (fully
transparent), but still affects layout.
– 'collapse' The tag collapses in the table. If used on
elements other than rows or columns, 'collapse' has
the same meaning as 'hidden'.
• With this you can do sophisticated alignments
box properties VII
• The {clip:} and {overflow:} properties let you specify how
large the box of contents is. Example
p {overflow: hidden; clip: rect(15px, -10px, 5px, 10px)}
• when the {overflow: } property is not set to 'hidden', {clip: }
has no effect.
• otherwise, the start of the paragraph is displayed in the
rectangular box.
• {overflow:} can also take value ‘scroll’ to add a scroll bar
and ‘auto’ to add a scroll bar only when needed.
• browser support is uncertain
classification properties VII
• {overflow: } sets what to do when the element
flows out of its box. It takes values
– visible contents may be rendered outside tag box
– hidden contents is clipped, no access to clipped
contents
– scroll contents is clipped but can be reached through
a scroll. Scrollbar will always appear, whether
contents overflows or not. When the medium is 'print'
or 'projection', overflowing content should be printed.
– auto The behavior of the 'auto' value is user agent-dependent,
but should cause a scrolling mechanism
to be provided for overflowing boxes.
example for clipping
<DIV style="width: 100px; height: 100px; border:
thin solid red;">
<BLOCKQUOTE style="width: 125px; height:
100px; margin-top: 50px; margin-left: 50px;
border: thin dashed black">
<P>I didn't like the play, but then I saw it under
adverse conditions - the curtain was up.</P>
<DIV style="text-align: right;">- Groucho
Marx</DIV></BLOCKQUOTE></DIV>
<!-- you can try out the {overflow:} here -->
list properties
• {list-style-position: } can take the value ‘inside’ or
‘outside’. The latter is the default, the property
refers to the position of the list item start marker
• {list-style-image: } define the bullet point of a list
as a graphic, use url(URL) to give the location of
the graphic.
• {list-style-type: }
– takes the values ‘disc’, ‘circle’, ‘square’, ‘none’ with an
unordered list
– takes the values ‘decimal’, ‘lower-roman’, ‘upper-roman’, ‘lower-alpha’, ‘upper-alpha’ with an ordered list.
table properties I
• {border-collapse: } lets you choose the
fundamental table model. It can take two values
– 'separate' implies that each cell has its own box.
– 'collapse' implies that adjacent cells share the same
border
table properties II
• The properties on this slide are only useful if
you choose the separated border model.
• You can set the distance between adjacent cells
using the border-spacing: property. Set it to two
distances to specify different horizontal and
vertical values
• empty-cells: can be set to
– 'show' shows empty cells with their border
– 'hide' does not show the border around an empty cell
• there are some other table properties
classification properties I
• {display: } sets the display type of a tag. It takes
the following values
– 'block' displays the tag contents as a block
– 'inline' displays it as inline contents
– 'list-item' makes it an item of a list, you can then
attach list properties to it
– 'none' does not display it
– 'run-in' (see later)
– 'compact' (see later)
classification properties II
• {display: } also takes the following values
– table
– table-row
– table-cell
– table-caption
– inline-table
– table-footer-group
– table-row-group
– table-column
– table-column-group
– table-header-group
• these mean that the tags behave like the table
elements that we already discussed
run-in box
• If a block box (that does not float and is not
absolutely positioned) follows the run-in box, the
run-in box becomes the first inline box of the
block box.
• Otherwise, the run-in box becomes a block box.
• Example on next page
example for run-in box
<head><title>a run-in box example</title>
<style type="text/css">
h3 { display: run-in }
</style>
</head>
<body> <h3>a run-in heading.</h3> <p>and a
paragraph of text that follows it and it continues
on the line of the h3 </body>
compact box
• If a block-level box follows the compact box, the
compact box is formatted like a one-line inline
box. The resulting box width is compared to one
of the side margins of the block box,
– left margin if direction is left-to-right
– right margin if direction is right-to-left
• If the inline box width is less than or equal to the
margin, the inline box is given a position in the
margin as described immediately below.
• Otherwise, the compact box becomes a block
box.
compact box example
<style type="text/css">
dt { display: compact }
dd { margin-left: 4em }
</style>
<dl> <dt>short <dd><p>description goes here.
<dt>too long for the margin
<dd><p>description goes here.
</dl>
classification properties III
• the {white-space: } property controls the display of
white space in a block level tag.
– 'normal' collapses white space
– 'pre' behaves similarly to the <pre> tag
– 'nowrap' collapses white space but suppresses line breaks
generated contents properties
• generated contents is, for example, the bullet
appearing in front of a list item.
• {content:} can be used with the :before and :after
selectors. Example
• p.note:before {content: "note"} will insert the
string "note" before any paragraph in the class
'note'. The content can be
– a text string
– a url(URL) where the contents is to be found
– an attr(att) where att is the name of the attribute, the
content of which is being inserted
generated contents properties II
• Here are some counter properties
– {counter-reset: counter} resets a counter counter
– {counter-increment: counter} increments a counter
– {counter(counter)} uses the counter
• Example
h1:before {counter-increment: chapter_counter;
counter-reset: section_counter;
content: "Chapter " counter(chapter_counter) ":"}
and then we can use h2 for the sections, of
course!
• browser support uncertain!
Paged media support I
• CSS has the concept of a page box into which
paged output is placed.
• The @page rule can be used to specify the size of
the page
• @page {size: 8.5in 11in}
• Valid values are one or two lengths and the
words ‘portrait’ and ‘landscape’. The latter will
depend on the default print sheet size, which is country-specific.
Paged media support II
• You can add {margin: }, {margin-top: }, {margin-left: }, and {margin-right: } properties. They will
add to the margins that the printer sets by
default, which you will not be able
to control.
• You can add a {marks: crop} property to add
crop marks
• You can add a {marks: cross} property to create
registration marks.
Paged media support III
• You can use three pseudoclasses to specify
special cases
– :first for the first page
– :left for any left page
– :right for any right page
• Example
– @page :first {margin-top: 3in}
Named pages
• You can give a page rule an optional name.
Example
@page rotated { size: landscape}
• Then you can use this with the ‘page’ property to
specify specific ways to print things. Example
table {page: rotated}
will print the table on a landscape sheet. This
comes in handy for bulky tables.
Actual page breaking
• Pages will break if
– the current page box flows over
or if
– a new page format is being used with a {page: }
property
• You can take some control with the {page-break-before: } and {page-break-after: } properties.
They take the values
– auto
– always
– avoid
– left
– right
The latter two make sure that the element is on
a left or right page. Sometimes this will require
two page breaks.
conclusions
• These are not all the properties.
• Audio properties are still missing
• But I am not sure if I should go into more.
Nielsen on site design
• This is the longest of the chapters in his book.
• It is about the organization of sites.
• But the chapter itself is badly organized. It looks
like a Jackson Pollock painting and reads like a
bad student essay
– no structure
– things repeated from before
Nielsen on site design
• Usually there is more attention on page design
than on site design, presumably because
page design is visual.
• But site design is more important.
• A study found that only 42% of users could find
simple answers to questions on a web site.
the home page
• has to be designed differently than other pages.
• must answer the questions
– where am I?
– what does this site do?
• needs a directory of the main areas
• needs a summary of the site purpose
• a principal search feature may be included,
otherwise a link to a search page will do
• you may want to put news, but not prominently
the home page
• making the home page a splash screen is not a
good idea
• the name of the site should be very prominent,
more so than on interior pages, where it should
also be named
• There should be a link to the homepage from all
interior pages, maybe in the logo.
• The less famous a site, the more it has to have
information about the site on interior pages.
• Users should not be "forced" to go through the
home page.
metaphor
• (why does he talk about this here?)
• it is usually not a good idea to have metaphor on
the home page.
• a notable exception: the shopping cart
– has become a standard feature
– but still illustrates some limits of metaphors
• when you want to buy multiple items of the same kind
• when you want to move something out of the cart
why navigation?
• Navigation should address three questions
– where am I?
• relative to the whole web
• relative to the site
• the former dominates, as users only click through 4 to 5
pages on a site
– where have I been?
• but this is mainly the job of the browser, esp. if link colors are
not tampered with
– where can I go?
• this is a matter for site structure
site structure
• to visualize it, you have to have it first. Poor
information architecture will lead to bad usability.
• Some sites have a linear structure,
• but most sites are hierarchically organized.
• Whatever the structure, it has to reflect the
users' tasks, not the company structure.
Nielsen's example company
• A corporate site may be divided into
– product information
• product families
– individual products
– employment information
– investor information
• Now consider a page with configuration and
pricing for SuperWidgets. It may belong to
– company's web site
– products category
– Widgets products
– SuperWidgets
– pricing and configuration
• Nielsen says: show all five levels of navigation. Have
links to WidgetsClassic and MiniWidgets on the
SuperWidgets page.
breadth vs depth in navigation
• some sites list all the top categories on the left or
top
– users are reminded of all that the site has to offer
– stripe can brand a site through a distinctive look
• an alternative is to list the hierarchical path to
the position that the user is in, through the use of
breadcrumbs
– can be done as a one liner
• combining both
– takes up a lot of space
– can be done as an L shape
– recommended for large sites (10k+ pages) with
heterogeneous contents
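A breadcrumb one-liner can be derived from the directory path of the page. The sketch below assumes that each directory name is also a usable label and that every level has a page at its own URL; the function names and the sample path are hypothetical.

```python
def breadcrumb_trail(path):
    """Return (label, url) pairs from the home page down to the current level."""
    segments = [s for s in path.strip("/").split("/") if s]
    trail = [("home", "/")]
    url = "/"
    for segment in segments:
        url += segment + "/"          # each level links to its own directory
        trail.append((segment, url))
    return trail


def render(trail, separator=" > "):
    """Render the trail as a one-line text breadcrumb."""
    return separator.join(label for label, _ in trail)


print(render(breadcrumb_trail("/products/widgets/superwidgets/")))
```

On a real page each label except the last would be wrapped in a link to its URL.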
large volumes of information
• most user interfaces on the web are clones of
the design of the Mac in 1984. They are not
designed to handle vast amounts of information.
Nielsen does not say why.
• Historically, early web pages had long lists of
links
• Nowadays, there is more selective linking
• Users want site maps but they don't seem to be
much help.
reducing navigational clutter
• aggregation shows that a single piece of data is
part of a whole
• summarization represents large amounts of data
by a smaller amount
• filtering is throwing out information that we don't
need
• truncation is having a "more" link on a page
• example-based presentation is just having a few
examples
subsites
• most sites are too large for the mere fact that a page
belongs to them to convey much information.
• therefore subsites can add structure
• a subsite is a bunch of pages with common
appearance and navigational structure, with one
page as the home page.
– each page in the subsite should point to the subsite
home page as well as to global homepage
– should combine global and local navigation
search and link behavior
• Nielsen says that his studies show that slightly
more than 50% of users are search-dominant:
they go straight to the search.
• One in five users is link-dominant. They will only
use the search after extensive looking around
the site through links
• The rest have mixed behavior. They will make
up their mind depending on the task and the
look of the site.
search
• site search should be on all pages
• in general it is not a good idea to scope the
search to the subsite that you are on
– users don't understand the site structure
– users don't understand the scope of the search
• if you have a scoped search
– state the scope in query and results page
– include link to the search of the whole site, in query
and results page "not found? … try to <a>search
entire site</a>"
Boolean searches
• they should be avoided because no one
understands them.
• Example task.
– "you have the following pets:
• cats
• dogs
– find information about your pet"
– users search "cats and dogs" and find nothing.
– geeks or librarians among users will then say "oh, I
should have used OR".
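The pet example can be played through with a toy index. This is a sketch only: the page names and their keywords are invented, and a real search engine does far more than set lookups.

```python
# Hypothetical pages, each with the set of words it is indexed under.
PAGES = {
    "cat-care.html": {"cats", "food", "grooming"},
    "dog-training.html": {"dogs", "training"},
    "contact.html": {"address", "phone"},
}


def search(terms, mode):
    """Return matching page names; mode is "and" or "or"."""
    match = all if mode == "and" else any
    return sorted(name for name, words in PAGES.items()
                  if match(term in words for term in terms))


print(search(["cats", "dogs"], "and"))   # no page mentions both
print(search(["cats", "dogs"], "or"))    # pages about either pet
```

The "and" query comes back empty because no single page is about both pets, which is exactly what surprises the user.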
help the user search
• Nielsen says that computers are good at
remembering synonyms, checking spelling etc,
so they should evaluate the query and make
suggestions on how to improve it.
• but I am not aware of systems that do this "out
of the box".
• use a wide box. Information retrieval research
has shown that users tend to enter more words
in a wider box.
the results page
• computed relevance scores are useless for the
user
• URLs pointing to the same page should be
consolidated
• search should use quality evaluation. For example, if the
query matches the FAQ, the FAQ should be given a
higher ranking.
• [he has other suggestions that are either
unrealistic or would be part of serious
information retrieval research]
metadata
• Nielsen thinks that metadata should be used
because humans are better at saying what the
page is about than machines.
• He recommends writing a description into a <meta> tag
whose "name" attribute has the value 'description'
• He also says you should add keywords, with
your own keywords and those of your
competitors.
• He mentions no engine that uses these…
search destination design
• when the user follows a link from search to a
page, the page should be presented in context
of the user's search
• the most common way is to highlight all the
occurrences of the search terms.
– This helps scanning the destination page.
– Helps understanding why the user reached this result.
– [but will be no good if the term is in the metadata]
URL design
• URLs should not be part of design, but in
practice, they are.
• Leave out the "http://" when referring to your
web page.
• You need a good domain name that is easy to
remember.
understandable URLs
• Users rely on reading URLs when getting an idea
about where they are on the web site.
– all directory names must be human-readable
– they must be words or compound words
• site must support URL butchering where users
remove the trailing part after a slash
• make URLs as short as possible
• use lowercase letters throughout
• avoid special chars i.e. anything but letters or
digits
• stick to one visual word separator, i.e. either
hyphen or underscore
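To see which URLs a site has to answer in order to support URL butchering, you can list what is left after removing one trailing part at a time. A minimal sketch, with a made-up function name and example URL:

```python
from urllib.parse import urlsplit, urlunsplit


def butchered_urls(url):
    """List the URLs a user gets by cutting trailing parts after a slash."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    results = []
    # Drop one trailing segment at a time; every result ends in a slash.
    for keep in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:keep])
        if not path.endswith("/"):
            path += "/"
        results.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return results


for candidate in butchered_urls("http://example.org/products/widgets/super.html"):
    print(candidate)
```

Each URL in the list should return a sensible page, for example an overview of that directory.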
archival URL
• After search engines and email
recommendations, links are the third most
common way for people to come across a web
site.
• Incoming links must not be discouraged by
changing site structures
dealing with yesterday's current contents
• Sometimes it is necessary to have two URLs for
the same contents:
– "todays_news" …
– "feature_2003-12-06"
some may wish to link to the former, others to the
latter
• In this case you should advertise the URL at
which the contents is archived (immediately) and
hope that link providers will link to it there.
• You can put a note on the bottom of the page, or
possibly use a simple convention if it is very easy
to guess.
supporting old URLs
• Old URLs should be kept alive for as long as
possible.
• Best way to deal with them is to set up a http
redirect 301
– good browsers will update bookmarks
– search engines will delete old URLs
• There is also a 302 temporary redirect.
refresh header
• <head><meta http-equiv="refresh" content="0;
url=new_url"> </head>
• This method has a bad reputation because it is
used by search engine spammers. They create
pages with useful keywords, and then the user is
redirected to a page with spam contents.
.htaccess
• If you use Apache, you can create a file
.htaccess (note the dot!) with a line
redirect 301 old_url new_url
• old_url must be a relative path from the top of
your site
• new_url can be any URL, even outside your site
• This works on wotan by virtue of configuration
set for apache for your home directory.
Examples
– redirect 301 /~krichel http://openlib.org/home/krichel
– redirect 301 Cantcook.jpg http://www.foodtv.com
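What the server sends back for such a redirect can be sketched as follows. This is not how Apache does it internally; it only illustrates the shape of a 301 response, and the table of old and new URLs is hypothetical.

```python
# Hypothetical mapping from old paths to new URLs.
REDIRECTS = {
    "/old/page.html": "http://example.org/new/page.html",
}


def redirect_response(request_path):
    """Return the raw text of a 301 response, or None if no redirect applies."""
    target = REDIRECTS.get(request_path)
    if target is None:
        return None
    return ("HTTP/1.1 301 Moved Permanently\r\n"
            f"Location: {target}\r\n"
            "\r\n")


print(redirect_response("/old/page.html"))
```

The browser reads the Location header and requests the new URL without the user doing anything.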
http
• Stands for the hypertext transfer protocol. This is the
most important application layer protocol on the Internet
today, because it provides the foundation for the world
wide web.
• defined in Fielding, Roy T., James Gettys, Jeffrey C.
Mogul, Paul J. Leach, Tim Berners-Lee ``Hypertext
Transfer Protocol -- HTTP/1.1'' (1999), RFC 2616
history
• 1990: version 0.9 allows for transfer of raw data.
• 1996: rfc1945 defines version 1.0, adding
attribute:value headers.
• 1999: rfc 2616 adds support for
– hierarchical proxies,
– caching,
– virtual hosts and
– some support for persistent connections
and is more stringent.
http resource identification
• identification of resources is assumed through
Uniform Resource Identifiers (URI).
• As far as http is concerned, URIs are strings.
• http can use ``absolute'' and ``relative'' URIs.
• A URL is a special case of a URI.
rfc about http
An application-level protocol for distributed, collaborative,
hypermedia information systems.
…
HTTP is also used as a generic protocol for
communication between user agents and
proxies/gateways to other Internet systems, including
those supported by the SMTP, NNTP, FTP, Gopher, and
WAIS protocols. In this way, HTTP allows basic
hypermedia access to resources available from diverse
applications.
overall operation: client side
• Client sends request, required items are
– method
– request URI
– protocol version
• optional items are
– request modifiers
– client information
overall operation: server side
• Server sends response, required items are
– status line
– protocol version
– success or error code
• optional items are
– server information
– body
middleman
• intermediaries come in three flavors
– proxies, i.e. forwarding agents
– gateways, i.e. receiving agents
– tunnels, i.e. relay points that do not change the
message such as an encryption and decryption
device
http assumes transport
• http assumes that there is a reliable way to
transport data from one host on the Internet to
another one.
• http requests and responses travel over TCP
connections. Without persistent connections, each
request and its response use a separate connection.
The default is TCP port 80, but other ports can be used.
Absolute http URL
• the absolute http URL is
http://host[:port][abs_path[?query]]
• If abs_path is empty, it is /.
• The scheme name "http" and the host name are case-insensitive.
• Characters other than those in the ``reserved'' and
``unsafe'' sets of RFC 2396 are equivalent to their
``%HEX HEX'' encoding.
• optional components are in [ ]
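The rules for the optional components can be tried out with the URL parser in the Python standard library. A sketch, with a hypothetical function name and example URLs; it fills in the defaults from this slide (port 80, path /) itself.

```python
from urllib.parse import urlsplit


def normalize_http_url(url):
    """Return (host, port, abs_path, query) with the http defaults applied."""
    parts = urlsplit(url)
    host = parts.hostname            # urlsplit lower-cases the host name
    port = parts.port or 80          # default port when none is given
    abs_path = parts.path or "/"     # an empty abs_path is equivalent to /
    return host, port, abs_path, parts.query


print(normalize_http_url("HTTP://WWW.Example.ORG:8080/a?b=c"))
print(normalize_http_url("http://example.org"))
```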
character sets
• A character set is a method used with one or more tables
to convert a sequence of binary digits into a sequence of
characters.
• http shares the same registry as the MIME multimedia
email extensions. It is based at the IANA, at
http://www.isi.edu/innotes/iana/assignments/media-types/media-types
• The default character set is ISO-8859-1.
http messages
• There are two types of messages.
– Requests are sent from the client to the server.
– Responses are sent from the server to the client.
• The generic format is the same as for email messages:
– start line
– message headers
– empty line
– body
• Empty lines before the start line are ignored.
• The request's start line is called the request-line
• The response start line is called the status-line.
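The generic format can be taken apart with a few string operations. This is a simplified sketch: it assumes CRLF line ends and one header per line (no folded headers), and the function name and sample request are made up.

```python
def parse_http_message(raw):
    """Split a raw message into (start_line, headers, body)."""
    raw = raw.lstrip("\r\n")                    # empty lines before the start line are ignored
    head, _, body = raw.partition("\r\n\r\n")   # the empty line separates headers from body
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


request = "\r\nGET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n"
print(parse_http_message(request))
```

For a request the start line is the request-line; for a response it is the status-line.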
The request headers
• Accept:
• Accept-Charset:
• Accept-Encoding:
• Accept-Language:
• Authorization:
• Expect:
• From:
• Host:
• If-Match:
• If-Modified-Since:
• If-None-Match:
• If-Range:
• If-Unmodified-Since:
• Max-Forwards:
• Proxy-Authorization:
• Range:
• Referer:
• TE:
• User-Agent:
The status line
• The status line is a single line of the
form
• HTTP-Version Status-Code Reason-Phrase
• The status code is a 3-digit number used by the
computer.
• The reason phrase is a friendly note for a human to
read.
Status code classes
• 1 Informational: Request received, continuing process
• 2 Success: The action was successfully received,
understood, and accepted
• 3 Redirection: Further action must be taken in order to
complete the request
• 4 Client Error: The request contains bad syntax or
cannot be understood
• 5 Server error: The request is valid but can not be
executed by the server
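Since the class is given by the first digit, a program can find it with an integer division. A minimal sketch; the function name is made up.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Return the class name for a 3-digit status code."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


print(status_class(404))
print(status_class(301))
```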
Error codes
• 100 Continue
• 101 Switching Protocols
• 200 OK
• 201 Created
• 202 Accepted
• 203 Non-Authoritative Information
• 204 No Content
• 205 Reset Content
• 206 Partial Content
Error codes II
• 300 Multiple Choices
• 301 Moved Permanently
• 302 Found
• 303 See Other
• 304 Not Modified
• 305 Use Proxy
• 307 Temporary Redirect
Error codes III
• 400 Bad Request
• 401 Unauthorized
• 402 Payment Required
• 403 Forbidden
• 404 Not Found
• 405 Method Not Allowed
• 406 Not Acceptable
• 407 Proxy Authentication Required
• 408 Request Time-out
Error codes IV
• 409 Conflict
• 410 Gone
• 411 Length Required
• 412 Precondition Failed
• 413 Request Entity Too Large
• 414 Request-URI Too Large
• 415 Unsupported Media Type
• 416 Requested range not satisfiable
• 417 Expectation failed
Error codes V
• 500 Internal Server Error
• 501 Not Implemented
• 502 Bad Gateway
• 503 Service Unavailable
• 504 Gateway Time-out
• 505 HTTP Version not supported
Response headers
• Accept-Ranges:
• Age:
• ETag:
• Location:
• Proxy-Authenticate:
• Retry-After:
• Server:
• Vary:
• WWW-Authenticate:
Entity headers, common to response and request
• Allow:
• Content-Encoding:
• Content-Language:
• Content-Length:
• Content-Location:
• Content-MD5:
• Content-Range:
• Content-Type:
• Expires:
• Last-Modified:
The body
• The entity-body (if any) sent with an HTTP
request or response is in a format and encoding
defined by the entity-header fields.
• When an entity-body is included with a message,
the data type of that body is determined via the
header fields Content-Type and Content-Encoding
GET and HEAD method
• The GET method means retrieve whatever information (in the form of
an entity) is identified by the Request-URI. If the Request-URI refers
to a data-producing process, it is the produced data which shall be
returned as the entity in the response and not the source text of the
process.
• The HEAD method is identical to GET except that the server MUST
NOT return a message-body in the response.
Conditional & partial GET
• The semantics of the GET method change to a ``conditional GET'' if
the request message includes an
– If-Modified-Since
– If-Unmodified-Since
– If-Match
– If-None-Match
– If-Range header
• The semantics of the GET method change to a ``partial GET'' if the
request message includes a Range header field. A partial GET
requests that only part of the entity be transferred
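How a server might answer a conditional GET with If-Modified-Since can be sketched like this. It covers only that one header, assumes the resource's modification time is known as a timezone-aware datetime, and uses a made-up function name.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(request_headers, last_modified):
    """Return 304 if the resource is unchanged since the client's copy, else 200."""
    value = request_headers.get("If-Modified-Since")
    if value is None:
        return 200                      # plain GET: send the entity
    try:
        since = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return 200                      # unreadable date: treat as a plain GET
    if last_modified <= since:
        return 304                      # Not Modified: no body is sent
    return 200


headers = {"If-Modified-Since": "Sat, 06 Dec 2003 10:00:00 GMT"}
print(conditional_get_status(headers, datetime(2003, 12, 1, tzinfo=timezone.utc)))
```

A 304 answer saves transferring the body, which is the point of the conditional GET.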
The POST method
• The POST method is used to request that the origin server
accept the entity enclosed in the request as a new subordinate
of the resource identified by the Request-URI in the Request-Line.
POST is designed to allow a uniform method to cover the
following functions:
– Annotation of existing resources;
– Posting a message to a bulletin board, newsgroup, mailing list, or
similar group of articles;
– Providing a block of data, such as the result of submitting a form,
to a data-handling process;
– Extending a database through an append operation.
PUT and DELETE methods
• The PUT method requests that the enclosed entity be
stored under the supplied Request-URI. If the Request-URI
refers to an already existing resource, the enclosed
entity should be considered as a modified version of the
one residing on the origin server.
• The DELETE method requests that the origin server
delete the resource identified by the Request-URI.
The Semantic Web
• The W3C has been developing a new architecture that applies
knowledge representation technology to the WWW.
• Using the Resource Description Framework (RDF), Statements are
made using a Subject, Predicate and Object (very similar to Lisp and
other predicate based languages).
• Each Subject, Predicate or Object are Resources in the URI sense
and are identified by URIs within an RDF Statement using XML
Namespaces.
Reading
• ``Information Architecture'' by Louis Rosenfeld
and Peter Morville, O'Reilly 1998
• There is now a second edition, hopefully it is
better
• Contents is very thin, I summarize the whole
book here.
Sensitivity exercise
• What do you hate about a web site?
• What do you like about a web site?
• All issues to do with that fall into three categories
– Technical
– Look and Feel
– Architecture
Reasons to hate a web site
• Can't find it.
• Page crowded
• Loud colours
• Gratuitous use of technology
• Inappropriate tone
• Designer centered
• Lack of attention to detail
Reasons to like a web site
• useful
• attractive to look at
• thought provoking
• findability
• personalisation
Why is it so difficult
• technical expertise
• graphical design expertise
• overall structure
IA determines
• organization
• content
• functionality
– navigation
– labeling
– searching
Good IA is important for the producer
• web site an important point of first contact
• needs to determine overall design before the
site is built
• reorganizing a site is
– costly
– difficult
Topics covered
• classification
• navigation
• labelling
• making a site searchable
The challenge of classification
• ambiguity:
``a tomato is a red or yellowish fruit with a juicy pulp, used as
a vegetable, botanically it is a berry.''
• heterogeneity
– in a library
– on a web site
• granularity
• format
• difference in perspective
• internal politics
Organizational schemes
• Exact schemes
– alphabetical
– chronological
– geographical
• Ambiguous schemes
– topical: should be there, but not the only scheme
– task-oriented
– audience-specific: open or closed
• metaphor-driven: not as overall organization
• Hybrid schemes are not good
The mixed-up library
• adult
• arts and humanities
• community center
• get a library card
• learn about our library
• science
• teen
• youth
Organizational form: hierarchies
• keep balance between breadth and depth
• obey 7 ± 2 rule horizontally,
• no more than 5 levels vertically
• cross-link ambiguous items if really necessary
• keep new sites shallow
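The two numeric rules can be checked mechanically if the site hierarchy is written down as nested dictionaries. This is a sketch under assumptions: the function name and sample site are invented, the home page is counted as level 1, and only the upper bound of 7 ± 2 (nine items) is checked.

```python
def check_hierarchy(tree, max_breadth=9, max_depth=5, path="home", depth=1):
    """Return a list of warnings for a site tree given as nested dicts."""
    warnings = []
    if depth > max_depth:
        warnings.append(f"{path}: level {depth} is deeper than {max_depth}")
    if len(tree) > max_breadth:
        warnings.append(f"{path}: {len(tree)} children, more than {max_breadth}")
    for name, subtree in tree.items():
        warnings.extend(check_hierarchy(subtree, max_breadth, max_depth,
                                        f"{path}/{name}", depth + 1))
    return warnings


# Hypothetical small site: three top-level areas, two product families.
site = {"products": {"widgets": {}, "gadgets": {}}, "jobs": {}, "investors": {}}
print(check_hierarchy(site))
```

An empty list means the tree respects both limits; it says nothing about whether the categories make sense to users.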
organizational forms: hypertext
• great flexibility
• great potential for confusion
• not good as a prime organizational structure
organizational forms: database
• powerful for searching
• useful if there is controlled vocabulary
• easy reorganization
• on the fly or static generation of pages
– but ensure robot indexing
• not good for heterogeneous data
Navigation aids
• provide context
• allow for flexibility of movement
• support associative learning
• danger of overwhelming the user
browser navigation aids
• They include
– open
– back
– forward
– history
– bookmarks
– prospective view
– visited URL color
• sites should not corrupt the browser.
navigation
• the ``you are here'' mark
– pages should indicate site name
– navigation should be consistent
– navigation should not link to the current page
– highlight current page in a different way
• allow for lateral navigation
Types of navigational systems
• global hierarchical navigation systems
– text
– icon
• local navigation systems: integration with global
system can be challenging
• ad hoc navigation: clear labels are required
Frames are problematic
• potential waste of page real estate
• speed of display
• disrupt the page model
• complex design
remote navigation system I
• table of contents
– good in a hierarchical web site
– reinforce the hierarchy
– facilitate known-item access
– resist temptation to overwhelm user
• indexes
– present key terms without hierarchy
– key terms found from search behavior
– link terms to final destination pages
– use term rotation
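Term rotation lists a multi-word term under each of its significant words, so that "library card" can also be found under "card". The following is a minimal sketch of that idea; the function name `rotate_terms`, the small stop-word list, and the "word, rest" entry format are assumptions chosen for illustration.

```python
# Words that should not start an index entry (a hypothetical, minimal list).
STOP_WORDS = {"a", "an", "and", "of", "the", "our", "to"}


def rotate_terms(terms):
    """Return sorted (index entry, original term) pairs.

    Each term is entered once per significant word, with that word
    moved to the front, e.g. "library card" -> "card, library".
    """
    entries = set()
    for term in terms:
        words = term.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue
            if i == 0:
                entries.add((term, term))
            else:
                rotated = " ".join(words[i:]) + ", " + " ".join(words[:i])
                entries.add((rotated, term))
    return sorted(entries, key=lambda entry: entry[0].lower())


for entry, term in rotate_terms(["library card", "community center"]):
    print(f"{entry}  ->  {term}")
```

In a real index each original term would carry the link to its destination page, so every rotated entry leads to the same place.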
remote navigation systems II
• site maps
– are a graphical representation of the site's contents
– new because no equivalent in print
– there are automated tools to generate site maps
– seldom well done
– to be kept simple
• guided tours
– important for sites with restricted access
– should feature linear navigation
labelling
• a label is a short expression that represents a
larger set of information.
• example: ``contact us''
• labelling is an outgrowth of site organization,
which we discussed previously.
• labelling communicates the organization of the
site
Why bother
• we need to guess at how users respond to a
label
• users will not spend much time interpreting the
label
• appropriate tone, no ``hot'', ``cool'', ``stuff''
• should reflect thinking of the user, not of the
owner
• it is easy to have unplanned labelling
Good labelling
• Sticking with the familiar
– main, main page, home, home page
– search, find
– browse
– contact, contact us, feedback
– Help, FAQ, Frequently Asked Questions
– About, About Us
• Labels may be augmented with scope notes
Grammatical consistency
• contact us, search our site, browse our content
• contact, search, browse
• contact information, search page, table of
contents
• (also good in student essays)
Labels as indexing terms
• use in <meta> tags, or in the <title> tag
• use as controlled vocabulary in the database
• but some search engines, in fact almost all, do
not use metadata
Textual labels
Born in Völklingen (Saarland) in 1965, I studied Economics and Social
Sciences at the universities of Toulouse, Paris, Exeter and Leicester.
Between February 1993 and April 2001 I lectured in the Department
of Economics at the University of Surrey. In 1993 I founded NetEc, a
consortium of Internet projects for academic economists. In 1997, I
founded the RePEc dataset to document Economics. Between
October and December 2000, I held a visiting professorship at
Hitotsubashi University.
labels as headings
• good practice:
– consistency in terminology: wording on labels is
uniform and cohesive
– consistency in granularity
• chunks covered by labels at the same level are roughly equal in size
• chunks covered do not vary by their depth
Iconic labels
• There is only a limited ``vocabulary'' of
commonly understood labels
• it is fine for some key concepts
• labels need to be very consistently placed
• they can communicate a graphic identity for the
page
• they are easy to find on a page, provided that
page is not long
Designing labelling systems I
• start from existing one
– put in table or tree (on paper)
– make small changes towards consistency
• ``benevolent plagiarism'' from competitors and
academic sites
• use controlled vocabularies, example yellow
pages
Designing labelling systems II
• use a thesaurus, example legislative indexing
vocabulary
– ``see'' links
– ``see also'' links
– broader terms
– narrower terms
• labels from contents: best judged by an outsider
• labels from query logs
• labels from user interviews
• labels from modeling user needs
fine tuning a labelling system
• remove duplicates
• sort alphabetically
• homogenize case and punctuation and grammar
• remove synonyms according to audience
• make labels as different from one another as
possible
• search for gaps
• look into the future
• keep scope focussed
• consider granularity
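The first three steps of this checklist (removing duplicates, sorting alphabetically, homogenizing case and punctuation) are mechanical and can be scripted. The following is a minimal sketch under assumed conventions: the function name `tidy_labels` is hypothetical, labels are put in sentence case, all-uppercase words such as FAQ are left alone, and trailing punctuation is dropped. Synonyms, gaps, scope and granularity still need human judgement.

```python
def tidy_labels(labels):
    """Normalize case and punctuation, drop duplicates, sort alphabetically."""
    seen = {}
    for label in labels:
        # Collapse whitespace and strip trailing punctuation.
        cleaned = " ".join(label.split()).rstrip(".!:;,")
        if not cleaned:
            continue
        # Lowercase every word except acronyms written in capitals.
        words = [w if w.isupper() else w.lower() for w in cleaned.split()]
        text = " ".join(words)
        text = text[0].upper() + text[1:]
        # Labels that differ only in case count as duplicates.
        seen.setdefault(text.casefold(), text)
    return sorted(seen.values(), key=str.casefold)


raw = ["contact us", "Contact Us.", "search", "FAQ", "  Search ", "home page"]
print(tidy_labels(raw))
```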
why not make a site searchable
• not a tool to satisfy all users' needs
• not good on poor contents
• not a cure for bad browsing!
• needs good planning
why make a site searchable
• cope with bad organization (Foyle's)
• dynamic contents
• large contents
user needs
• some want overview, others want detail
• some need accuracy, others don't care much
• some can wait, others need it now
• some need some info, others need a
comprehensive answer
user's searching expectation
• known-item searching
• existence searching
• exploratory searching
• comprehensive searching
integrated searching and browsing
• literature deals with separate browsing and
searching systems
• browsing and searching in a single system
• with multiple iteration
• and associative learning takes place
designing search interfaces I
• level of expertise
– boolean?
– concept search?
• amount returned
– comprehensive?
– verbose?
• how much to make searchable
designing search interfaces II
• search target
– navigation pages?
– HTML only?
• are there specific types of data that users will
want multi-lingual?
• audience difference
features of sophisticated search engines
• fielded searches
• sophisticated query languages
• reusable results set
• customizable relevance
Deal with problems
• getting too much: suggest boolean AND
• getting nothing: suggest boolean OR or
truncation
• bad answers: suggest contacting an expert, or
maybe not...
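To see why these suggestions help, it is useful to look at how AND, OR and truncation change the result set. The following is a minimal sketch over a tiny, hypothetical set of pages; the page names, the `search` function and the trailing `*` as truncation marker are assumptions for illustration and do not describe any particular search engine.

```python
import re

# Hypothetical page texts, keyed by file name.
PAGES = {
    "hours.html": "library opening hours and holiday closing",
    "card.html": "get a library card",
    "teen.html": "teen books and events",
}


def tokens(text):
    """Split text into a set of lowercase words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def term_matches(term, words):
    """Match a whole word, or a prefix when the term ends in '*'."""
    if term.endswith("*"):
        stem = term[:-1]
        return any(word.startswith(stem) for word in words)
    return term in words


def search(pages, terms, mode="AND"):
    """Return the pages matching all terms (AND) or any term (OR)."""
    combine = all if mode == "AND" else any
    terms = [term.lower() for term in terms]
    hits = []
    for name, text in pages.items():
        words = tokens(text)
        if combine(term_matches(term, words) for term in terms):
            hits.append(name)
    return sorted(hits)


print(search(PAGES, ["library", "card"], "AND"))  # narrows the result
print(search(PAGES, ["library", "card"], "OR"))   # widens the result
print(search(PAGES, ["open*"]))                   # truncation finds "opening"
```

AND shrinks the result set because every term must be present, while OR and truncation enlarge it, which is why they are the natural suggestions for "too much" and "nothing" respectively.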
The Semantic Web
• The combination of Web Services and the Semantic
Web should give the Web the ability to turn any existing
Web Resource into a full node in a purposefully built
knowledge representation system with a functional
component that allows that knowledge to be acted on.
• And both are based on the simple Uniform Resource
Identifier.
example
• This statement says that the Resource identified by the URI
‘http://openlib.org/home/krichel’ was created by the person ‘Thomas
Krichel’:
<?xml version="1.0"?>
<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Description about="http://openlib.org/home/krichel">
    <Creator xmlns="http://description.org/schema/">Thomas Krichel</Creator>
  </Description>
</RDF>
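A statement of this kind can be read back as a (resource, property, value) triple. The following is a minimal sketch using Python's standard XML parser rather than a dedicated RDF library; the function name `statements` is hypothetical, the embedded document uses the creator named in the prose statement, and the code assumes the simple, unprefixed `about` attribute shown on the slide.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The statement from the slide, with the creator named in the prose.
RDF_XML = """<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Description about="http://openlib.org/home/krichel">
    <Creator xmlns="http://description.org/schema/">Thomas Krichel</Creator>
  </Description>
</RDF>"""


def statements(rdf_xml):
    """Return (resource URI, qualified property name, value) triples."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get("about")
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local".
            triples.append((subject, prop.tag, (prop.text or "").strip()))
    return triples


for subject, prop, value in statements(RDF_XML):
    print(subject, prop, value)
```

Every part of the triple except the literal value is identified by a URI, which is the point the slide makes about Uniform Resource Identifiers.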
http://openlib.org/home/krichel
Thank you for your attention!