Web Server Design
Week 3
Old Dominion University
Department of Computer Science
CS 495/595 Spring 2010
Martin Klein <[email protected]>
1/27/10
Entity Tags: “ETag”
> telnet www.cs.odu.edu 80 | tee one.dat
Trying 128.82.4.2...
Connected to xenon.cs.odu.edu.
Escape character is '^]'.
GET /~mklein/index.html HTTP/1.1
Connection: close
Host: www.cs.odu.edu
HTTP/1.1 200 OK
Date: Wed, 27 Jan 2010 20:28:36 GMT
Server: Apache/2.2.14 (Unix) DAV/2 PHP/5.2.11
Last-Modified: Wed, 13 Jan 2010 17:55:23 GMT
ETag: "64371b-54b-47d0f797c18d9"
Accept-Ranges: bytes
Content-Length: 1355
Connection: close
Content-Type: text/html
<html>
<head><title>Martin Klein -- Old Dominion University</title></head>
<body>
[lots of html deleted]
Connection closed by foreign host.
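The same response can be picked apart programmatically. The following is a minimal sketch in Python, not part of the original slides: `parse_response_head` is a hypothetical helper name, and the raw text is copied from the session above. It separates the status line from the header fields and looks up the ETag.

```python
def parse_response_head(raw: str):
    """Split a raw HTTP response head into (status_line, headers).

    Header names are lower-cased because HTTP field names are
    case-insensitive.
    """
    lines = raw.strip().splitlines()
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers


RAW = """HTTP/1.1 200 OK
Date: Wed, 27 Jan 2010 20:28:36 GMT
Server: Apache/2.2.14 (Unix) DAV/2 PHP/5.2.11
Last-Modified: Wed, 13 Jan 2010 17:55:23 GMT
ETag: "64371b-54b-47d0f797c18d9"
Accept-Ranges: bytes
Content-Length: 1355
Connection: close
Content-Type: text/html
"""

status, headers = parse_response_head(RAW)
print(status)
print(headers["etag"])  # the value keeps its surrounding double quotes
```

Note that the double quotes are part of the entity tag value, so they are kept rather than stripped.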
What is an “Entity”?
section 1.3:
entity
The information transferred as the payload of a request or
response. An entity consists of metainformation in the form of
entity-header fields and content in the form of an entity-body, as
described in section 7.
section 7:
An entity
consists of entity-header fields and an entity-body, although some
responses will only include the entity-headers.
Looking at a Request/Response
(as much as we have learned so far)
Request: GET /~mklein/index.html HTTP/1.1
Response: HTTP/1.1 200 OK
Request Headers:
Host: www.cs.odu.edu
Response Headers:
Server: Apache...
General Headers:
Date: Wed, 27 Jan 2010 15:04:47 GMT
Connection: close
Entity Headers:
Content-Length: 0
Content-Type: text/html
...
CRLF
[ message-body ]
Entity Headers Are (Mostly) a Subset of Response Headers
7.1 Entity Header Fields
Entity-header fields define metainformation about the entity-body or,
if no body is present, about the resource identified by the request.
Some of this metainformation is OPTIONAL; some might be REQUIRED by
portions of this specification.
entity-header  = Allow                    ; Section 14.7
               | Content-Encoding         ; Section 14.11
               | Content-Language         ; Section 14.12
               | Content-Length           ; Section 14.13
               | Content-Location         ; Section 14.14
               | Content-MD5              ; Section 14.15
               | Content-Range            ; Section 14.16
               | Content-Type             ; Section 14.17
               | Expires                  ; Section 14.21
               | Last-Modified            ; Section 14.29
               | extension-header
extension-header = message-header
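To see how the header classes sort out in practice, here is a small sketch in Python (not from the slides). The sets follow the header lists in RFC 2616 sections 4.5, 5.3, 6.2 and 7.1; `classify` is a hypothetical helper, and treating every unknown field as an extension-header is a simplification.

```python
# Header classes as listed in RFC 2616 (names lower-cased for lookup).
GENERAL = {"cache-control", "connection", "date", "pragma", "trailer",
           "transfer-encoding", "upgrade", "via", "warning"}
REQUEST = {"accept", "accept-charset", "accept-encoding", "accept-language",
           "authorization", "expect", "from", "host", "if-match",
           "if-modified-since", "if-none-match", "if-range",
           "if-unmodified-since", "max-forwards", "proxy-authorization",
           "range", "referer", "te", "user-agent"}
RESPONSE = {"accept-ranges", "age", "etag", "location", "proxy-authenticate",
            "retry-after", "server", "vary", "www-authenticate"}
ENTITY = {"allow", "content-encoding", "content-language", "content-length",
          "content-location", "content-md5", "content-range", "content-type",
          "expires", "last-modified"}


def classify(name: str) -> str:
    """Return the header class for a field name (case-insensitive)."""
    key = name.strip().lower()
    for label, names in (("general", GENERAL), ("request", REQUEST),
                         ("response", RESPONSE), ("entity", ENTITY)):
        if key in names:
            return label
    return "extension"  # unknown fields fall under extension-header


for field in ("Date", "Host", "Server", "ETag", "Last-Modified", "Content-Type"):
    print(f"{field}: {classify(field)}")
```

One thing this makes visible: ETag is listed among the response headers, while Last-Modified is an entity header.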
Section 3.11 - Entity Tags
• ETags used in
– request headers
– response headers
An entity tag consists of an opaque quoted string, possibly prefixed by
a weakness indicator.
[…]
A "strong entity tag" MAY be shared by two entities of a resource
only if they are equivalent by octet equality.
A "weak entity tag," indicated by the "W/" prefix, MAY be shared by
two entities of a resource only if the entities are equivalent and
could be substituted for each other with no significant change in
semantics. A weak entity tag can only be used for weak comparison.
An entity tag MUST be unique across all versions of all entities
associated with a particular resource. A given entity tag value MAY
be used for entities obtained by requests on different URIs. The use
of the same entity tag value in conjunction with entities obtained by
requests on different URIs does not imply the equivalence of those
entities.
Opaqueness
• A string / tag / pointer / data structure whose
semantics / implementation are hidden/local
• Q: What does “1c52-14ed-42992d1d” mean?
– A: it doesn’t matter…
• Examples:
– ATM & credit card magnetic stripes
– Hotel & flight reservation codes
– HTTP cookies
Section 13.3.3
Weak and Strong Validators
Entity tags are normally "strong validators," but the protocol
provides a mechanism to tag an entity tag as "weak." One can think of
a strong validator as one that changes whenever the bits of an entity
changes, while a weak value changes whenever the meaning of an entity
changes. Alternatively, one can think of a strong validator as part
of an identifier for a specific entity, while a weak validator is
part of an identifier for a set of semantically equivalent entities.
Note: One example of a strong validator is an integer that is
incremented in stable storage every time an entity is changed.
An entity's modification time, if represented with one-second
resolution, could be a weak validator, since it is possible that
the resource might be modified twice during a single second.
Support for weak validators is optional. However, weak validators
allow for more efficient caching of equivalent objects; …
strong = exact match; weak = “good enough” match
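The "exact match" versus "good enough match" distinction maps onto the two comparison functions that RFC 2616 section 13.3.3 defines. A minimal sketch in Python, assuming well-formed tags; `EntityTag`, `parse_etag`, `strong_match` and `weak_match` are hypothetical names, not part of any library:

```python
from typing import NamedTuple


class EntityTag(NamedTuple):
    weak: bool    # True if the tag carried the W/ prefix
    opaque: str   # the quoted string, without the surrounding quotes


def parse_etag(value: str) -> EntityTag:
    """Parse '"abc"' or 'W/"abc"' into an EntityTag."""
    value = value.strip()
    weak = value.startswith("W/")
    if weak:
        value = value[2:]
    if len(value) < 2 or not (value.startswith('"') and value.endswith('"')):
        raise ValueError(f"not a quoted entity tag: {value!r}")
    return EntityTag(weak, value[1:-1])


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: identical opaque strings and neither tag is weak."""
    x, y = parse_etag(a), parse_etag(b)
    return not x.weak and not y.weak and x.opaque == y.opaque


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: identical opaque strings; W/ prefixes are ignored."""
    return parse_etag(a).opaque == parse_etag(b).opaque


print(strong_match('"xyzzy"', '"xyzzy"'))      # True
print(strong_match('W/"xyzzy"', 'W/"xyzzy"'))  # False
print(weak_match('W/"xyzzy"', '"xyzzy"'))      # True
```

The opaque string is only ever compared for equality, never interpreted, which is the point of opaqueness.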
Common Hash Functions
• Variable length input, fixed length output
• Can’t be reversed
– small changes in input, large changes in output
• MD5
– http://www.ietf.org/rfc/rfc1321.txt
• SHA-1
– http://www.w3.org/PICS/DSig/SHA1_1_0.html
(mln-web:~) mklein% cat aaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
(mln-web:~) mklein% cat aba
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaabaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
(mln-web:~) mklein% md5sum aaa
8655d873149db8d79106de20d1e89ffc aaa
(mln-web:~) mklein% md5sum aba
aafb727b9c4729a80694d6c16dfa92be aba
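The same experiment can be repeated with Python's standard `hashlib` module. This is a sketch with made-up inputs (100 bytes of "a", then the same with one byte changed to "b"); it does not try to reproduce the exact files or digests shown above.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as 32 hex characters."""
    return hashlib.md5(data).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 128 digest bits differ between two inputs."""
    x = int.from_bytes(hashlib.md5(a).digest(), "big")
    y = int.from_bytes(hashlib.md5(b).digest(), "big")
    return bin(x ^ y).count("1")


original = b"a" * 100
changed = b"a" * 50 + b"b" + b"a" * 49  # same length, one byte differs

print(md5_hex(original))
print(md5_hex(changed))
print(differing_bits(original, changed), "of 128 bits differ")
```

Both digests have the same fixed length regardless of input size, and a one-byte change typically flips a large share of the output bits.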
Possible Approaches
• Strong:
– md5(entitybody+entityheaders)
• Weak:
– md5(entitybody)
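A minimal sketch of these two approaches in Python. The function names, the way headers are serialized before hashing, and the use of the W/ prefix on the body-only tag are illustrative assumptions; the slide only names the inputs.

```python
import hashlib


def strong_etag(entity_headers: dict, entity_body: bytes) -> str:
    """Strong tag: MD5 over the entity body plus the entity headers."""
    h = hashlib.md5()
    h.update(entity_body)
    # Normalize names and sort so that header order and case do not
    # change the tag.
    for name, value in sorted((k.lower(), v) for k, v in entity_headers.items()):
        h.update(f"{name}: {value}\r\n".encode("latin-1"))
    return f'"{h.hexdigest()}"'


def weak_etag(entity_body: bytes) -> str:
    """Weak tag: MD5 over the entity body only, marked with W/."""
    return f'W/"{hashlib.md5(entity_body).hexdigest()}"'


body = b"<html><body>hello</body></html>"
as_html = {"Content-Type": "text/html", "Content-Length": str(len(body))}
as_text = {"Content-Type": "text/plain", "Content-Length": str(len(body))}

print(strong_etag(as_html, body))
print(strong_etag(as_text, body))
print(weak_etag(body))
```

With this scheme, changing only an entity header (here Content-Type) changes the strong tag but leaves the weak tag untouched.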
How Does Apache Do It?
• A configurable function with default inputs of
(inode, size, modification time):
– http://httpd.apache.org/docs/2.2/mod/core.html#fileetag
– Direct relationship to three parts of:
ETag: "1c52-14ed-42992d1d"
– ?? Probably, but look in the Apache source code to be
sure
• let’s run a test…
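Before testing, here is a rough sketch of what such a function could look like, written in Python. It assumes, without having checked the Apache source, that each of the three parts is rendered in lowercase hexadecimal and that the modification time is counted in microseconds; `apache_style_etag` is a hypothetical name.

```python
import os
import tempfile


def apache_style_etag(path: str) -> str:
    """Build an "inode-size-mtime" tag from file metadata.

    Assumption: hex fields, mtime in microseconds since the epoch.
    """
    st = os.stat(path)
    mtime_us = st.st_mtime_ns // 1000  # nanoseconds -> microseconds
    return f'"{st.st_ino:x}-{st.st_size:x}-{mtime_us:x}"'


# Demo on a throwaway 21-byte file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 21)
    demo_path = f.name

demo_etag = apache_style_etag(demo_path)
print(demo_etag)
os.remove(demo_path)
```

Because the tag depends only on file metadata, it is cheap to compute (no need to read the file), but it changes whenever the file is touched, even if the bytes stay the same.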
Black Box Test
request:
(mln-web:~) mklein% telnet www.cs.odu.edu 80
Trying 128.82.4.2...
Connected to xenon.cs.odu.edu.
Escape character is '^]'.
HEAD /~mklein/teaching/cs595-s10/etag-test/foo.txt HTTP/1.1
Connection: close
Host: www.cs.odu.edu
HTTP/1.1 200 OK
Date: Wed, 27 Jan 2010 18:26:37 GMT
Server: Apache/2.2.14 (Unix) DAV/2 PHP/5.2.11
Last-Modified: Wed, 27 Jan 2010 18:09:48 GMT
ETag: "102398-15-47e294ed23307"
Accept-Ranges: bytes
Content-Length: 21
Connection: close
Content-Type: text/plain
% cat .htaccess
FileETag INode MTime Size
HTTP/1.1 200 OK
Date: Wed, 27 Jan 2010 18:27:21 GMT
Server: Apache/2.2.14 (Unix) DAV/2 PHP/5.2.11
Last-Modified: Wed, 27 Jan 2010 18:09:48 GMT
ETag: "102398-47e294ed23307"
Accept-Ranges: bytes
Content-Length: 21
Connection: close
Content-Type: text/plain
% cat .htaccess
FileETag INode MTime
HTTP/1.1 200 OK
Date: Wed, 27 Jan 2010 18:28:06 GMT
Server: Apache/2.2.14 (Unix) DAV/2 PHP/5.2.11
Last-Modified: Wed, 27 Jan 2010 18:09:48 GMT
ETag: "102398"
Accept-Ranges: bytes
Content-Length: 21
Connection: close
Content-Type: text/plain
% cat .htaccess
FileETag INode
HTTP/1.1 200 OK
Date: Wed, 27 Jan 2010 18:28:36 GMT
Server: Apache/2.2.14 (Unix) DAV/2 PHP/5.2.11
Last-Modified: Wed, 27 Jan 2010 18:09:48 GMT
Accept-Ranges: bytes
Content-Length: 21
Connection: close
Content-Type: text/plain
% cat .htaccess
FileETag None
(contd)
(original) ETag: "102398-15-47e294ed23307"
HTTP/1.1 200 OK
Date: Wed, 27 Jan 2010 18:30:18 GMT
Server: Apache/2.2.14 (Unix) DAV/2 PHP/5.2.11
Last-Modified: Wed, 27 Jan 2010 18:30:04 GMT
ETag: "102398-15-47e29974ce300"
Accept-Ranges: bytes
Content-Length: 21
Connection: close
Content-Type: text/plain
% touch foo.txt
HTTP/1.1 200 OK
Date: Wed, 27 Jan 2010 18:31:08 GMT
Server: Apache/2.2.14 (Unix) DAV/2 PHP/5.2.11
Last-Modified: Wed, 27 Jan 2010 18:30:52 GMT
ETag: "102398-19-47e299a294f00"
Accept-Ranges: bytes
Content-Length: 25
Connection: close
Content-Type: text/plain
% echo "bar" >> foo.txt
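The observations are consistent with the three parts being inode, size and modification time in hexadecimal. The sketch below (Python, not from the slides) decodes an observed tag under that reading, with the further assumption that the last field counts microseconds since the epoch; `decode_etag` is a hypothetical helper.

```python
from email.utils import formatdate


def decode_etag(etag: str):
    """Split an "inode-size-mtime" tag into three integers (hex fields)."""
    inode_hex, size_hex, mtime_hex = etag.strip().strip('"').split("-")
    return int(inode_hex, 16), int(size_hex, 16), int(mtime_hex, 16)


# Tag from the first HEAD response in the test.
inode, size, mtime_us = decode_etag('"102398-15-47e294ed23307"')

print("inode:", inode)
print("size:", size)  # compare with Content-Length
# Interpret the third field as microseconds and render it as an HTTP date.
print("mtime:", formatdate(mtime_us // 1_000_000, usegmt=True))
```

The size field decodes to 21, matching Content-Length: 21, and after `echo "bar" >> foo.txt` the field 19 decodes to 25. Under the microsecond assumption, the third field corresponds to the Last-Modified value of the same response.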