CS193H: High Performance Web Sites Class 1

CS193H:
High Performance Web Sites
Lecture 8: Rule 4 – Gzip Components
Steve Souders
Google
Announcements
Web 100 Performance Profile (round 1) class project has been graded. Contact Aravind if you want to know your grade.
Compression (encoding)
Request:
GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate

Response without compression:
HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 6230

function d(s) {...

Response with gzip:
HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip

XmoÛHþ\ÿFÖvã*wØoq...

gzip typically reduces size by about 70%
here: (6230-2066)/6230 = 67%
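The savings figure above can be reproduced with a few lines of Python. This is only a sketch: the helper name savings_percent and the sample payload are assumptions for illustration, not the real Blogger script.

```python
import gzip


def savings_percent(original_size: int, compressed_size: int) -> float:
    """Return the size reduction as a percentage of the original size."""
    return 100.0 * (original_size - compressed_size) / original_size


# Content-Length values taken from the two responses on the slide.
print(round(savings_percent(6230, 2066)))  # 67

# Hypothetical payload: repetitive JavaScript-like text, not the real file.
script = b"function d(s) { return document.getElementById(s); }\n" * 100
compressed = gzip.compress(script)
print(len(script), len(compressed))
```

Real scripts are less repetitive than this sample, so the ratio printed for the hypothetical payload will be better than the roughly 70% seen in practice.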
Gzip vs. Deflate
Type        Original  Gzip size  Gzip savings  Deflate size  Deflate savings
Script      3.3K      1.1K       67%           1.1K          66%
Script      39.7K     14.5K      64%           16.6K         58%
Stylesheet  1.0K      0.4K       56%           0.5K          52%
Stylesheet  14.1K     3.7K       73%           4.7K          67%
gzip (default settings) compresses more
Pros and Cons
Pro:
smaller transfer size
Con:
CPU cycles – on client and server
Don't compress resources < 1K
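One reason for the size threshold is that the gzip container adds a fixed header and trailer, so very small payloads can grow instead of shrink. A minimal sketch of that effect, where the function name gzip_delta and both payloads are made up for illustration:

```python
import gzip


def gzip_delta(payload: bytes) -> int:
    """Bytes saved by gzip; negative when the output is larger than the input."""
    return len(payload) - len(gzip.compress(payload))


tiny = b"ok"                                          # hypothetical 2-byte response
large = b"body { margin: 0; padding: 0; }\n" * 200    # hypothetical stylesheet

print(gzip_delta(tiny))    # negative: the gzip framing outweighs the payload
print(gzip_delta(large))   # positive: repetitive text compresses well
```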
Gzip configuration
Apache 1.3: mod_gzip
mod_gzip_item_include file \.html$
mod_gzip_item_include mime ^text/html$
mod_gzip_item_include file \.js$
mod_gzip_item_include mime ^application/x-javascript$
mod_gzip_item_include file \.css$
mod_gzip_item_include mime ^text/css$
Apache 2.x: mod_deflate
AddOutputFilterByType DEFLATE text/html text/css application/x-javascript
control compression level: DeflateCompressionLevel
http://httpd.apache.org/docs/2.0/mod/mod_deflate.html
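What these modules do for each request can be sketched in Python: inspect Accept-Encoding, compress if the client allows it, and label the response with Content-Encoding. This is an illustrative approximation, not how mod_gzip or mod_deflate are implemented; accepts_gzip and encode_response are hypothetical helper names.

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """True if the Accept-Encoding value lists gzip with a non-zero q-value."""
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def encode_response(body: bytes, content_type: str, accept_encoding: str = ""):
    """Return (headers, body), gzipping the body only when the client accepts it."""
    headers = {"Content-Type": content_type}
    if accepts_gzip(accept_encoding):
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return headers, body


headers, body = encode_response(b"a" * 2000, "text/css", "gzip,deflate")
print(headers)
```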
Gzip: not just for HTML
Site (October 2007 / March 2008)               HTML  Scripts     Stylesheets
amazon.com / aol.com                           x     x           x
aol.com / ebay.com                             x     some        some
cnn.com / facebook.com                         x     x           x
ebay.com / google.com/search                   x     x           na
froogle.google.com / search.live.com/results   x     x           x
msn.com                                        x     deflate, x  deflate, x
myspace.com                                    x     x           x
wikipedia.org / en.wikipedia.org/wiki          x     some, x     some, x
yahoo.com                                      x     x           x
youtube.com                                    x     some, x     some, x
(the slide overlays the October 2007 and March 2008 surveys; where a cell lists two values, both appear on the slide)
gzip scripts, stylesheets, XML, JSON (not images, Flash, PDF)
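The guidance on what to compress can be written as a small predicate. A sketch only: the list of media types and the 1024-byte cutoff are assumptions drawn from the slides ("not images, Flash, PDF" and "don't compress resources < 1K"), and should_gzip is a hypothetical name.

```python
# Text formats the slides recommend compressing (assumed media type spellings).
COMPRESSIBLE_TYPES = {
    "text/html",
    "text/css",
    "application/x-javascript",
    "application/javascript",
    "text/xml",
    "application/xml",
    "application/json",
}
MIN_SIZE = 1024  # assumed cutoff for "resources < 1K"


def should_gzip(content_type: str, size: int) -> bool:
    """Gzip only text-like types that are large enough to benefit."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in COMPRESSIBLE_TYPES and size >= MIN_SIZE


print(should_gzip("text/html; charset=utf-8", 20000))  # True
print(should_gzip("image/png", 20000))                 # False
print(should_gzip("text/css", 400))                    # False
```

Images, Flash and PDF files are already compressed, so running gzip over them costs CPU without reducing size.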
Edge Case: Proxies
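Browser -> Proxy -> Origin Server
1. browser with gzip support -> proxy: GET main.js, Accept-Encoding: gzip
2. proxy -> origin server: GET main.js, Accept-Encoding: gzip
3. origin server -> proxy: main.js, Content-Encoding: gzip
4. proxy caches main.js (Content-Encoding: gzip)
5. proxy -> browser: main.js, Content-Encoding: gzip
6. browser without gzip support -> proxy: GET main.js (no Accept-Encoding)
7. proxy -> browser: main.js, Content-Encoding: gzip (from cache)
proxies may serve gzipped content to browsers that don't support it, and vice versa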
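The failure is easy to reproduce with a toy cache keyed by URL alone. The sketch below is a simulation under that assumption; NaiveProxy, origin and BODY are hypothetical names, and real proxies are considerably more involved.

```python
import gzip

BODY = b"function main() { return 1; }\n" * 50  # hypothetical main.js


def origin(accept_encoding: str):
    """Hypothetical origin: gzips when asked, and sends no Vary header."""
    if "gzip" in accept_encoding:
        return {"Content-Encoding": "gzip"}, gzip.compress(BODY)
    return {}, BODY


class NaiveProxy:
    """Caches by URL only, ignoring the request headers."""

    def __init__(self):
        self.cache = {}

    def get(self, url: str, accept_encoding: str = ""):
        if url not in self.cache:
            self.cache[url] = origin(accept_encoding)
        return self.cache[url]


proxy = NaiveProxy()
proxy.get("/main.js", "gzip")            # gzip-capable browser fills the cache
headers, body = proxy.get("/main.js")    # browser without gzip support
print(headers)  # {'Content-Encoding': 'gzip'}
```

The second browser never sent Accept-Encoding, yet it receives the gzipped bytes that were cached for the first one.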
Edge Case: Proxies w/ Vary
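Browser -> Proxy -> Origin Server
1. browser with gzip support -> proxy: GET main.js, Accept-Encoding: gzip
2. proxy -> origin server: GET main.js, Accept-Encoding: gzip
3. origin server -> proxy: main.js, Content-Encoding: gzip, Vary: Accept-Encoding
4. proxy caches main.js (Content-Encoding: gzip) for [Accept-Encoding: gzip]
5. proxy -> browser: main.js, Content-Encoding: gzip
6. browser without gzip support -> proxy: GET main.js (no Accept-Encoding)
7. proxy -> origin server: GET main.js (no Accept-Encoding)
8. origin server -> proxy: main.js (no gzip), Vary: Accept-Encoding
9. proxy caches main.js (no gzip) for [Accept-Encoding: ]
10. proxy -> browser: main.js (no gzip)
11. browser with gzip support -> proxy: GET main.js, Accept-Encoding: gzip
12. proxy -> browser: main.js, Content-Encoding: gzip (from cache)
13. browser without gzip support -> proxy: GET main.js (no Accept-Encoding)
14. proxy -> browser: main.js (no gzip) (from cache)
add Vary: Accept-Encoding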
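A cache that honors Vary keeps one entry per distinct value of the listed request headers. The following simulation sketches that idea; VaryAwareProxy, origin and origin_hits are hypothetical names, and the header matching is simplified (exact, case-sensitive comparison).

```python
import gzip

BODY = b"function main() { return 1; }\n" * 50  # hypothetical main.js


def origin(request_headers: dict):
    """Hypothetical origin that always sends Vary: Accept-Encoding."""
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in request_headers.get("Accept-Encoding", ""):
        headers["Content-Encoding"] = "gzip"
        return headers, gzip.compress(BODY)
    return headers, BODY


class VaryAwareProxy:
    def __init__(self):
        self.vary = {}        # url -> header names listed in Vary
        self.cache = {}       # (url, header values) -> response
        self.origin_hits = 0

    def _key(self, url: str, request_headers: dict):
        names = self.vary.get(url, [])
        return url, tuple(request_headers.get(name, "") for name in names)

    def get(self, url: str, request_headers: dict):
        key = self._key(url, request_headers)
        if url in self.vary and key in self.cache:
            return self.cache[key]
        self.origin_hits += 1
        response = origin(request_headers)
        vary = response[0].get("Vary", "")
        self.vary[url] = [n.strip() for n in vary.split(",") if n.strip()]
        self.cache[self._key(url, request_headers)] = response
        return response


proxy = VaryAwareProxy()
gz, plain = {"Accept-Encoding": "gzip"}, {}
proxy.get("/main.js", gz)                        # miss, cached for gzip clients
proxy.get("/main.js", plain)                     # miss, cached for plain clients
again_gzip = proxy.get("/main.js", gz)           # served from cache
again_plain = proxy.get("/main.js", plain)       # served from cache
print(proxy.origin_hits)  # 2
```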
Edge Case: Bad Browsers
< 1% of browsers have problems with gzip
IE 5.5:
http://support.microsoft.com/default.aspx?scid=kb;en-us;Q313712
IE 6.0:
http://support.microsoft.com/default.aspx?scid=kb;en-us;Q31249
Netscape 3.x, 4.x
http://www.schroepl.net/projekte/mod_gzip/browser.htm
User-Agent white list for gzip
Apache 1.3:
mod_gzip_item_include reqheader "User-Agent: MSIE [6-9]"
mod_gzip_item_include reqheader "User-Agent: Mozilla/[5-9]"
Apache 2.0:
BrowserMatch "MSIE [6-9]" gzip
BrowserMatch ^Mozilla/[5-9] gzip
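The same white list can be expressed as a short Python check, which makes it easy to try against sample User-Agent strings. This is a sketch mirroring the patterns above; browser_gets_gzip is a hypothetical name and the sample strings are illustrative.

```python
import re

# Patterns mirroring the white list on the slide.
GZIP_WHITELIST = [
    re.compile(r"MSIE [6-9]"),
    re.compile(r"^Mozilla/[5-9]"),
]


def browser_gets_gzip(user_agent: str) -> bool:
    """True if the User-Agent matches any white-listed pattern."""
    return any(pattern.search(user_agent) for pattern in GZIP_WHITELIST)


samples = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/2008070208 Firefox/3.0.1",
    "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)",
    "Mozilla/4.79 [en] (Windows NT 5.0; U)",
]
for ua in samples:
    print(browser_gets_gzip(ua), ua)
```

Internet Explorer identifies itself as Mozilla/4.0 with "MSIE" inside the parentheses, which is why the MSIE pattern is not anchored to the start of the string.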
Edge Case: Bad Browsers (cont'd)
proxies could mix up responses
give cached response from useragent1 to useragent2
could add Vary: User-Agent
so many possibilities, defeats proxy caching
better to add Cache-Control: private
downside: disables all proxy caches
is it a serious problem?
hard to diagnose; problem getting smaller
Edge Case: ETags
what happens when proxy makes Conditional GET requests?
Last-Modified date for gzipped vs. ungzipped is different => If-Modified-Since works fine
ETag is the same in Apache for gzipped & ungzipped => If-None-Match succeeds, proxy could give browser mismatched content
remove ETags! (Rule 13)
http://issues.apache.org/bugzilla/show_bug.cgi?id=39727
Edge Case: ETags present
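Browser -> Proxy -> Origin Server
1. browser with gzip support -> proxy: GET main.js, Accept-Encoding: gzip
2. proxy -> origin server: GET main.js, Accept-Encoding: gzip
3. origin server -> proxy: main.js, Content-Encoding: gzip, Cache-Control: max-age=0, ETag: "de158-e58-c7ee4140"
4. proxy caches main.js (Content-Encoding: gzip, Cache-Control: max-age=0, ETag: "de158-e58-c7ee4140")
5. proxy -> browser: main.js, Content-Encoding: gzip
6. browser without gzip support -> proxy: GET main.js (no Accept-Encoding)
7. proxy -> origin server: GET main.js, If-None-Match: "de158-e58-c7ee4140"
8. origin server -> proxy: 304 Not Modified
9. proxy -> browser: main.js, Content-Encoding: gzip
proxy gives browser mismatched content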
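The sequence can be simulated with a toy origin that uses one ETag for both the gzipped and the uncompressed variant. Everything here is an illustrative assumption (origin, RevalidatingProxy, BODY); only the ETag value is copied from the slide.

```python
import gzip

BODY = b"function main() { return 1; }\n" * 50  # hypothetical main.js
ETAG = '"de158-e58-c7ee4140"'  # same validator for both variants


def origin(request_headers: dict):
    """Hypothetical origin: answers 304 whenever the ETag matches."""
    if request_headers.get("If-None-Match") == ETAG:
        return 304, {"ETag": ETAG}, b""
    headers = {"ETag": ETAG, "Cache-Control": "max-age=0"}
    if "gzip" in request_headers.get("Accept-Encoding", ""):
        headers["Content-Encoding"] = "gzip"
        return 200, headers, gzip.compress(BODY)
    return 200, headers, BODY


class RevalidatingProxy:
    def __init__(self):
        self.cache = {}  # url -> (headers, body)

    def get(self, url: str, request_headers: dict):
        cached = self.cache.get(url)
        if cached is None:
            _, headers, body = origin(request_headers)
            self.cache[url] = (headers, body)
            return headers, body
        headers, body = cached
        # max-age=0: revalidate with the cached ETag before reusing the entry.
        conditional = {**request_headers, "If-None-Match": headers["ETag"]}
        status, new_headers, new_body = origin(conditional)
        if status == 304:
            return headers, body
        self.cache[url] = (new_headers, new_body)
        return new_headers, new_body


proxy = RevalidatingProxy()
proxy.get("/main.js", {"Accept-Encoding": "gzip"})
headers, body = proxy.get("/main.js", {})  # browser without gzip support
print(headers.get("Content-Encoding"))  # gzip
```

Because the 304 confirms the cached entry, the proxy hands the gzipped copy to a browser that never asked for gzip.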
Edge Case: ETags removed
Browser -> Proxy -> Origin Server
1. browser with gzip support -> proxy: GET main.js, Accept-Encoding: gzip
2. proxy -> origin server: GET main.js, Accept-Encoding: gzip
3. origin server -> proxy: main.js, Content-Encoding: gzip, Cache-Control: max-age=0, Last-Modified: Thu, 21 Aug 2008 23:53:57 GMT
4. proxy caches main.js (Content-Encoding: gzip, Cache-Control: max-age=0, Last-Modified: Thu, 21 Aug 2008 23:53:57 GMT)
5. proxy -> browser: main.js, Content-Encoding: gzip
6. browser without gzip support -> proxy: GET main.js (no Accept-Encoding)
7. proxy -> origin server: GET main.js, If-Modified-Since: Thu, 21 Aug 2008 23:53:57 GMT
8. origin server -> proxy: main.js, Cache-Control: max-age=0, Last-Modified: Fri, 22 Aug 2008 09:43:15 GMT
9. proxy caches main.js (Cache-Control: max-age=0, Last-Modified: Fri, 22 Aug 2008 09:43:15 GMT)
10. proxy -> browser: main.js (no gzip)
removing ETags avoids the problem
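Without ETags the proxy revalidates by date, and the two variants carry different Last-Modified values, so the origin returns a full response instead of a 304. A minimal sketch of the origin side, using the dates from the slide; origin, VARIANT_DATES and BODY are hypothetical names.

```python
import gzip
from email.utils import parsedate_to_datetime

BODY = b"function main() { return 1; }\n" * 50  # hypothetical main.js

# Last-Modified per variant, as shown on the slide.
VARIANT_DATES = {
    True: "Thu, 21 Aug 2008 23:53:57 GMT",   # gzipped variant
    False: "Fri, 22 Aug 2008 09:43:15 GMT",  # uncompressed variant
}


def origin(request_headers: dict):
    """Hypothetical origin that validates with Last-Modified only."""
    wants_gzip = "gzip" in request_headers.get("Accept-Encoding", "")
    last_modified = VARIANT_DATES[wants_gzip]
    since = request_headers.get("If-Modified-Since")
    if since and parsedate_to_datetime(last_modified) <= parsedate_to_datetime(since):
        return 304, {}, b""
    headers = {"Last-Modified": last_modified, "Cache-Control": "max-age=0"}
    if wants_gzip:
        headers["Content-Encoding"] = "gzip"
        return 200, headers, gzip.compress(BODY)
    return 200, headers, BODY


# The proxy first fetches the gzipped variant, then revalidates on behalf of a
# browser that sent no Accept-Encoding, using the cached Last-Modified date.
_, cached_headers, _ = origin({"Accept-Encoding": "gzip"})
status, headers, body = origin({"If-Modified-Since": cached_headers["Last-Modified"]})
print(status, headers.get("Content-Encoding"))  # 200 None
```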
Edge Case Fixes
Vary: Accept-Encoding
aol.com
x
ebay.com
x
facebook.com
x
google.com/search
search.live.com/results
ETag
x
x (IIS)
x
x
msn.com
myspace.com
Cache-Control: private
x (IIS)
x (IIS)
x
en.wikipedia.org/wiki
x (Apa)
x (Apa)
yahoo.com
x
youtube.com
x
Vary: User-Agent – not used
some
Survey dates: October 2007 and March 2008
Homework
"Improving Top Site" class project:
• add improvements for Rule 4
• measure improvements using Hammerhead
• record results in your personal Web 100 sheet
read Chapter 5 of HPWS for 10/17
Questions
How much are file sizes typically reduced by using gzip compression?
What types of resources (images, scripts, etc.) should not be compressed?
For the resource types that should be compressed, should they always be compressed?
How do you prevent proxies from serving gzipped resources to browsers that don't support gzip?
How can ETags cause proxies to serve mismatched content to browsers?