Search Engine Marketing

301 Redirect:
How Do I Love You, Let Me Count the Ways
presented by Stephan Spencer,
Founder & President, Netconcepts
© 2009 Stephan M Spencer Netconcepts www.netconcepts.com [email protected]
Time To Drink From the Firehose!
 No need to take furious notes though. (Phew!)
 Download this Powerpoint right now from
www.netconcepts.com/learn/301-redirect.ppt
Let’s Go Under the Hood with 301s
 In .htaccess (or httpd.conf), you can redirect individual
URLs, the contents of directories, or entire domains:
– Redirect 301 /old_url.htm http://www.example.com/new_url.htm
– Redirect 301 /old_dir/ http://www.example.com/new_dir/
– Redirect 301 / http://www.example.com/
 Pattern matching can be done with RedirectMatch 301
– RedirectMatch 301 ^/(.+)/index\.html$ http://www.example.com/$1/
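To see what the RedirectMatch pattern above does and does not catch, here is a minimal Python sketch that mimics it. The helper name redirect_match is made up for illustration, and Python's re module is assumed to behave like Apache's regex engine for a pattern this simple.

```python
import re
from typing import Optional


def redirect_match(pattern: str, target: str, url_path: str) -> Optional[str]:
    """Mimic RedirectMatch: return the redirect target, or None if no match."""
    match = re.search(pattern, url_path)
    if match is None:
        return None
    # Replace Apache-style back-references ($1, $2, ...) with captured text.
    return re.sub(r"\$(\d)", lambda ref: match.group(int(ref.group(1))) or "", target)


PATTERN = r"^/(.+)/index\.html$"
TARGET = "http://www.example.com/$1/"

print(redirect_match(PATTERN, TARGET, "/blog/2009/index.html"))
print(redirect_match(PATTERN, TARGET, "/index.html"))
```

Note that the second call returns None: because the pattern uses .+ between two slashes, a root-level /index.html is not redirected by this rule.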
301 Redirects via Rewrite Rules
 My preference is to use Apache’s mod_rewrite module
and set up rewrite rules that use the [R=301] flag. Or, if you’re on
Microsoft IIS Server, use the ISAPI_Rewrite plugin.
 The rewrite rules go in either .htaccess or your Apache
config file (e.g. httpd.conf, sites_conf/…)
– Precede all the rewrite rules with the line “RewriteEngine on”
– If within .htaccess, also add another line “RewriteBase /” (never
add it to the server config). In .htaccess the leading slash is
stripped before matching, so rules start with “^” rather than “^/”
An Example Rewrite Rule
 A simple example for httpd.conf
– RewriteRule ^(.*)/index\.html$ $1/ [R=301,L]
 Store stuff in memory with () then access via variable $1
(in httpd.conf the captured path already starts with “/”)
 A rough equivalent for .htaccess
– RewriteBase /
– RewriteRule ^(.*)/?index\.html$ /$1/ [R=301,L]
 Ah, but there’s an error with the rule immediately above.
Hint: “.*” is “greedy”
The Magic of Regular Expressions
 You need to become a master of pattern matching
– * means 0 or more of the immediately preceding character
– + means 1 or more of the immediately preceding character
– ? means 0 or 1 occurrence of the immediately preceding char
– ^ means the beginning of the string, $ means the end of it
– . means any character (i.e. wildcard)
– \ “escapes” the character that follows, e.g. \. means dot
– [ ] is for character ranges, e.g. [A-Za-z]
– ^ inside [] brackets means “not”, e.g. [^/]
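A quick way to get a feel for these metacharacters is to try them on tiny strings. The sketch below uses Python's re module, which is assumed to treat these basic constructs the same way mod_rewrite does. The sample strings are invented for illustration.

```python
import re

# (pattern, sample string, should it match?)
CASES = [
    (r"^ab*c$", "ac", True),          # * allows zero b's
    (r"^ab+c$", "ac", False),         # + needs at least one b
    (r"^ab?c$", "abbc", False),       # ? allows at most one b
    (r"^a.c$", "a/c", True),          # . matches any single character
    (r"^a\.c$", "a/c", False),        # \. matches only a literal dot
    (r"^[A-Za-z]+$", "Shoes", True),  # character range of letters
    (r"^[^/]+$", "a/b", False),       # ^ inside [] negates: no slash allowed
]


def matches(pattern: str, text: str) -> bool:
    """Return True when the pattern matches somewhere in the text."""
    return re.search(pattern, text) is not None


for pattern, text, expected in CASES:
    print(pattern, repr(text), matches(pattern, text), "expected", expected)
```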
Regular Expression Errors
 Incredibly easy to make errors in regular expressions
 When debugging, RewriteLog and RewriteLogLevel
(4+) are your friends! (In Apache 2.4 and later, use
“LogLevel rewrite:trace4” instead.)
 Back to the previous example...
– RewriteRule ^(.*)/?index\.html$ /$1/ [L,R=301]
 What’s the problem? .* is greedy and so it will capture
the “/” within memory
– http://www.example.com/blah/index.html redirects to
http://www.example.com/blah//
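The double slash is easy to reproduce outside Apache. This is a minimal Python sketch of the rule as it would run in .htaccess (the path has no leading slash there). The rewrite helper is a made-up stand-in for mod_rewrite's substitution step, and the lazy variant is one possible fix rather than the only one.

```python
import re
from typing import Optional


def rewrite(pattern: str, substitution: str, path: str) -> Optional[str]:
    """Apply one rewrite rule; return the new path, or None if no match."""
    match = re.search(pattern, path)
    if match is None:
        return None
    # Replace Apache-style back-references ($1, $2, ...) with captured text.
    return re.sub(r"\$(\d)", lambda ref: match.group(int(ref.group(1))) or "", substitution)


GREEDY = r"^(.*)/?index\.html$"   # the rule from the slide
LAZY = r"^(.*?)/?index\.html$"    # same rule with a non-greedy capture

print(rewrite(GREEDY, "/$1/", "blah/index.html"))  # /blah//
print(rewrite(LAZY, "/$1/", "blah/index.html"))    # /blah/
```

With the greedy version, (.*) swallows the slash and the optional /? then matches nothing, so $1 is “blah/”. The lazy version stops at “blah” and leaves the slash to /?.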
Regular Expression Gotchas
 “Greedy” expressions. Use [^ or .*? instead of .*
– e.g [^/]+/[^/] instead of .*/.*
– e.g ^(.*?)/ instead of ^(.*)/
 .* can match on nothing. Use .+ instead
– e.g. .+/ instead of .*/
 Unintentional substring matches because ^ or $ wasn’t
specified or . was used for a dot instead of \.
– e.g. ^/default\.htm$ instead of /default.htm
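The last two gotchas can be checked with a few sample paths. This is a small sketch using Python's re module, assumed to behave like Apache's regex engine for patterns this simple. The sample paths are invented.

```python
import re

LOOSE = r"/default.htm"       # no anchors, unescaped dot
STRICT = r"^/default\.htm$"   # anchored, dot escaped

for path in ["/default.htm", "/old/defaultxhtml", "/default.html"]:
    loose_hit = re.search(LOOSE, path) is not None
    strict_hit = re.search(STRICT, path) is not None
    print(path, "loose:", loose_hit, "strict:", strict_hit)

# .* is happy to match nothing at all, .+ insists on at least one character.
print(re.search(r"^.*/$", "/") is not None)  # True
print(re.search(r"^.+/$", "/") is not None)  # False
```

The loose pattern matches all three sample paths, including the two that were never meant to be redirected. The strict one matches only /default.htm.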
Let’s Go Deeper Down the Rabbit Hole
 A more complex example
– RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
– RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
 [NC] flag makes the rewrite condition case-insensitive
 [L] flag stops processing further rules, which saves on server processing
 [QSA] flag not needed. The query string is carried over
automatically. Don’t want the query string maintained? Put ? at
the end of the destination URL in the rule.
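The logic of the condition plus rule can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not how mod_rewrite is implemented. The function name canonical_redirect is hypothetical, and a Host header carrying a port number is not handled.

```python
import re
from typing import Optional

# Equivalent of the RewriteCond pattern with the [NC] flag.
CANONICAL_HOST = re.compile(r"^www\.example\.com$", re.IGNORECASE)


def canonical_redirect(host: str, path: str, query: str = "") -> Optional[str]:
    """Return the 301 target, or None when the host is already canonical."""
    if CANONICAL_HOST.search(host):
        return None  # condition "!^www..." fails, so the rule does not fire
    target = "http://www.example.com/" + path.lstrip("/")
    # The original query string rides along automatically.
    return target + ("?" + query if query else "")


print(canonical_redirect("example.com", "/shoes", "color=red"))
print(canonical_redirect("WWW.Example.com", "/shoes"))
```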
Speaking of Tracking Parameters
 Here’s how to 301 static URLs with a tracking param
appended to their canonical equivalents (minus the param)
– RewriteCond %{QUERY_STRING} ^source=[a-z0-9]*$
– RewriteRule ^(.*)$ /$1? [L,R=301]
 And for dynamic URLs...
– RewriteCond %{QUERY_STRING} ^(.+)&source=[a-z0-9]+(&?.*)$
– RewriteRule ^(.*)$ /$1?%1%2 [L,R=301]
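To check which query strings the two conditions above actually catch, here is a small Python sketch that applies the same two patterns. The function name strip_source and the sample URLs are made up for illustration.

```python
import re
from typing import Optional

ONLY_SOURCE = re.compile(r"^source=[a-z0-9]*$")
TRAILING_SOURCE = re.compile(r"^(.+)&source=[a-z0-9]+(&?.*)$")


def strip_source(path: str, query: str) -> Optional[str]:
    """Return the canonical URL to 301 to, or None if neither rule fires."""
    if ONLY_SOURCE.search(query):
        return path  # the trailing "?" in the rule drops the whole query string
    match = TRAILING_SOURCE.search(query)
    if match:
        # %1 is everything before &source=..., %2 is whatever follows it.
        return path + "?" + match.group(1) + match.group(2)
    return None


print(strip_source("/shoes.html", "source=email1"))
print(strip_source("/product.php", "id=42&source=email1&page=2"))
print(strip_source("/product.php", "source=email1&id=42"))
```

The last call returns None: a source parameter that comes first and is followed by other parameters is not matched by either pattern, so that case would need a third rule.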
More Fun with Tracking Parameters
 Need to do some fancy stuff with cookies before 301ing?
Invoke a script that cookies the user then 301s them to
the canonical URL.
– RewriteCond %{QUERY_STRING} ^source=([a-z0-9]*)$
– RewriteRule ^(.*)$ /cookiefirst.php?source=%1&dest=$1 [L]
 Note the lack of an R=301 flag above. That’s on purpose.
No need to expose this script to the user. Use a rewrite
and let the script send the 301 after it’s done its work.
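The slide does not show what the cookie script does internally, so the following is only a guess at its shape, written in Python rather than PHP. The cookie name “source”, the function name, and the choice to force the destination onto the same site are all assumptions.

```python
from http.cookies import SimpleCookie
from typing import List, Tuple


def cookie_then_redirect(source: str, dest: str) -> Tuple[int, List[Tuple[str, str]]]:
    """Build the status code and headers: set the cookie, then 301."""
    cookie = SimpleCookie()
    cookie["source"] = source      # hypothetical cookie name
    cookie["source"]["path"] = "/"
    # Keep the redirect on this site: force exactly one leading slash.
    location = "/" + dest.lstrip("/")
    headers = [
        ("Set-Cookie", cookie["source"].OutputString()),
        ("Location", location),
    ]
    return 301, headers


status, headers = cookie_then_redirect("email1", "shoes.html")
print(status, headers)
```

Normalizing the destination this way means a crafted dest value such as //evil.example cannot turn the script into an open redirect.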
301 Retired Legacy URLs
 Got legacy dynamic URLs you’re trying to phase out
after switching to static URLs? 301 them...
– RewriteCond %{QUERY_STRING} id=([0-9]+)
– RewriteRule ^get_product\.php$ /products/%1.html? [L,R=301]
 Switching to keyword URLs and the script can’t do
anything with the keywords if passed as params? Use
RewriteMap and have a lookup table as a text file.
– RewriteMap prodmap txt:/home/someusername/prodmap.txt
– RewriteRule ^/product/([0-9]+)$ ${prodmap:$1} [L,R=301]
301 Retired Legacy URLs
 What would the lookup table for the above rule look like?
– 1001 /products/canon-g10-digital-camera
– 1002 /products/128-gig-ipod-classic
 DBM files are supported too. Faster than text file.
 You could use a script that takes the requested input and
delivers back its corresponding output.
– RewriteMap prodmap prg:/home/someusername/mapscript.pl
– RewriteRule ^/product/([0-9]+)$ ${prodmap:$1} [L,R=301]
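A prg: map program reads one lookup key per line on standard input and must answer each with one line on standard output, writing the string NULL when there is no match and flushing after every answer. The slide's script is in Perl and is not shown. Below is a minimal Python sketch of the same idea, reusing the two sample entries from the lookup table above. The function names are made up.

```python
import io
from typing import Dict, Iterable, TextIO

SAMPLE_TABLE = """\
1001 /products/canon-g10-digital-camera
1002 /products/128-gig-ipod-classic
"""


def load_map(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'key value' lines, skipping blank lines and # comments."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, value = line.split(None, 1)
        table[key] = value
    return table


def serve(table: Dict[str, str], stdin: TextIO, stdout: TextIO) -> None:
    """Answer each key with its value, or NULL when the key is unknown."""
    for key in stdin:
        stdout.write(table.get(key.strip(), "NULL") + "\n")
        stdout.flush()  # Apache waits for each answer, so never buffer


# Demo with in-memory streams standing in for Apache's pipes.
demo_out = io.StringIO()
serve(load_map(SAMPLE_TABLE.splitlines()), io.StringIO("1001\n9999\n"), demo_out)
print(demo_out.getvalue())
```

In a real map program you would call serve with sys.stdin and sys.stdout and load the table from a file or database instead of the inline sample.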
Canonicalization
 Non-www and typo domains
– (The example mentioned earlier...)
– RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
– RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
 HTTPS
– (If you have a separate secure server, you can skip this first line)
– RewriteCond %{HTTPS} on
– RewriteRule ^catalog/(.*) http://www.example.com/catalog/$1 [L,R=301]
Canonicalization
 If trailing slash is missing, add it
– RewriteRule ^(.*[^/])$ /$1/ [L,R=301]
– WordPress handles this by default. Yay WordPress!
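It is worth checking what the trailing-slash pattern catches before deploying it. The Python sketch below applies the same pattern to a few invented .htaccess-style paths (no leading slash). The function name is hypothetical.

```python
import re
from typing import Optional

ADD_SLASH = re.compile(r"^(.*[^/])$")


def add_trailing_slash(path: str) -> Optional[str]:
    """Return the slash-terminated redirect target, or None if no match."""
    match = ADD_SLASH.search(path)
    return "/" + match.group(1) + "/" if match else None


for path in ["about", "about/", "page.html"]:
    print(repr(path), "->", add_trailing_slash(path))
```

As written, the pattern also fires for page.html and would redirect it to /page.html/. On a site that serves real files, a common safeguard is to put a condition such as RewriteCond %{REQUEST_FILENAME} !-f in front of the rule.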
Iterative URL Optimization
 When iteratively optimizing a page’s URL, 301 all
previous iterations directly to the latest iteration. Don’t
daisy chain 301s.
– WordPress handles this beautifully, and by default
– Tip: Use Netconcepts’ “SEO Title Tag” plugin to mass edit all
your permalink post URLs and let WordPress handle the
301s automagically. But don’t then “set it and forget it”.
Continue optimizing the URLs iteratively over time to
maximize search traffic.
If You’re on Microsoft IIS Server
 ISAPI_Rewrite is not that different from mod_rewrite
 Rewrite rules go in the httpd.ini file
 Precede the first rewrite rule with “[ISAPI_Rewrite]”
 Capitalization and IIS’ case insensitivity w.r.t. URLs
– RewriteRule (.*) http://www.example.com$1 [I,RP,L]
 Non-www and typo domains
– RewriteCond Host: (?!www\.example\.com)
– RewriteRule (.*) http://www.example.com$1 [I,RP,L]
More IIS Examples
 Drop the default
– RewriteRule (.*)/default\.htm $1/ [I,RP,L]
 Add trailing slash if it’s missing
– RewriteCond Host: (.*)
– RewriteRule ([^.?]+[^.?/]) http\://$1$2/ [I,RP,L]
Conditional Redirects?
 Risky territory! Read Redirects: Good, Bad & Conditional
 To selectively redirect bots that request URLs with
session IDs to the URL sans session ID:
– RewriteCond %{QUERY_STRING} PHPSESSID
RewriteCond %{HTTP_USER_AGENT} Googlebot.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^msnbot.* [OR]
RewriteCond %{HTTP_USER_AGENT} Slurp [OR]
RewriteCond %{HTTP_USER_AGENT} Ask\ Jeeves
RewriteRule ^(.*)$ /$1? [R=301,L]
 browscap.ini provides spiders’ user agents
Conditional Redirects Not Necessary
 Almost always another way (w/o using user agent or IP)
 In the above example, simply 301 everybody – bots and
humans alike – and stop appending PHPSESSID
– See http://yoast.com/phpsessid-url-redirect/ for more on this.
– If you have to keep session IDs for functionality reasons, you
could use a script to detect whether the session has
expired, and 301 the URL to the canonical equivalent if it has.
 Matt Cutts will be talking about this topic tomorrow in the
“Ask the Search Engines” session. Don’t miss it!
Capture PageRank on Dead Pages
 Traditional approach is to serve up a 404, which drops that
obsolete URL out of the index, squandering that URL’s link juice.
 But what if you 301 redirect to something valuable (e.g. your
home page or the category page one level up) and dynamically
include a small error notice?
 Or return a 200 status code instead, so that the spiders follow
the links on the error page? Then include a meta robots noindex
so the error page itself doesn’t get indexed.
 IMPORTANT: Don’t respond to garbage (nonsense) URLs with
anything but a 404 status code. Googlebot looks for this!
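The decision described above boils down to three cases: live pages, retired pages with a sensible new home, and everything else. Here is a minimal Python sketch of that decision. The URL sets and the function name are invented for illustration, and a real site would look these up in its own catalog.

```python
from typing import Optional, Tuple

# Hypothetical data: pages that exist, and retired pages with a new home.
LIVE = {"/", "/products/"}
RETIRED = {"/products/old-camera.html": "/products/"}


def respond(path: str) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target or None) for a request path."""
    if path in LIVE:
        return 200, None
    if path in RETIRED:
        # A known dead page: pass its link value on to the category page.
        return 301, RETIRED[path]
    # Garbage URLs must still get a genuine 404.
    return 404, None


for path in ["/products/", "/products/old-camera.html", "/asdf1234"]:
    print(path, respond(path))
```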
Thanks!
 This Powerpoint can be downloaded from
www.netconcepts.com/learn/301-redirect.ppt
 For a 180-minute screencast (including 90 minutes
of Q&A) on SEO for large dynamic websites –
including transcripts – email [email protected]
 Questions after the show? Email me at
[email protected]