Search Engine Marketing – Transcript

Unraveling URLs and Demystifying Domains
presented by Stephan Spencer, Founder & President, Netconcepts
© 2008 Stephan M Spencer Netconcepts www.netconcepts.com [email protected]
Subdomains vs. Subdirectories
 Matt's/Google's announcement – they'd essentially treat
them as the same
(www.mattcutts.com/blog/subdomains-and-subdirectories/)
 You shouldn't treat subdomains as a means of creating
tons of easy thin-content microsites. They're being
viewed as subdirectories. Yes, use them for managing
your website and doing load balancing. No, don't use
them purely for SEO reasons.
Microsites
 Can be bad for your SEO if overly numerous or if they
contain substantial amounts of duplicate content
(merely changing the UI doesn’t count)
 Can be good when you’ll get more link love
– Hypothetical example: stayinghealthy.com vs.
stayinghealthy.metlife.com
 Can also be beneficial in terms of demographic
targeting and focused keyword targeting
Keywords in URLs
 Beneficial in Google whether they appear in
filename/directory/subdirectory names or as variable
values in query strings.
 In other search engines, it's more important that the
keyword be in the filename/directory/subdirectory. And the
closer the keyword(s) are to the root domain name,
apparently the more weight they will lend.
 Just because a keyword is bolded in the SERP doesn’t
mean it’s given extra weight in the ranking algo.
Word Separators in URLs
 Hyphens are the best. Preferred over underscores.
– Historically to Google underscores were not word separators
– Bare spaces cannot be used in URLs. The character-encoded
equivalents for a space are + or %20 (e.g.
blue%20widgets.htm). Regardless, the hyphen is preferred.
 Too much of a good thing looks like keyword stuffing
– Aim for fewer than a half dozen words (i.e. <5 hyphens)
– See my Matt Cutts interview
(stephanspencer.com/search-engines/matt-cutts-interview)
URL Stability
 An annually recurring feature, like a Holiday Gift Buying
Guide, should have a stable URL
– When the current edition is to be retired and replaced with a
new edition, assign a new URL to the archived edition
 Otherwise link juice earned over time is not carried over
to future years’ editions
Domain Age and Expiry
 Crusty old domains (and crusty old sites) are more
trusted by Google, as alluded to in Google’s
"Information retrieval based on historical data” patent
– Parked domains aren't as trusted, so get a real site live to start the clock running.
 Number of years that your domain name has before
expiring may very well be a big quality indicator.
– Suggest increasing the registration period for your domain so
the expiration date will be further in the future
– Particularly for newer domains
Domain Age and Expiry
– Domainers have been known to do “tasting” (i.e.
registering domains for just a couple of days to see what
keyword traffic they get)
– Google just announced that they'll stop displaying AdSense
ads on domain tasting sites as a measure to try to fight the
practice
(www.informationweek.com/news/showArticle.jhtml?articleID=205918984)
Rewriting Your Spider-Unfriendly URLs
 3 approaches:
1) Use a “URL rewriting” server module / plugin – such as
mod_rewrite for Apache, or ISAPI_Rewrite for IIS Server
2) Recode your scripts to extract variables out of the “path_info”
part of the URL instead of the “query_string”
3) Or, if IT department involvement must be minimized, use a
proxy server based solution (e.g. Netconcepts' GravityStream)
– With (1) and (2), replace all occurrences of your old URLs in
links on your site with your new search-friendly URLs. 301
redirect the old to new URLs too, so no link juice is lost.
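 A minimal .htaccess sketch of that last step (the widgets/product.php URL patterns here are hypothetical – adjust to your own site):
– RewriteEngine on
– RewriteRule ^widgets/([0-9]+)\.htm$ /product.php?id=$1 [L]
– RewriteCond %{THE_REQUEST} \?id=([0-9]+)
– RewriteRule ^product\.php$ /widgets/%1.htm? [R=301,L]
– The first rule quietly serves the friendly URL from the old script; the last two 301 requests for the old dynamic URL over to the friendly one (matching %{THE_REQUEST} avoids a redirect loop, and the trailing ? drops the old query string)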
Let’s Geek Out!
URL Rewriting – Under the Hood
 If running Apache, place “rules” within .htaccess or your
Apache config file (e.g. httpd.conf, sites_conf/…)
– RewriteEngine on
– RewriteBase /
– RewriteRule ^products/([0-9]+)/?$ /get_product.php?id=$1 [L]
– RewriteRule ^([^/]+)/([^/]+)\.htm$ /webapp/wcs/stores/servlet/ProductDisplay?storeId=10001&catalogId=10001&langId=-1&categoryID=$1&productID=$2 [QSA,P,L]
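– e.g. the first RewriteRule quietly serves /products/123 from /get_product.php?id=123, and the second maps a (hypothetical) /outerwear/parka.htm to the long ProductDisplay servlet URL with categoryID=outerwear and productID=parka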
URL Rewriting – Under the Hood
 The magic of regular expressions / pattern matching
– * means 0 or more of the immediately preceding character
– + means 1 or more of the immediately preceding character
– ? means 0 or 1 occurrence of the immediately preceding char
– ^ means the beginning of the string, $ means the end of it
– . means any character (i.e. wildcard)
– \ “escapes” the character that follows, e.g. \. means dot
– [ ] is for character ranges, e.g. [A-Za-z]
– ^ inside [ ] brackets means “not”, e.g. [^/]
URL Rewriting – Under the Hood
– () puts whatever is wrapped within it into memory
– Access what’s in memory with $1 (what’s in first set of
parens), $2 (what’s in second set of parens), and so on
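 Putting the pieces together – a hypothetical illustrative rule (news.php and the URL pattern are made up):
– RewriteRule ^news/([0-9]+)/([^/]+)\.htm$ /news.php?id=$1&slug=$2 [L]
– ^ and $ anchor the whole path, ([0-9]+) captures one or more digits into $1, ([^/]+) captures everything up to the next slash into $2, and \. matches a literal dot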
 Regular expression gotchas to beware of:
– “Greedy” expressions. Use a negated character class like [^/]+ instead of .* (illustrated below)
– .* can match on nothing. Use .+ instead
– Unintentional substring matches because ^ or $ wasn’t
specified
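 To illustrate those gotchas (hypothetical path products/red/widget.htm):
– ^products/(.*)\.htm$ matches it, with the greedy .* happily swallowing “red/widget” – slash and all
– ^products/([^/]+)\.htm$ doesn’t match at all, since [^/]+ can’t cross a slash – usually what you want when you mean “exactly one path segment”
– ^products/(.+)$ (unlike .*) refuses to match a bare products/ with nothing after it
– And without ^ and $, a pattern like products/ also matches myproducts/old as a substring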
URL Rewriting – Under the Hood
 Proxy page using [P] flag
– RewriteRule /blah\.html$ http://www.google.com/ [P]
 [QSA] flag is for when you don’t want query string
params dropped (like when you want a tracking param
preserved)
 [L] flag stops processing of any further rules, which saves on server processing
 Got a huge pile of rewrites? Use RewriteMap and have
a lookup table as a text file
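 A minimal RewriteMap sketch (the map name and file path are hypothetical; the RewriteMap directive itself must live in the main server config or a vhost, not .htaccess):
– RewriteMap legacyurls txt:/etc/apache2/legacy-url-map.txt
– RewriteRule ^old-page-([0-9]+)\.htm$ ${legacyurls:$1|/} [R=301,L]
– Each line of the text file is just “key value”, e.g. “42 /new-section/new-page.htm”; the |/ part supplies a default target when no key matches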
If You’re on Microsoft IIS Server
 ISAPI_Rewrite not that different from mod_rewrite
 In httpd.ini :
– [ISAPI_Rewrite]
RewriteRule ^/category/([0-9]+)\.htm$
/index.asp?PageAction=VIEWCATS&Category=$1 [L]
– So a search-friendly URL like
http://www.example.com/category/207.htm gets internally
mapped to http://www.example.com/index.asp?PageAction=VIEWCATS&Category=207
301 Redirects – Under the Hood
 In .htaccess (or httpd.conf), you can redirect individual
URLs, the contents of directories, entire domains… :
– Redirect 301 /old_url.htm
http://www.example.com/new_url.htm
– Redirect 301 /old_dir/ http://www.example.com/new_dir/
– Redirect 301 / http://www.example.com
 Pattern matching can be done with RedirectMatch 301
– RedirectMatch 301 ^/(.+)/index\.html$
http://www.example.com/$1/
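– e.g. a request for http://www.example.com/widgets/index.html gets 301-redirected to http://www.example.com/widgets/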
301 Redirects – Under the Hood
 Or use a rewrite rule with the [R=301] flag
– RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
– RewriteRule ^(.*)$ http://www.example.com/$1
[L,QSA,R=301]
 [NC] flag makes the rewrite condition case-insensitive
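– e.g. with those rules in a root .htaccess, a request for http://example.com/widgets.htm (no www) gets 301-redirected to http://www.example.com/widgets.htm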
Conditional Redirects, Under the Hood
 Selectively redirect bots that request URLs with session
IDs to the URL sans session ID:
– RewriteCond %{QUERY_STRING} PHPSESSID
RewriteCond %{HTTP_USER_AGENT} Googlebot.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^msnbot.* [OR]
RewriteCond %{HTTP_USER_AGENT} Slurp [OR]
RewriteCond %{HTTP_USER_AGENT} Ask\ Jeeves
RewriteRule ^/(.*)$ /$1? [R=301,L]
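– e.g. Googlebot requesting /page.php?PHPSESSID=abc123 gets 301-redirected to /page.php – the trailing ? in the substitution is what keeps the session ID from being re-appended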
 Utilize browscap.ini instead of having to keep up with
each spider’s name and version changes
URLs that Lead to Error Pages
 Traditional approach is to serve up a 404, which drops
that obsolete or wrong URL out of the search indexes.
This squanders the link juice to that page.
 But what if you return a 200 status code instead, so that
the spiders follow the links? Then include a meta robots
noindex so the error page itself doesn’t get indexed.
 Or do a 301 redirect to something valuable (e.g. your
home page) and dynamically include a small error
notice?
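 A minimal sketch of the 301 option, assuming a hypothetical retired /discontinued/ section:
– RewriteEngine on
– RewriteRule ^discontinued/ http://www.example.com/ [R=301,L]
– The 200-plus-noindex option lives in the error page template instead: have it send a 200 status and include <meta name="robots" content="noindex"> in the head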
Thanks!
 This PowerPoint can be downloaded from
www.netconcepts.com/learn/unraveling-urls.ppt
 For a 180-minute screencast (including 90 minutes
of Q&A) on SEO for large dynamic websites (taught
by me and Chris Smith) – including transcripts –
email [email protected]
 Questions after the show? Email me at
[email protected]