Apr 7, 2009

TYPO3 SEO: keep your links working

One important and commonly overlooked way to make TYPO3 sites search engine friendly and well-ranked is to keep links working. Besides helping SEO, it also makes users happy: if they bookmark a page, they know they will reach the destination even if the page changes.

The problem

Most (maybe all) content management systems do not offer any way to keep links working. When a page is renamed or moved, the site usually replies with a 404 error to search engines and regular visitors alike. Search engines remove the (possibly well-ranked) page from the index and the site loses its position. Users are frustrated by broken bookmarks.

Automatic solution

A somewhat automatic solution exists for TYPO3. Recent versions of RealURL redirect from the old path to the new path when a page is renamed in the Frontend. This works only for the renamed page, not for its subpages. The same principle applies to moving: if a page is moved, old links redirect to the new location.
The problem with this solution is obvious: subpages still show a 404 error.

Better manual solution

There are several ways to solve the problem, and each fits a different case, so there is no single preferred way. A site will normally combine more than one of them to avoid broken links. Let's look at them.

Shortcuts

When moving a single page that has no subpages, a shortcut to a new destination can be created. The page will continue working. As I wrote before, RealURL handles single page move/rename actions better, so shortcuts should be used only if your RealURL version is too old or you do not use RealURL at all.

External URL

Sometimes you have to pass parameters to pages. For example, you had a static page about an event and later decided to present it differently (as a news item, for example). With an external URL you can provide parameters to the destination page.
This approach is dangerous: if the destination of an external URL is modified, the link breaks again. Use it with care. I usually keep a list of such pages in an OpenOffice Calc spreadsheet, so if I am going to rename or move a page, I can always find the linked page. Not ideal, but it works. Another way to track such cases could be a system note on the current page saying that an external URL should be updated.

Apache redirects

Apache redirects are my favorite method. They are fast, because Apache answers the request itself and TYPO3 is never invoked. Apache redirects are especially helpful when many pages are moved or renamed.
Apache redirects happen through mod_rewrite. It is best if the rules go into the Apache configuration (not into .htaccess). The Apache configuration is read once at startup, while .htaccess files are looked up and parsed on every request, so this is also a performance improvement.
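To illustrate the difference, here is a minimal sketch of rules placed in a virtual host instead of .htaccess. The host name, document root and the old-section/new-section paths are placeholders invented for this example, not part of my setup.

```apache
# Hypothetical virtual host; ServerName and DocumentRoot are placeholders.
<VirtualHost *:80>
    ServerName example.org
    DocumentRoot /var/www/example

    # Do not look for .htaccess files at all, so nothing is parsed per request.
    <Directory /var/www/example>
        AllowOverride None
    </Directory>

    RewriteEngine On
    # In virtual host context the pattern is matched against the full URL path,
    # which starts with a slash. "^/?" lets the same rule work in .htaccess too.
    RewriteRule ^/?old-section/(.*)$ /new-section/$1 [L,R=301]
</VirtualHost>
```

Keep in mind that with AllowOverride None any other rules that normally live in .htaccess (the RealURL rewrite rules, for example) would have to move into the server configuration as well.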
A typical mod_rewrite rule includes zero or more RewriteCond directives and a RewriteRule directive. Some people use a Redirect directive instead, but I have never tried that.
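For readers curious about that alternative, this is roughly what it looks like according to the mod_alias documentation. Treat it as an untested sketch; the host name and paths are hypothetical.

```apache
# mod_alias: redirect one URL path, no regular expressions involved.
Redirect permanent /old-page.html http://example.org/new-page.html

# mod_alias with a regular expression; $1 carries the captured remainder.
RedirectMatch 301 ^/old-section/(.*)$ http://example.org/new-section/$1
```

These directives cannot test conditions such as the host name, which is one reason to prefer mod_rewrite for anything beyond simple one-to-one moves.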
I am not going to give a tutorial on mod_rewrite directives here. The official Apache documentation describes everything very well. Instead I will show which rules I use for this site. That has more value than yet another mod_rewrite tutorial. Here are my rules:
RewriteEngine On

RewriteRule ^archives/ - [G,L]

RewriteCond %{REQUEST_URI} (^/(?:article|category)/.*_.*$)
RewriteRule (.*_.*) ${underscore2dash:%1} [L,R=301]

RewriteCond %{HTTP_HOST} ^(www\.)?typo3bloke\.net$
RewriteRule (.*) http://dmitry-dulepov.com/$1 [L,R=301]

RewriteCond %{HTTP_HOST} ^www\.dmitry-dulepov\.com$
RewriteRule (.*) http://dmitry-dulepov.com/$1 [L,R=301]

RewriteRule ^post(?:-|_)details/(.*)/$ /article/$1.html [L,R=301]
RewriteRule ^pages/book_reviews/(.*)/$ /reviews/$1.html [L,R=301]
RewriteRule ^pages(/.*)/$ $1.html [L,R=301]

RewriteRule ^(article/[^/]+)/category/ /$1.html [L,R=301]

RewriteRule index\.php\?id=11 /contact.html [L,R=301]
RewriteRule index\.php\?id=49 /reviews/cake-php-application-development.html [L,R=301]
RewriteRule index\.php\?id=22 /reviews/mastering-typoscript.html [L,R=301]
These rules redirect from the old link structure to the new one. There are also fixes for "duplicate pages". Duplicate pages on my site are caused by tt_news category links.
Notice how I redirect from the old "post-details/" segment to the new "article/" segment, and do the same for reviews (that segment was moved and renamed). I also convert underscore characters to dashes. Together these rules ensure that all well-ranked articles are still reachable by Google, and users are happy because they are redirected to the new location.
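The underscore rule relies on a rewrite map named underscore2dash, which has to be declared elsewhere with a RewriteMap directive. The listing does not include that declaration, so the line below is only one assumed way to provide it: an external program map, with a hypothetical script path.

```apache
# Server or virtual host configuration only; RewriteMap is not allowed in .htaccess.
# The script path is hypothetical. A prg: map is an external program that reads
# one lookup key per line on stdin and prints the result on stdout; here it
# would have to print the key with every "_" replaced by "-".
RewriteMap underscore2dash prg:/usr/local/bin/underscore2dash.sh
```

Apache starts such a program once at server startup and keeps it running, so it must answer every lookup with exactly one line and must not buffer its output.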
There are not many rules. Why is that? Because I like to plan my pages before I create them, so I do not need to change the structure often and can keep the rules to a minimum. 20% of planning saves me 80% of the time later on.
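One caveat about the three index.php rules above: according to the mod_rewrite documentation, the RewriteRule pattern is matched against the URL path only, and the query string is available separately as %{QUERY_STRING}. If a rule of that kind does not fire, a variant along these lines may help. It is a sketch that assumes the old address was /index.php?id=11.

```apache
# Match id=11 as a complete parameter anywhere in the query string.
RewriteCond %{QUERY_STRING} (^|&)id=11(&|$)
# The trailing "?" in the target drops the old query string from the redirect.
RewriteRule ^/?index\.php$ /contact.html? [L,R=301]
```

The (^|&) and (&|$) anchors keep id=11 from also matching id=110 or id=111.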

Conclusion

It is not very difficult to make your site more search engine friendly and your visitors happier. Yes, it requires a little planning and a bit of work. But it is worth doing if you are serious about your web site and you care about your visitors.
Good luck with TYPO3 SEO!

4 comments:

  1. Hi Dmitry

    Are you sure that the lines:

    RewriteRule index\.php\?id=11 /contact.html [L,R=301]
    RewriteRule index\.php\?id=49 /reviews/cake-php-application-development.html [L,R=301]
    RewriteRule index\.php\?id=22 /reviews/mastering-typoscript.html [L,R=301]

    work? I thought that you had to use %{QUERY_STRING} to get it to work.

    Cheers.

  2. This is how I got it to work with a querystring:

    RewriteCond %{QUERY_STRING} ^id=10
    RewriteRule ^/(.*)$ /aboutus/? [R=301,L]

  3. Hi Dmitry,

    thanks for the nice summary!

    One suggestion for the RewriteRules: add a short line of comment explaining the http://from to http://to - that might help many people.

    Cheers

    Chris

  4. Andy: it'll work even without QUERY_STRING and cond. Simply the engine will try to substitute the whole matched url with the given one.
