This article is not about avoiding duplicate content on pages that already exist. It is about avoiding it when, for example, we have bought an old (expired) domain with many indexed pages that now return a 404 error on our site.

Some webmasters place the following code in the .htaccess file:

ErrorDocument 403 /index.php
ErrorDocument 404 /index.php

This creates a duplicate content problem: the directives serve the home page to every visitor who requests a forbidden or missing page, while the URL in the address bar stays unchanged. Another method is to send users to specially designed 403 and 404 pages with this code:

ErrorDocument 403 /error403.html
ErrorDocument 404 /error404.html

This is not an elegant solution either: instead of the home page, users are internally redirected to another page, and again the URL stays unchanged.

If you like this method, it is better to change the redirect from internal to external, like this:

ErrorDocument 403 http://yourdomain.com/error403.html
ErrorDocument 404 http://yourdomain.com/error404.html

Sometimes hosting companies offer custom 403 and 404 pages. I use this tricky solution:

ErrorDocument 403 http://www.sajta-mi.com/
ErrorDocument 404 http://www.sajta-mi.com/

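With an external redirect we "tell" the search engines that these pages no longer exist, unlike in the first two examples. Note that when ErrorDocument points to a full URL, Apache answers with a 302 redirect rather than a 301, and the client never sees the original 403 or 404 status.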
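If a real 301 status is wanted for missing URLs, one possible sketch is to let mod_rewrite issue the redirect instead of ErrorDocument. This is not the method described above, only an alternative under some assumptions: mod_rewrite is enabled, the .htaccess file sits in the document root, and every request that does not map to an existing file or directory should go to the home page. It does not cover 403 responses.

```apache
RewriteEngine On

# Only act when the request maps to neither an existing file nor a directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Permanent (301) redirect to the home page; the trailing "?" drops any query string
RewriteRule ^ http://www.sajta-mi.com/? [R=301,L]
```

A request for the home page itself maps to an existing directory, so the conditions fail and no redirect loop occurs.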

What should we do about URL parameters such as q=, page=, id=, etc., which are very persistent? The solution:

We copy the code from the previous article on avoiding duplicate content on the home/index page:

Options +FollowSymlinks -Indexes
RewriteEngine On

RewriteCond %{HTTP_HOST} ^yourhost\.com$ [NC]
RewriteRule ^(.*)$ http://www.yourhost.com/$1 [R=301,L]

RewriteCond %{THE_REQUEST} /index\.php\ HTTP/
RewriteRule ^index\.php$ / [R=301,L]

and continue on the next line:

RewriteCond %{QUERY_STRING} ^page=.*$ [OR]
RewriteCond %{QUERY_STRING} ^q=.*$ [OR]
RewriteCond %{QUERY_STRING} ^id=.*$
RewriteRule .* %{REQUEST_URI}? [R=301,L]

and the parameter issue is solved. You may add or remove lines for other parameters; every condition except the last one needs the [OR] flag.
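The conditions above only match when the parameter is the first one in the query string, because of the ^ anchor. A more compact variant that also catches the parameter after an & could look like the sketch below. It assumes Apache 2.4 or later, since the QSD flag is not available in older versions, and it behaves more broadly than the original rules, so test it before use.

```apache
# Match page=, q= or id= at the start of the query string or after an "&"
RewriteCond %{QUERY_STRING} (^|&)(page|q|id)= [NC]

# QSD (query string discard) removes the whole query string from the redirect target
RewriteRule ^ %{REQUEST_URI} [QSD,R=301,L]
```

After the redirect the query string is empty, so the condition no longer matches and the rule does not loop.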

Keep in mind that you have to check the parameter names very carefully. If you have installed applications that use such parameters, they won't work properly!
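One way to protect an installed application is to exclude its directory from the rule. In the sketch below, /forum/ is a hypothetical path used only for illustration; replace it with the directory where your own application lives.

```apache
# Skip everything under /forum/ (hypothetical application path)
RewriteCond %{REQUEST_URI} !^/forum/

# Same parameter check as before, written as a single condition
RewriteCond %{QUERY_STRING} ^(page|q|id)=
RewriteRule .* %{REQUEST_URI}? [R=301,L]
```

Both conditions must be true for the rule to fire, so requests inside the excluded directory keep their parameters.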