Canonical pages are an important aspect of maintaining a website and ensure that search engine rankings are not affected by any duplicated content.

On *NIX-based systems, file names that differ only in capitalisation are treated as separate files. For example, filename.txt is not the same file as FileName.TXT. This extends into the world of Apache, where URL paths are also case sensitive.

That means we really should pick a case for our URLs and redirect every request to our chosen scheme. Lowercase is the only sensible choice, so my examples will only cover it.

mod_rewrite does not have an easy way to do this from a .htaccess file without a large number of repeated (recursive) requests to itself, one for each letter. This is certainly not a desirable load to put Apache under when it can be handled in another way.
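To give a feel for that cost, here is a rough model in PHP. It assumes a per-letter rule set in which each rule lowercases a single character and then restarts processing; the function name count_per_letter_passes() is hypothetical, and this only counts passes rather than reproducing how Apache works internally.

```php
<?php
// Hypothetical model: each pass lowercases exactly one uppercase letter,
// then processing "restarts", as a per-letter rule set would.
function count_per_letter_passes(string $path): int
{
    $passes = 0;
    while (preg_match('/[A-Z]/', $path, $m, PREG_OFFSET_CAPTURE) === 1) {
        $offset = $m[0][1];                     // position of the first uppercase letter
        $path[$offset] = strtolower($m[0][0]);  // lowercase just that one character
        $passes++;
    }
    return $passes;
}

echo count_per_letter_passes('mY-TEST-url'), "\n"; // 5
```

Under that assumption, a URL needs one extra pass per uppercase letter, which is the load the approach below avoids.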

This is where your favourite programming language steps in to save the day. I am using PHP in these examples as it is the language most commonly paired with Apache.

In your project’s .htaccess file you’ll need to add the following rewrite rules.

RewriteEngine on
RewriteBase /

# force url to lowercase if upper case is found
RewriteCond %{REQUEST_URI} [A-Z]
# ensure it is not a file on the drive first
RewriteCond %{REQUEST_FILENAME} !-s
RewriteRule (.*) rewrite-strtolower.php?rewrite-strtolower-url=$1 [QSA,L]

To describe briefly what is happening here:

  1. Set up the rewrite module
  2. Check if the incoming URL contains any uppercase letters
  3. Ensure that the incoming URL does not refer to a file on disk (you may want to host a file with upper case letters in its name - something like a PDF file that a client has uploaded through the CMS you have supplied them for instance)
  4. Rewrite every request that matches the conditions above to our script, which does the actual conversion to lowercase. The only thing to note here is the QSA flag, which makes sure all the GET “variables” are passed on to the script
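If it helps to reason about which requests end up at the script, the two conditions can be mirrored in PHP. This is only a sketch: Apache performs these checks itself, and the function name needs_lowercase_redirect() and the $docRoot parameter are hypothetical names I have introduced for illustration.

```php
<?php
// Hypothetical helper mirroring the two RewriteCond checks.
// $path is the URL path without the query string; $docRoot is the site root on disk.
function needs_lowercase_redirect(string $path, string $docRoot): bool
{
    // RewriteCond %{REQUEST_URI} [A-Z]
    if (preg_match('/[A-Z]/', $path) !== 1) {
        return false; // nothing to lowercase
    }

    // RewriteCond %{REQUEST_FILENAME} !-s
    // (-s is true for an existing regular file with a size greater than zero)
    $file = rtrim($docRoot, '/') . '/' . ltrim($path, '/');
    $isNonEmptyFile = is_file($file) && filesize($file) > 0;

    return !$isNonEmptyFile;
}
```

With a non-empty Report.PDF in the document root, /Report.PDF would be served as-is, while /About-Us would be handed to the script.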

Next up is the little snippet of PHP that does all the work! This is a file called rewrite-strtolower.php in the same directory as your .htaccess file mentioned above.

<?php
if(isset($_GET['rewrite-strtolower-url'])) {
    $url = $_GET['rewrite-strtolower-url'];
    unset($_GET['rewrite-strtolower-url']);
    $params = http_build_query($_GET);
    if(strlen($params)) {
        $params = '?' . $params;
    }
    // if you don't have SSL/a security certificate at the destination change https:// to http:// below
    header('Location: https://' . $_SERVER['HTTP_HOST'] . '/' . strtolower($url) . $params, true, 301);
    exit;
}
header("HTTP/1.0 404 Not Found");
die('Unable to convert the URL to lowercase. You must supply a URL to work upon.');

As you can see, this is a very simple script that takes in the URL passed to it from the rewrite rules above.

  1. It grabs the supplied URL and removes it from the $_GET variable to stop it from being passed to the destination page amongst the GET query parameters
  2. It then rebuilds the GET parameters into a query string for use in the redirect. If there are none then this will just be an empty string which will have no consequence on the final URL
  3. Finally the redirect is performed using PHP’s header() function
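To try those three steps without a web server, here is a minimal sketch that wraps the same logic in a function and returns the target URL instead of sending the header. The function name build_lowercase_location(), its parameters, and the example.com host are my own placeholders, not part of the script above.

```php
<?php
// Hypothetical wrapper around the same steps; returns null when no URL was supplied.
function build_lowercase_location(string $host, array $get, string $scheme = 'https'): ?string
{
    $key = 'rewrite-strtolower-url';
    if (!isset($get[$key]) || !is_string($get[$key])) {
        return null;
    }

    // Step 1: take the supplied URL and drop it from the parameters
    $url = $get[$key];
    unset($get[$key]);

    // Step 2: rebuild the remaining parameters into a query string
    $params = http_build_query($get);
    if ($params !== '') {
        $params = '?' . $params;
    }

    // Step 3: assemble the redirect target with a lowercased path
    return $scheme . '://' . $host . '/' . strtolower($url) . $params;
}

echo build_lowercase_location('example.com', [
    'rewrite-strtolower-url' => 'mY-TEST-url',
    'Page' => '2',
]), "\n"; // https://example.com/my-test-url?Page=2
```

Note that only the path is lowercased; query parameter names and values keep their original case.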

After these few steps have been completed, a browser will always be redirected to the lowercase version of a URL. Try entering /mY-TEST-url and you’ll see it instantly become /my-test-url.

I first came up with this technique a good few years ago so if you know of a better solution for Apache that has appeared in the interim then please let me know.

Alternatives

It is also worth noting that there are alternatives for those who are not restricted to .htaccess files.

If you are not on a shared hosting environment and are happy to enter the rules directly into your Apache configuration, you can use mod_rewrite’s RewriteMap directive to do the lowercase conversion:

RewriteMap lc int:tolower
RewriteRule (.*?[A-Z]+.*) ${lc:$1} [R]

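For more information on this see the Apache manual: Redirect a URI to an all-lowercase version of itself. It is noted there, however, that using mod_speling is recommended instead of this rewrite rule. Also bear in mind that [R] on its own issues a 302 redirect; use [R=301] if you want the permanent redirect that the PHP script above sends.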