More on the .htaccess file

Partly in response to a post on the forums, I have been musing about the .htaccess file.

Here is my penny’s worth on htaccess files. This is not meant to be the end-all of information (we should always aim to learn as much as we can about a subject before acting), but I have tried to bring you a slightly informative post about it. Maybe, if you’re lucky (and click like), I might go a little further……!

What is a .htaccess file?

Well, according to Wikipedia, a .htaccess (hypertext access) file is a directory-level configuration file supported by several web servers that allows for decentralized management of web server configuration. These files are placed inside the web tree and are able to override a subset of the server’s global configuration for the directory that they are in and all sub-directories.

The original purpose of .htaccess (reflected in its name) was to allow per-directory access control, by for example requiring a password to access the content. Nowadays, however, the .htaccess files can override many other configuration settings including content type and character set, CGI handlers, etc.

EH? Basically, an htaccess file is a configuration file used on web servers running Apache (most commonly on Linux). These files can be used to make configuration changes on a per-folder basis on the server. So, for instance, you could have something like this:

root
|___ .htaccess
|
|___ folder one
|    |___
|
|___ folder two
     |___ .htaccess

Now you have an htaccess file in the root, so anything under it is affected by that file, and folder one follows the same config. However, we also have an htaccess file in folder two. Apache reads the root file first and then the one in folder two, so for anything inside folder two, wherever the two files disagree the settings in folder two’s file win!
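
To make that a bit more concrete, here is a minimal sketch of what the two files might contain. The directives I have picked (directory listings and a default character set) are purely my own assumption for illustration, and they only work if your host allows these settings to be overridden.

```apache
# root/.htaccess  (applies to root, folder one and folder two)
# Turn off automatic directory listings everywhere below this point.
Options -Indexes
# Serve text files as UTF-8 unless something further down says otherwise.
AddDefaultCharset UTF-8

# root/folder two/.htaccess  (a separate file; applies to folder two only)
# Re-enable directory listings for this folder, overriding the root setting.
Options +Indexes
```

With these two files in place, folder one picks up both root settings, while folder two lists its contents but still inherits the UTF-8 default from the root.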

Got it so far? OK good, I’m going back to my coding, oh wait do you want more?

Sounds great right? Well yes and no!

If you have access to the main server config files and you wish to apply global changes, then you should use them: it’s faster and, well, just better! Using htaccess files actually slows down your server, because once they are allowed (via the AllowOverride directive) Apache has to look in every folder along the path of each request to check whether there is one there, and this is true whether or not you actually use them! But if you don’t have that access, say you have resold cPanel hosting or similar, then htaccess is your only option.
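
For those who do have access to the main config, here is a rough sketch of what moving the settings out of htaccess might look like. The path /var/www/html and the two settings inside the block are assumptions of mine, so substitute whatever your own server uses.

```apache
# Hypothetical main-config version (e.g. inside httpd.conf or a virtual host file).
# /var/www/html is an assumed document root; substitute your own.
<Directory "/var/www/html">
    # Settings that would otherwise live in root/.htaccess
    Options -Indexes
    AddDefaultCharset UTF-8

    # Tell Apache not to look for .htaccess files at all under this path
    AllowOverride None
</Directory>
```

Unlike an htaccess file, which is re-read on every request, changes made here only take effect after Apache has been reloaded or restarted.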

So what kind of config changes can I make?

Well, almost anything the server admin allows you to override. The majority of changes you’ll see are built around the mod_rewrite directives. To quote the Apache documentation: “This module uses a rule-based rewriting engine (based on a regular-expression parser) to rewrite requested URLs on the fly. It supports an unlimited number of rules and an unlimited number of attached rule conditions for each rule, to provide a really flexible and powerful URL manipulation mechanism.” So from that statement you can see that it can do pretty much anything with a URL.

OK, so enough background.

So realistically what can it do for me?

Probably the main reason you would have used an htaccess file, either knowingly or unknowingly, is a redirect: not here, go there. More often than not it is a 301, but that is by no means the only kind of redirect (that’s for another day!).

An example of a simple redirect to a new domain would be:

Options +FollowSymLinks
RewriteEngine on
RewriteRule (.*) http://www.yourdomain.co.uk/$1 [R=301,L]

OK, I think I hear another EH?

Let us break it down:

Options +FollowSymLinks
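FollowSymLinks is an option in your web server configuration, set with the Options directive, that tells your web server to follow symbolic links (the name is simply short for Follow Symbolic Links). Used in an htaccess file, it allows us to override the default server setting for particular folders (directories). It is in this example because mod_rewrite rules in an htaccess file only work when following symlinks is allowed for that folder.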

To put it simply, symbolic links are like Windows shortcuts. You may have an image like <img src="/root/mysite/images/logo.png" /> which shows your logo. Now when you or a visitor browses to this location, you see the logo. But, and here’s the fun bit, if you were to log into the server and go to that folder, you wouldn’t find the logo! Why, you ask? Well, the file is not actually located in that folder, it’s in /root/images. Whaaa? How come in my browser logo.png shows as being in /root/mysite/images/ and not /root/images/?

I think the quick ones will have picked this up already: a symbolic link is responsible for this behaviour. You have a symlink that tells your server “If a visitor requests /root/mysite/images/logo.png then show them /root/images/logo.png”.

FollowSymLinks is actually quite important in relation to server security. When dealing with web servers, you can’t just leave things undefined. You have to define what can access what. FollowSymLinks tells your server whether it should (or should not) follow symlinks. In other words, if FollowSymLinks were disabled, browsing to the /root/mysite/images/logo.png file would return an error: depending on other settings, either a 403 (access forbidden) or a 404 (not found).
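
If following every symlink feels too generous, Apache has a stricter sibling option. This is only a sketch of the alternative, assuming your host lets you change Options from an htaccess file. As far as I know, recent Apache versions accept it in place of FollowSymLinks for rewrite rules, but do check that on your own server.

```apache
# Only follow a symlink when the link and the file it points to
# are owned by the same user.
Options +SymLinksIfOwnerMatch
```

The ownership check costs the server a little extra work on each request, so treat it as a trade-off and not a free upgrade.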

RewriteEngine on
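This switches on the rewriting engine for this folder, so the RewriteRule below gets processed. The mod_rewrite module mentioned above must already be installed and loaded on the server for this to work.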

RewriteRule (.*) http://www.yourdomain.co.uk/$1 [R=301,L]


This probably needs to be broken down into several pieces:

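(.*)
This needs to be broken down in its own right:
( start of a capture group
. any single character
* zero or more of the preceding item (here, any character)
) end of the capture group
So (.*) matches the whole requested path, whatever it is, and remembers it.

http://www.yourdomain.co.uk/
This is the target of the redirect, so in this example requests are sent to http://www.yourdomain.co.uk/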

$1
Contains whatever the (.*) matched, that is, the requested path without the domain.

[R=301,L]
The ‘R=301’ flag relates to the actual redirection, in our case a 301, which is a permanent redirection. The ‘L’ flag means stop here and do not apply any more rewriting rules.
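
One thing to watch: if the old and new domains are both served from the same folder, the rule as written would redirect the new domain to itself over and over. A common way around that is to attach a condition to the rule. The sketch below is my own variation on the example, and it assumes the new site is only ever reached as www.yourdomain.co.uk.

```apache
Options +FollowSymLinks
RewriteEngine on
# Only redirect when the request did NOT arrive on the new hostname;
# [NC] makes the hostname comparison case-insensitive.
RewriteCond %{HTTP_HOST} !^www\.yourdomain\.co\.uk$ [NC]
# Example: a request for /about/team.html on the old domain is sent to
# http://www.yourdomain.co.uk/about/team.html with a 301 status.
RewriteRule (.*) http://www.yourdomain.co.uk/$1 [R=301,L]
```

The RewriteCond line applies only to the RewriteRule that immediately follows it, which is why it sits directly above the rule.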

Right, I think I’m going to stop here and let you absorb this before I go any further, well that and I want to eat my dinner and have another glass of wine!