Recently, I was forced to do a ton of research as a result of an unruly website I was working on. The problem turned out to be a small tidbit in the .htaccess file. Such a minor detail caused hours of delay in the project. I thought I would share all the details I compiled for your reference.
A .htaccess file is a configuration file that works on Apache and other NCSA-compliant web servers. The name is a bit of a misnomer, because hypertext access control is only a small part of what it can do.
The .htaccess file affects the directory it is located in and all directories below it in the directory tree. If a directory further down contains its own .htaccess file, that file takes priority for that directory and everything below it. Thus a .htaccess file in the root directory affects every directory on the website unless a lower-level file overrides it.
The basics are as follows. The .htaccess file is a plain ASCII (American Standard Code for Information Interchange) text file, most easily created in Notepad or any other editor that can save simple text. One of the most common questions about .htaccess files is what to name them. The answer is that they have no name at all, only an extension, and that extension (although uncommon) really is 8 characters long.
Creating the file is somewhat tricky because Windows will not readily let you create a file with no name and only an extension. To get around this, name the file whatever you like and rename it to .htaccess after it has been uploaded to the server. At that point the file may become invisible in directory listings and FTP clients (although it can still be navigated to and its contents viewed), because any file whose name begins with a period is treated as a hidden file.
When uploading the .htaccess file, make sure that you upload it as ASCII and not as binary. Once it has been uploaded, there are a few precautions you can take to prevent it from being read by a browser. One is to CHMOD its permissions to 644 (or RW-R--R--). The others will be covered later in more detail. Because of the nature of the information stored in the .htaccess file, it is usually of the utmost importance to keep it secure.
When creating a .htaccess file for the first time, keep in mind that most commands are meant to be placed on one line. If your text editor has a word wrap feature, it may be in your best interest to turn it off, as wrapping can insert line breaks that Apache does not understand and cause your directives to fail. Also note that .htaccess files are an Apache feature and will not work on web servers that do not support them, such as Microsoft IIS on NT or Windows platforms. There are various other methods of accomplishing the tasks that .htaccess provides, but none that are bundled together in such a nice little package.
.htaccess files are not globally accepted. Because they can be used for security, they can also become very serious security holes when misused. For this reason some web hosting companies have either limited the use of .htaccess or removed it altogether. Before you take the time to create a .htaccess file or a series of them, you should always find out what your host will and will not let you do.
Custom Error Pages / Request Pages
There are various client requests and error pages that can happen when someone is navigating a website. A brief list of them is as follows;
200 - Okay
201 - Created
202 - Accepted
203 - Non-Authoritative Information
204 - No Content
205 - Reset Content
206 - Partial Content
400 - Bad Request
401 - Authorization Required
402 - Payment Required
403 - Forbidden
404 - Not Found
405 - Method Not Allowed
406 - Not Acceptable
407 - Proxy Authentication Required
408 - Request Timed Out
409 - Conflicting Request
410 - Gone
411 - Content Length Required
412 - Precondition Failed
413 - Request Entity Too Large
414 - Request URI Too Long
415 - Unsupported Media Type
On this list I have included some good and some bad candidates for custom pages set up in a .htaccess file. For instance, if you set up a custom page for the 200 response, then every time someone successfully reached a page on your website they would be sent to the page you specified in the .htaccess file. As soon as that page was successfully brought up, it would redirect back to the same page, and so on infinitely. This would be an example of a bad way to use this feature. However, if you set it up for error 404, then someone who typed in an incorrect URL or followed an outdated link would be sent to a nice professional looking page, which could also be useful and provide links back to your main page or to a help section within your website.
The code used within a .htaccess file to redirect upon the completion of a request or error is as follows (and goes on a single line);
ErrorDocument code /directory/filename.ext
For instance this could look like;
ErrorDocument 404 /errors/404.html
This would send anyone who got a 404 error on my website to a file named 404.html in a folder called errors.
You can also put the message text (including HTML) directly in the .htaccess file instead of pointing to a page, for instance;
ErrorDocument 404 "The page you are requesting is not here, please use your back button to return."
Notice the quotation marks around the message. Apache 1.3 expected only the opening quotation mark, with none at the end, whereas Apache 2.0 and later expect the message to be quoted at both ends. Either way, make sure it is all on one line, so turn off word wrap when entering it.
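As a rough sketch of how several custom error pages might sit together in one .htaccess file, the lines below map three common error codes to their own pages. The /errors/ folder and the file names are assumptions for illustration only, and your host has to permit these overrides for them to take effect.

```apache
# Hypothetical paths: point these at wherever your error pages actually live.
# Each directive stays on a single line.
ErrorDocument 403 /errors/403.html
ErrorDocument 404 /errors/404.html
ErrorDocument 500 /errors/500.html
```

Keeping all the error pages in one folder makes them easy to find and update later.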
Password Protecting Folders
In order to password protect any directory you will require two files, a .htaccess file and a .htpasswd file. The naming convention for .htpasswd is identical to that of .htaccess.
Within the .htpasswd file you will need to put the username and password you would like to use (although the password must be encrypted). For instance, if we use the username of username and the password of password it would look like this.
username:66yGQHg8KA7jw
In order to encrypt a password you can go to http://www.earthlink.net/cgi-bin/pwgenerator.pl or do a search on Google for password encryptor.
For security purposes it is recommended that you place your .htpasswd file in a directory that is not web accessible, ideally above your root www directory. Also make sure that you upload the .htpasswd file as ASCII instead of binary.
Now you must add the code to the .htaccess file which will be located within the directory you would like to password protect;
AuthUserFile /home/users/web/b2278/ph.dprouse/.htpasswd
AuthGroupFile /dev/null
AuthName EnterPassword
AuthType Basic
require user username
The AuthUserFile line gives the absolute location on the server (not the web location) of the .htpasswd file. There is no set standard for this path, so always double check with your web host.
The AuthName line is arbitrary; it can say whatever you would like within reason. Keep it to a single word with no spaces, or enclose it in quotation marks if it contains spaces.
The AuthType is Basic because we are using a basic HTTP login.
The final line is require user followed by the customer's username. This is set up as though each user has their own separate directory they can access. If you have multiple users who would like to access the same directory, change the last line to read;
require valid-user
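Putting the pieces together, a shared protected directory might look something like the sketch below. The path and the realm name are made up for illustration. The realm is quoted here, which Apache accepts and which lets it contain spaces.

```apache
# Hypothetical absolute path: ask your host for the real one.
AuthUserFile /home/example/.htpasswd
# The realm text is what the browser shows in the login prompt.
AuthName "Members Area"
AuthType Basic
# Any user listed in the .htpasswd file may log in.
Require valid-user
```

With this form you add or remove people by editing the .htpasswd file alone, without touching the .htaccess file.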
Enabling SSI Through .htaccess
Many web hosts do not allow SSI access, because there are many SSI hacks out there and it can be a large vulnerability. There is a way to allow it, although you should always contact your host and make sure that this is permitted, as it can be a breach of your terms of service.
The following lines must be added to your .htaccess file;
AddType text/x-server-parsed-html htm html
AddHandler server-parsed .shtml
The AddType line adds a MIME type in the text category for server-parsed pages and ties it to the listed file extensions. Even though most hosts already have this set up, it is always better to add it to the code to make sure.
The AddHandler line makes sure that all .shtml files are server-parsed for server side commands.
If you do not feel like renaming all of your .html files to .shtml you can add this line between the first and second lines above;
AddHandler server-parsed .html
This line is not overly recommended, as it will cause the server to parse every file with the .html file extension. This adds extra load time to every page you have as well as extra server strain. If you are worried about load time it is always better to use only the .shtml files.
If you are planning on using the .shtml extension and would like to use SSI on your index page you must add another line of code into your .htaccess file;
DirectoryIndex index.shtml index.html
This line of code will allow your index file to be index.shtml, and if the server does not find one it will automatically check for an index.html.
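On Apache 2.x servers, SSI is usually switched on with an output filter rather than a handler. The following is only a sketch of that newer form, assuming your host lets .htaccess files change Options and file types; check with them before relying on it.

```apache
# Turn on server-side includes for this directory and below.
Options +Includes
# Serve .shtml files as ordinary HTML...
AddType text/html .shtml
# ...after running them through the SSI filter first.
AddOutputFilter INCLUDES .shtml
# Prefer index.shtml as the index page, falling back to index.html.
DirectoryIndex index.shtml index.html
```

Only .shtml files are parsed in this version, so plain .html pages avoid the extra processing cost.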
Blocking Users By IP Address
If you were to need to block someone or a group of people from accessing your website it would be as simple as adding the following lines of code to your .htaccess file;
order allow,deny
deny from xxx.xxx.xxx.xxx
deny from xxx.xxx.xxx
allow from all
The first line sets the order in which the rules are evaluated: allow rules first, then deny rules.
The second line is the first of the denials, and there can be as many as you require. This line will prevent anyone at IP address xxx.xxx.xxx.xxx from entering this directory (or website).
The third line will block everyone in an IP range: anyone at xxx.xxx.xxx.??? will be blocked, such as xxx.xxx.xxx.1, xxx.xxx.xxx.2 ... xxx.xxx.xxx.255.
The last line will allow everyone else to enter. If you chose to keep everyone out, you could set this line to read;
deny from all
You may also allow or deny by domain name, such as;
deny from .purehost.com
This will block all users from this domain, including all sub-domains (such as username.purehost.com).
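Apache 2.4 replaced the order, allow and deny directives with Require (the old ones only keep working where the host has loaded a compatibility module). A rough equivalent of the blocking rules in the newer syntax might look like this. The addresses and the host name are reserved documentation examples, not real offenders.

```apache
<RequireAll>
    # Let everyone in by default...
    Require all granted
    # ...except this single address,
    Require not ip 192.0.2.15
    # this whole range of 256 addresses,
    Require not ip 198.51.100.0/24
    # and anyone whose host name ends in example.net.
    Require not host example.net
</RequireAll>
```

To shut everyone out instead, a single line reading Require all denied does the job.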
Changing Your Default Directory
If you have a problem setting your homepage to index.html you may want to look into using this piece of code in your .htaccess file;
DirectoryIndex filename.ext
With this in place, someone who accesses your website will be served the file listed here instead of the typical index.html file. You can also set up priorities: if you list multiple files, the server checks for the first one and, if unable to find it, moves on to the second one and so forth.
For example;
DirectoryIndex danny.html index.pl home.php index.html
This would first check for the danny.html file, and if unable to find it check for the index.pl file, then the home.php file, and finally the index.html file. Once it has exhausted all of these, the server falls back to a directory listing or an error page, depending on how it is configured (hopefully you have already set up custom error pages using your .htaccess file).
.htaccess Redirects
Although redirects can be coded through many different means, such as http-equiv, javascript, or any type of dynamic scripting it is typically more efficient to do it through a .htaccess file. The reason being that the coding for all your redirects can be done through a single file instead of having to add code to multiple files. This can save time, which ultimately can mean the difference between someone coming to your site and finding broken links or not seeing updated information.
.htaccess uses Redirect to look for any request for a specific page (or a non-specific location, though this can cause infinite loops), and if it finds that request, it forwards it to a new page you have specified:
Redirect /folder1/file1.html http://site.com/folder2/file2.html
Notice there are three separate yet required parts to this line of code. The first part is the Redirect command, which tells the server that when a specific file or folder is requested the browser should be sent to a new location. The second part is the address of the file or folder you want to redirect from, relative to your root directory. The third and final part is the file or folder that you want to redirect to, which should be given as a complete URL.
As with most .htaccess commands, all three parts are separated by a single space and located on one line. This command will often be used when there are massive changes to a website, for instance when you have created an entirely new site located in a separate folder. You would use the Redirect command and specify the old folder and then the new folder.
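Redirect also accepts an optional status argument placed before the old path. Without it the server sends a temporary (302) redirect. For a site that has moved for good, a permanent (301) redirect is generally the better fit. The folder names below are hypothetical.

```apache
# Send everything under /oldsite to the matching path under /newsite,
# for example /oldsite/about.html goes to http://site.com/newsite/about.html
Redirect 301 /oldsite http://site.com/newsite

# The single-file redirect from earlier, marked as permanent
Redirect 301 /folder1/file1.html http://site.com/folder2/file2.html
```

A permanent redirect tells browsers and search engines to update their records rather than keep checking the old address.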
Hiding Your .htaccess
Because your .htaccess file can often contain information that is very pertinent to your website or information that can be potentially a security risk it is always better to limit access to it as much as possible. If you have set incorrect permissions or if your server is not as secure as it could be, a browser has the potential to view an htaccess file through a standard web interface and thus compromise your site/server. This, of course, would be a bad thing. However, it is possible to prevent an htaccess file from being viewed in this manner:
<Files .htaccess>
order allow,deny
deny from all
</Files>
The first line specifies that the rule applies to the file named .htaccess. You could use this for other purposes as well if you get creative enough. If you use this in your .htaccess file, a person trying to view that file will (under most server configurations) get a 403 error code. As an added measure of security, you can also set the permissions for your .htaccess file via CHMOD to 644 or RW-R--R--.
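On Apache 2.4 the same protection is normally written with Require. This is a sketch for servers running that version. Many hosts already block .htaccess files in their main configuration, so the rule may simply duplicate what is there.

```apache
# Refuse every web request for the .htaccess file itself.
<Files ".htaccess">
    Require all denied
</Files>
```

Changing the file name in the first line lets you protect other sensitive files in the same way.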
Adding MIME Types
If you are using a file extension that is not set up on the server, which can be a common occurrence with MP3 or even SWF files, you can specify what type of file it is by adding this line of code to your .htaccess file;
AddType application/x-shockwave-flash swf
AddType specifies that you are adding a MIME type. The application string is the actual MIME type you are adding, and the final bit is the file extension to associate with it, which in our example is swf for Shockwave Flash.
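Several types can be declared one after another, one per line. As a small sketch covering the two file types mentioned above:

```apache
# Flash movies
AddType application/x-shockwave-flash .swf
# MP3 audio
AddType audio/mpeg .mp3
```

The leading dot on the extension is optional, so swf and .swf behave the same.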
If you need to find the application string of the file type you are adding, most of them are listed at filext.com. Also, if there is a file whose extension is set on the server to open with something and you would rather have it downloaded (for instance .xml), you can specify the application string as;
application/octet-stream
Preventing Hot Linking
Hot linking refers to someone outside of your website using the path to one of the images on your website. This is considered very rude for two major reasons. The first is that you may have spent many hours working on a particular image and do not want it used by someone else. The second is that every time someone accesses that other person’s page it uses your bandwidth. If that site were to have many visitors, your website could actually go down due to bandwidth overuse.
Using .htaccess, you can disallow hot linking on your server, so those attempting to link to an image or CSS file on your site, for example, are either blocked (a failed request, such as a broken image) or served different content (for example a different picture).
Here's how to disable hot linking of certain file types on your site. The case below covers images, JavaScript (js) and CSS (css) files. Simply add the code below to your .htaccess file, and upload the file either to your root directory or to a particular subdirectory to localize the effect to just one section of your site;
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?domain.com/.*$ [NC]
RewriteRule \.(gif|jpg|js|css)$ - [F]
Be sure to replace "domain.com" with your own. The above code creates a failed request when hot linking of the specified file types occurs. In the case of images, a broken image is shown instead.
You can set up your .htaccess file to actually serve up different content when hot linking occurs. This is more commonly done with images, such as serving up an alternate image in place of the hot linked one. The code for this is;
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?domain.com/.*$ [NC]
RewriteRule \.(gif|jpg)$ http://www.domain.com/alternatepicture.gif [R,L]
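One thing to watch for with the alternate image approach: the replacement picture is itself a .gif on your site, and the browser fetches it with the same outside referrer, so the rule can end up redirecting the replacement to itself. A possible variation, offered as a sketch and not a tested drop-in, excludes the replacement file from the rule and also accepts https referrers from your own domain.

```apache
RewriteEngine on
# Allow requests with no referrer (direct visits, some proxies).
RewriteCond %{HTTP_REFERER} !^$
# Allow requests coming from your own pages, over http or https.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?domain\.com/ [NC]
# Never rewrite the replacement image itself, to avoid a redirect loop.
RewriteCond %{REQUEST_URI} !/alternatepicture\.gif$
RewriteRule \.(gif|jpg)$ http://www.domain.com/alternatepicture.gif [R,L]
```

As before, replace domain.com and alternatepicture.gif with your own domain and file name.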