Apr 12 2009

INCREASE SEO with this mod_rewrite tutorial. Only the useful.

My last post didn’t really get into how to use mod_rewrite for something useful, so this one will.
The main reason I got mod_rewrite going was to improve the search ranking of a few sites, so I’ll show how to do that with a dynamic PHP site.
My primary aim was to make $_GET variables look like directories.

The .htaccess file from the previous post was

RewriteEngine on
RewriteRule ^/?test\.html$ test.php [L]

The first line turns on the rewrite engine for this .htaccess file. .htaccess files can be used for other things too, such as password-protecting a directory.

The second line rewrites test.html to test.php. The first argument after RewriteRule is the pattern, a regular expression that is matched against the requested URL path. The second argument is the substitution, which is what the matched path gets replaced with.

This rule is really simple. The special characters used in its regular expression are:

‘^’ This symbol, the caret, signifies the start of the path being matched.
In a .htaccess file that path is relative to the directory the file lives in, with the leading slash stripped.
If your site is http://www.quick-content.com and the .htaccess file is in the root directory for that site,
then a request for http://www.quick-content.com/test.html is matched as ‘test.html’
and the ‘^’ sits right before the first ‘t’.

‘$’ This symbol, the dollar sign, signifies the end of the path being matched. The query string (anything after a ‘?’) is not part of that path.

‘\.’ This is just a period. The backslash ‘escapes’ it, which tells Apache to treat it as a literal period.
This is necessary because an unescaped period has a special meaning in a regular expression: it matches any single character.
In this case we only want to match a real period.
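
To see what the anchors and the escaped period do, here is a minimal sketch in Python. It assumes that Python's re module treats this particular pattern the same way as the regular expressions Apache uses, which holds for these basic features. The helper name matches is made up for illustration, and Apache itself is not involved.

```python
import re

# The pattern from the rule above, applied to the path as a root-level
# .htaccess file would see it (no leading slash in that context).
PATTERN = re.compile(r"^/?test\.html$")
# The same pattern with the period left unescaped, for comparison.
UNESCAPED = re.compile(r"^/?test.html$")

def matches(pattern, path):
    """Return True if the path satisfies the anchored pattern."""
    return pattern.search(path) is not None

print(matches(PATTERN, "test.html"))      # True
print(matches(PATTERN, "old/test.html"))  # False: '^' pins the match to the start
print(matches(PATTERN, "test.html.bak"))  # False: '$' pins the match to the end
print(matches(PATTERN, "testXhtml"))      # False: '\.' only matches a real period
print(matches(UNESCAPED, "testXhtml"))    # True: a bare '.' matches any character
```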

It’s great and all to rewrite one file name to another using mod_rewrite and Apache, but that’s not much help for SEO.
Here is another search and replace rewrite rule:

RewriteRule ^posts/([^/\.]+)\.html$ single_post.php?post_name=$1 [L]

This rule isn’t quite as simple.
First we want to match all URLs that start with ‘posts/’.
Then we want to capture the characters that come after ‘posts/’ but before ‘.html’ and use them as the GET variable for the single_post PHP script.

This will rewrite a page like posts/somepost.html to single_post.php?post_name=somepost.

The ‘()’ parentheses tell Apache to take whatever matches inside them and put it in a temporary location that can be accessed by the replacement string.
In this case the string ‘somepost’ is stored as $1. If you had multiple pairs of parentheses then $2, $3, and so on would be used as well.

The square brackets define a character class. [0-9] will match any digit between 0 and 9. [^0-9] will match any character that is not between 0 and 9.
So [^/\.]+ matches one or more characters that are not a slash or a dot.

After that, ‘\.html$’ matches the .html file extension at the end, so that the URL looks like a plain old HTML file.
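
The capture-and-substitute step can be tried out without touching Apache. The following is only a sketch in Python under the assumption that its re module behaves like Apache's regular expressions for this pattern; the function name rewrite_post is hypothetical.

```python
import re

# Same pattern as the RewriteRule above.
POST_RULE = re.compile(r"^posts/([^/\.]+)\.html$")

def rewrite_post(path):
    """Return the rewritten path, or None if the rule does not match."""
    match = POST_RULE.search(path)
    if match is None:
        return None
    # match.group(1) plays the role of $1 in the substitution.
    return "single_post.php?post_name=" + match.group(1)

print(rewrite_post("posts/somepost.html"))       # single_post.php?post_name=somepost
print(rewrite_post("posts/2009/somepost.html"))  # None: the name may not contain a slash
print(rewrite_post("posts/some.post.html"))      # None: the name may not contain a dot
```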

A general overview of the structure of a mod_rewrite RewriteRule:

RewriteRule Pattern Substitution [OptionalFlags]

Fairly simple, right? A RewriteRule is dissected as follows:

RewriteRule – This is just the name of the command for Apache.

Pattern – This is a regular expression which will be applied to the current URL.

Substitution – Substitution occurs in the same way as it does in Perl or PHP.

You can include backreferences and server variable names in the substitution. Backreferences to this RewriteRule should be written as $N, whereas backreferences to the previous RewriteCond should be written as %N.

A special substitution is -. This substitution tells Apache to not perform any substitution.

OptionalFlags – Any flags should be surrounded by square brackets and comma separated. The most useful are:

F – Forbidden. The user will receive a 403 error.
L – Last Rule. No more rules will be processed if this one matched.
R[=code] – Redirect. The user’s web browser will be visibly redirected to the substituted URL (with a 302 status unless you give another code).
With this flag it is safest to write the substitution as a full URL starting with http://www.site.com/

There are more flags but I haven’t had a use for them yet.

I did up a few other more complex rewrites for one of my dynamic sites.

RewriteRule ^([0-9]+)/([0-9]+)/?$ /index.php?num_posts=$1&start=$2  [L]

This one is pretty straightforward if you understood the previous example.
It looks for two numeric strings separated by a slash. It then takes those two values and passes them as the
$_GET['num_posts'] and $_GET['start'] variables. The ‘/?’ at the end allows for an optional trailing slash.
[0-9]+ means: match one or more characters that are between 0 and 9.
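
Here is the same rule sketched in Python, again assuming its re module is a fair stand-in for Apache's regular expressions here. The function name rewrite_page is made up for the example.

```python
import re

# Same pattern as the RewriteRule above: two runs of digits and an optional trailing slash.
PAGE_RULE = re.compile(r"^([0-9]+)/([0-9]+)/?$")

def rewrite_page(path):
    """Return the rewritten path, or None if the rule does not match."""
    match = PAGE_RULE.search(path)
    if match is None:
        return None
    num_posts, start = match.groups()  # the values Apache exposes as $1 and $2
    return "/index.php?num_posts={}&start={}".format(num_posts, start)

print(rewrite_page("10/11023/"))  # /index.php?num_posts=10&start=11023
print(rewrite_page("10/11023"))   # same result, the trailing slash is optional
print(rewrite_page("10/abc/"))    # None: 'abc' is not made of digits
```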

The main problem that I ran into was that after a rewrite none of the images and includes would work, because I’m lazy and use relative paths.
So when I type a URL like www.quick-content.com/10/11023/ I don’t get any images or style sheets, because the browser resolves relative paths against that URL and looks for images
at www.quick-content.com/10/11023/images/. I had to write two rules to make this work properly:

RewriteRule ^.*/?.+/images/(.+)$ /images/$1 [L]
RewriteRule ^.*/?.+/includes/(.+)$ /includes/$1 [L]

These rules check for one or more directories before the images or includes directory, and if they exist the URL is rewritten to either /images or /includes.
The ‘.*/?’ means 0 or more characters before an optional slash (the ‘?’ following the slash means that it is optional).
‘.+/images/’ and ‘.+/includes/’ match one or more characters before the images or includes directory.
The ‘.’ means any character, so ‘.*’ means 0 or more of any character, which is essentially any string.
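
A quick way to check what the images rule does with a given path is to run the same pattern in Python. This is a sketch only, assuming Python's re module matches this pattern the way Apache does; rewrite_image is a hypothetical helper name.

```python
import re

# Same pattern as the images RewriteRule above.
IMAGE_RULE = re.compile(r"^.*/?.+/images/(.+)$")

def rewrite_image(path):
    """Return the rewritten path, or None if the rule does not match."""
    match = IMAGE_RULE.search(path)
    if match is None:
        return None
    # Everything after the last '/images/' is kept, the fake directories are dropped.
    return "/images/" + match.group(1)

print(rewrite_image("10/11023/images/logo.png"))  # /images/logo.png
print(rewrite_image("images/logo.png"))           # None: already in the right place
```

The second call shows why the rule does not keep rewriting its own output: a path that already starts with images/ has nothing in front of the slash for ‘.+’ to match.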

Hopefully that wasn’t too confusing. There are a lot of great resources out there for learning regular expressions if you still don’t get it.
-Cody Taylor


Apr 12 2009

Setting up mod_rewrite on ubuntu example. Only the useful.

mod_rewrite is a way to change one url to another. It is very useful for making dynamic php sites better for search engine optimization (SEO).

Obviously you’re going to need apache installed. I’m using Apache 2.0 with php5. This is how I did it.

I had hoped that it was already installed and that I could just start writing regular expressions in an .htaccess file and have everything work. So I checked whether it was installed by calling the phpinfo function. I created a file called asdf.php which contained:

<?php phpinfo(); ?>

I loaded that page in my browser and searched it for ‘rewrite’. No go. So it’s either not installed or not enabled.
I checked the /etc/apache2 directory for any files that looked like mod_rewrite.
I found rewrite.load in the mods-available directory. There is also a mods-enabled directory, so I copied rewrite.load into it with this command (on Ubuntu, sudo a2enmod rewrite does the same job with a symlink):

cody@taylor:/etc/apache2$: sudo cp mods-available/rewrite.load mods-enabled/rewrite.load

I thought this may do it so I restarted apache2 with this command:

cody@taylor:/etc/apache2$: sudo /etc/init.d/apache2 restart

I created two text files to test whether it was working:

filename: test.html
This is a HTML File

filename: test.php
This is a PHP File

I then created a .htaccess file in my web root directory which contained this text:

RewriteEngine on
RewriteRule ^/?test\.html$ test.php

I went to http://localhost/test.html hoping to see ‘This is a PHP File’. It didn’t work, obviously.
I checked the sites-enabled directory and opened 000-default in vim.

It contained :

ServerAdmin webmaster@localhost
Options Indexes FollowSymLinks MultiViews
AllowOverride None
Order allow,deny
allow from all

ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
ErrorLog /var/log/apache2/error.log

# Possible values include: debug, info, notice, warn, error, crit,
# alert, emerg.
LogLevel warn

CustomLog /var/log/apache2/access.log combined

Alias /doc/ "/usr/share/doc/"
Options Indexes MultiViews FollowSymLinks
AllowOverride None
Order deny,allow
Deny from all
Allow from ::1/128

AllowOverride is what I was looking for, so I changed AllowOverride None to AllowOverride All for the web root and restarted Apache again. With None, Apache ignores .htaccess files completely.
This time when I checked http://localhost/test.html I got the PHP file, which is good.
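
If you would rather not read through the whole file in vim, a few lines of Python can list every AllowOverride setting for you. This is a minimal sketch: the function name find_allow_override is made up, and the sample text is just a fragment of the config shown above, not a real file on disk.

```python
import re

def find_allow_override(config_text):
    """Return (line_number, value) for every AllowOverride directive."""
    results = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip commented-out lines
        # Apache directive names are case-insensitive.
        match = re.match(r"AllowOverride\s+(.+)$", stripped, re.IGNORECASE)
        if match:
            results.append((number, match.group(1).strip()))
    return results

sample = """ServerAdmin webmaster@localhost
Options Indexes FollowSymLinks MultiViews
AllowOverride None
Order allow,deny
"""

print(find_allow_override(sample))  # [(3, 'None')]
```

To run it against a real config you would read the file first, for example with open("/etc/apache2/sites-enabled/000-default").read().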

I noticed I was getting this error when I restarted apache:

cody@taylor:/etc/apache2$: sudo /etc/init.d/apache2 restart
* Restarting web server apache2
apache2: Could not reliably determine the server's fully qualified domain name, using for ServerName
... waiting apache2: Could not reliably determine the server's fully qualified domain name, using for ServerName

I fixed this by adding this line to the apache2.conf file:

ServerName "codytaylor.org"

A restart fixed it all up.

-Cody Taylor


Apr 1 2009

Bridge Virtual Windows XP Installation to Linux so VirtualBox Can Access Your Network Shares. Only the useful instructions.

This quick tutorial shows how to bridge a Windows XP install running in VirtualBox on Linux to your local area network so you can stream video or just access your network shares.

I set up a Windows XP virtual machine in VirtualBox on my Ubuntu 8.04 desktop so I could stream H.264 high def content to my Xbox 360 using TVersity.
Because VirtualBox is by default set up on a different subnet, this would not work as easily as I would like. Here is only the useful information on how to set up a network bridge between Linux and VirtualBox.

Install VirtualBox on Ubuntu and set up Windows XP. For anyone following these instructions this should be simple. Google it and use apt-get.

These packages are required for the bridge.

sudo apt-get install uml-utilities bridge-utils

Now create a TAP interface. A TAP interface is a virtual Ethernet interface implemented in software.

sudo tunctl -t tap1 -u username

where username is obviously your username.

Create a bridge named br0

sudo brctl addbr br0

Put your physical network interface into promiscuous mode

sudo ifconfig eth0 promisc

Link your physical interface to the bridge br0 that we created

sudo brctl addif br0 eth0

Give the bridge br0 an IP address, either a static one (append the address you want to the ifconfig command) or one from DHCP

sudo ifconfig br0

or

sudo dhclient br0

Link the tap interface to your bridge br0

sudo brctl addif br0 tap1

You will now need to activate the tap interface using the ifconfig command

sudo ifconfig tap1 up

Alter the permissions of the tunnel device

sudo chmod 0666 /dev/net/tun

Now, assuming that you have VirtualBox installed and working (if you don’t, you need to play with Linux more; it’s really not that hard),
start VirtualBox but don’t start your Windows installation. Click on Settings, select Network, set ‘Attached to’ to “Host Interface” and enter tap1 as the Interface Name.

Boot up your virtual Windows XP installation and, in Network Connections, change the IP address of the Local Area Connection to an IP on the same subnet as your LAN.
You will have to set the proper gateway and possibly the DNS servers, although typically these can just be set to the same address as the gateway.

When you reboot Linux and want to use the Windows install on the same network again, you will have to repeat most of these steps. I use a shell script that runs them on boot.
Here is an example of one that you could use. Fill in your own info, of course.

#must run as root
tunctl -t tap1 -u codytaylor
brctl addbr br0
ifconfig eth0 promisc
brctl addif br0 eth0
sleep 10
dhclient br0
brctl addif br0 tap1
ifconfig tap1 up
chmod 0666 /dev/net/tun
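
If you want to fill in your own info without editing the script by hand, the command list can be generated. This is only a sketch: the function name bridge_commands and its parameters are made up for illustration, the defaults are the values used in the script above, and it only prints the commands rather than running them.

```python
def bridge_commands(user, nic="eth0", bridge="br0", tap="tap1", wait_seconds=10):
    """Build the bridge setup commands in the same order as the script above."""
    return [
        "tunctl -t {} -u {}".format(tap, user),   # create the TAP interface
        "brctl addbr {}".format(bridge),          # create the bridge
        "ifconfig {} promisc".format(nic),        # promiscuous mode on the real NIC
        "brctl addif {} {}".format(bridge, nic),  # attach the NIC to the bridge
        "sleep {}".format(wait_seconds),          # same pause as the original script
        "dhclient {}".format(bridge),             # get an address via DHCP
        "brctl addif {} {}".format(bridge, tap),  # attach the TAP interface
        "ifconfig {} up".format(tap),             # bring the TAP interface up
        "chmod 0666 /dev/net/tun",                # open up the tunnel device
    ]

# Print a script body for the user from the example; it still has to run as root.
print("\n".join(bridge_commands("codytaylor")))
```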

-Cody Taylor


Mar 26 2009

Quick htpasswd Creation on Linux. Only The Useful.

Quick htpasswd Protection Setup On Linux.

cody@taylor:~$  <--This is the prompt. You do not need to type this

In the console get into the directory that you wish to protect.

cody@taylor:~$ cd /var/www/protected

Get the full path to the directory if you don’t already know it.

cody@taylor:~$ pwd

Create the htaccess file and add the following lines.

cody@taylor:~$ vim .htaccess
AuthUserFile /var/www/protected/.htpasswd
AuthName "This Site"
AuthType Basic
Require valid-user
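
The only part of that file that changes from one directory to the next is the AuthUserFile path, so the contents are easy to generate. A minimal sketch follows; htaccess_for is a made-up helper name, and the realm text is simply the one used above.

```python
import posixpath

def htaccess_for(directory, realm="This Site"):
    """Return .htaccess contents that protect the given directory."""
    # AuthUserFile needs the full path to the password file.
    password_file = posixpath.join(directory, ".htpasswd")
    lines = [
        "AuthUserFile " + password_file,
        'AuthName "{}"'.format(realm),
        "AuthType Basic",
        "Require valid-user",
    ]
    return "\n".join(lines) + "\n"

print(htaccess_for("/var/www/protected"))
```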

Next use the htpasswd command, which should be available in most common distros.

cody@taylor:~$ htpasswd -c .htpasswd someguy
New password: 
Re-type new password:

Almost done. Now you need to set the permissions on the files we just created so the web server can read them. Neither file needs to be executable, so 644 is enough.

cody@taylor:~$ chmod 644 .htaccess .htpasswd

All good.