nemrod.se Various guides and experiments

Beautiful URLs with mod_rewrite and PHP


If you’re reading this, you’re probably already well aware of the benefits of good URLs: usability, accessibility and search engine optimization. In this guide I’m going to explain how I went from example.com/?p=4&s=12 to nemrod.se/chrome-extensions/pagerank-display using mod_rewrite and PHP, with MySQL in the background.

Necessary back-end changes

If your URLs contain the IDs of things instead of their titles, you’re going to have to make a lot more changes than just adding mod_rewrite rules. I don’t know how others have solved it, but the way I did it was by adding an extra field to my database called sanitized_title, sanitized_name etc., depending on what it is. I then wrote a quick function that sanitizes the wanted field when adding new posts, pages or whatever it may be, and stored the result in the database along with the rest of the data. This is what my function looks like:

function sanitize($str) {
	$str = strtolower($str);
	$str = strip_tags($str);

	$str = preg_replace('/[^a-z0-9]+/i', ' ', $str);

	$str = preg_replace('/\s+/i', ' ', $str);
	$str = trim($str);

	$str = str_replace(' ', '-', $str);
	return $str;
}

It first converts the whole string to lower-case and strips HTML tags from it. It then uses a regular expression replace to turn every run of characters that aren’t letters or numbers into a single space (if the string had “bla – bla bla”, the space, dash and space between the first two words become one space). The whitespace replace after that is a safety net that collapses any remaining whitespace into single spaces, and trim removes whitespace from the beginning and end of the string. Lastly it replaces the spaces with dashes before returning the string.

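If your language has got any additional characters, such as the Swedish å, ä and ö, you’re going to have to add them like this. Put the lines before the regular expression replace, since that step would otherwise have turned those characters into spaces already: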

	$str = str_replace('å', 'a', $str);
	$str = str_replace('ä', 'a', $str);
	$str = str_replace('ö', 'o', $str);
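As a rough sketch of where those replacements can sit, here is a hypothetical variant of the function (sanitize_with_translit is a name made up for this example, not part of my original code). It maps the Swedish letters before the regular expression strips everything outside a-z and 0-9, and it also maps the upper-case letters, on the assumption that strtolower() won’t lower-case multi-byte characters. The file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical variant of sanitize() with the character mapping built in.
function sanitize_with_translit($str) {
	$str = strip_tags($str);

	// Map language-specific letters first (both cases), before anything
	// outside a-z and 0-9 is thrown away by the regular expression below.
	$str = strtr($str, array(
		'å' => 'a', 'ä' => 'a', 'ö' => 'o',
		'Å' => 'a', 'Ä' => 'a', 'Ö' => 'o',
	));

	$str = strtolower($str);

	// Every run of other characters becomes one space, then one dash.
	$str = preg_replace('/[^a-z0-9]+/', ' ', $str);
	$str = trim($str);

	return str_replace(' ', '-', $str);
}

echo sanitize_with_translit('<b>Räksmörgås & Öl!</b>'), "\n"; // raksmorgas-ol
```

Using strtr() with an array keeps the whole mapping in one place, so adding more letters is a one-line change.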

Modify the administration interface to add the sanitized string to the database, like this for example:

$title = mysql_real_escape_string($_POST['title']);
$sanitized_title = sanitize($_POST['title']);
mysql_query('INSERT INTO post (title, sanitized_title) VALUES (\'' . $title . '\', \'' . $sanitized_title . '\')');
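The mysql_* functions used above were removed in PHP 7, so on a current server the same insert needs another extension. Below is a minimal sketch using PDO with a prepared statement, which also takes care of the escaping. To keep it self-contained it assumes an in-memory SQLite database standing in for MySQL (this needs the pdo_sqlite extension), and insert_post is a hypothetical helper name. In real code the second value would come from sanitize().

```php
<?php
// Assumption: in-memory SQLite stands in for the MySQL database here.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, sanitized_title TEXT)');

// Hypothetical helper: insert a post with bound parameters instead of
// building the SQL string by hand.
function insert_post(PDO $db, $title, $sanitized_title) {
	$stmt = $db->prepare(
		'INSERT INTO post (title, sanitized_title) VALUES (:title, :sanitized_title)'
	);
	$stmt->execute(array(
		':title' => $title,
		':sanitized_title' => $sanitized_title,
	));
	return (int) $db->lastInsertId();
}

// The apostrophe in the title needs no manual escaping.
$id = insert_post($db, "Nemrod's PageRank Display", 'nemrod-s-pagerank-display');
echo "Inserted post with id $id\n";
```

For MySQL only the connection string passed to the PDO constructor should need to change; the prepared statement stays the same.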

If you’ve already got a large database without the sanitized field I suggest writing a PHP script that simply loops through all the records in the database, calls the sanitize function on the title, and updates the post. Something like this:

$result = mysql_query('SELECT id, title FROM post');
while($row = mysql_fetch_array($result))
	// The sanitized title is a string, so it has to be quoted in the SQL
	mysql_query('UPDATE post SET sanitized_title=\'' . sanitize($row['title']) . '\' WHERE id=' . (int) $row['id']);

Now all you have to do is change all the occurrences of the ID in the code to the sanitized title. It’s usually pretty easy since the links are generated from the database: all you have to do is echo the sanitized_title column instead of the id one! After doing so your site should have URLs similar to example.com/index.php?p=chrome-extensions&s=pagerank-display. It’s not quite where we want to go, but the hardest part is over!
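To illustrate that intermediate step, here is a small sketch of a link helper. The name post_link and the p and s parameter names are assumptions based on the example URL, not code from my site.

```php
<?php
// Hypothetical helper: build the intermediate query-string URL from the
// sanitized page and post titles stored in the database.
function post_link($page_slug, $post_slug) {
	$query = http_build_query(
		array('p' => $page_slug, 's' => $post_slug),
		'',
		'&' // set the separator explicitly instead of relying on php.ini
	);
	return '/index.php?' . $query;
}

echo post_link('chrome-extensions', 'pagerank-display'), "\n";
// /index.php?p=chrome-extensions&s=pagerank-display
```

Keeping link generation in one function like this means only that function has to change once the pretty URLs are in place.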

mod_rewrite and .htaccess

Now that we’re using the sanitized names in the PHP code, the rest is a breeze. All we have to do is make the URLs look pretty for the users with mod_rewrite, modify the PHP code to match, and we’re done!

The way I used to do my mod_rewrite was using a prefix. In other words, I rewrote the URLs so they looked like example.com/prefix/chrome-extensions/pagerank-display. The reason was that I didn’t know how to keep other folders on the server reachable; without the prefix, requests for them would all be rewritten too! When I was given the task of rewriting the URLs on another site, that of the SEO (search engine optimisation) company I work at, I thought that there had to be a better way than using a prefix and went investigating. I remembered that WordPress had pretty URLs, and went looking in their .htaccess.

What I found was a pretty obvious solution, but something I hadn’t known could be done with .htaccess files before. What it does is check if the file or folder you’re trying to go to exists, and if not, it sends you to index.php. No more prefix yet it’s still possible to get to other files and folders on the server! :)

The code you put in .htaccess looks like this:

RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ index.php/$1 [L,QSA]

What the lines do:

  • Activates the rewrite engine
  • Sets the rewrite base, i.e. where we’re doing the rewriting
  • Checks if the requested URL is a real file
  • Checks if the requested URL is a real folder
  • If it’s neither, we attach the requested URL (the part after the domain name) to index.php, which then further processes the data. The L flag stops processing further rules and QSA keeps any original query string

Now that we’ve successfully sent the pretty URLs to index.php, we just have to process them and change the links to match!

The PHP processing

If you go to nemrod.se/chrome-extensions/pagerank-display and chrome-extensions/pagerank-display isn’t a real file or folder, the real URL after rewriting will be index.php/chrome-extensions/pagerank-display. The REQUEST_URI server variable still holds the path as the visitor requested it, so the way I process this in PHP is by exploding REQUEST_URI over / and then setting the appropriate variables to their values. For nemrod.se, for example, the first variable is the page, and the second is the name of a post. This is how my code looks:

$uri = explode('/', $_SERVER['REQUEST_URI']);
if(sizeof($uri) > 1 && !empty($uri[1]))
	$page = $uri[1];
if(sizeof($uri) > 2 && !empty($uri[2]))
	$post = $uri[2];

If I echo the $page and $post variables after going to the above URL, they’ll be chrome-extensions and pagerank-display respectively. The rest is just your ordinary menu generation and stuff, if(!isset($page)) set $page to the first page in the menu generating loop, and so on.
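One thing the explode approach doesn’t handle is a query string, which would end up inside the last segment. A minimal sketch of one way around that, assuming a hypothetical helper called parse_pretty_uri (not part of my original code), is to run the URI through parse_url() first:

```php
<?php
// Hypothetical helper: split a request URI into page and post segments.
function parse_pretty_uri($request_uri) {
	// Keep only the path so "?ref=feed" doesn't end up in the last segment.
	$path = parse_url($request_uri, PHP_URL_PATH);

	// Split on slashes and drop empty segments (leading or trailing "/").
	$segments = explode('/', trim((string) $path, '/'));
	$segments = array_values(array_filter($segments, 'strlen'));

	return array(
		'page' => isset($segments[0]) ? $segments[0] : null,
		'post' => isset($segments[1]) ? $segments[1] : null,
	);
}

$parts = parse_pretty_uri('/chrome-extensions/pagerank-display?ref=feed');
echo $parts['page'], ' / ', $parts['post'], "\n";
// chrome-extensions / pagerank-display
```

In index.php the argument would be $_SERVER['REQUEST_URI']; a missing segment comes back as null, which plays the same role as the isset() checks mentioned above.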

No error 404?!

Yes, that’s right. If the file doesn’t exist you’ll be sent to index.php to do all the processing, so the server never returns an error 404 on its own! To fix this I added some PHP code to make sure a 404 is returned when the requested page really doesn’t exist. I’m sure there are a lot of ways to do it, but in my case I just set a variable to true ($e404 = true;) and, if the page is found while generating the menu etc., I set it to false. After the check is done I have an if statement like this:

if($e404)
	header('HTTP/1.0 404 Not Found');

Where the content of the page is supposed to be displayed I also have an if, but instead of sending another header I simply inform the visitor that the page they tried to access doesn’t exist. This also makes for an error 404 page with easy access to the rest of your site, since it’ll be displayed right in the middle of it, with the menu accessible and everything. :)
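Put together, the flow could look roughly like the sketch below. The $known_pages list and the is_404 helper are assumptions made for this example (on my site the check happens inside the menu loop), and http_response_code() is used in place of the raw header() call.

```php
<?php
// Hypothetical list of page names; on a real site these come from the
// database while the menu is being generated.
$known_pages = array('guides', 'chrome-extensions');

// Returns true when a page was requested but isn't one we know about.
function is_404($page, array $known_pages) {
	return $page !== null && !in_array($page, $known_pages, true);
}

$page = 'no-such-page'; // would normally come from the URL
$e404 = is_404($page, $known_pages);

// The status has to be sent before any output is echoed.
if ($e404)
	http_response_code(404);

// Later, where the content goes, show a message instead of the page.
echo $e404 ? "Sorry, that page doesn't exist.\n" : "Showing page: $page\n";
```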

A tip:

If you’ve got URLs with a lot of traffic you might want to do a 301 redirect to your new, pretty, URLs. One of my most visited URLs back when I used prefixes was nemrod.se/n/chrome_extensions/pagerank_display_v1.9. So as to not lose all that traffic this (along with some other URLs) was what I added to my .htaccess:

RewriteRule ^n/chrome_extensions/pagerank_display_v1\.9/?$ http://nemrod.se/chrome-extensions/pagerank-display [R=301,L]

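There are more automated ways to do it. If you’re already using URLs like index.php?page=name&post=title you can use something like this, for example (untested). RewriteRule only matches the path, so the query string has to be matched with a RewriteCond, and the trailing ? drops the old query string from the redirect target. Put these above the rule that sends everything to index.php:

RewriteCond %{QUERY_STRING} ^page=([^&]+)&post=([^&]+)$
RewriteRule ^index\.php$ http://example.com/%1/%2? [R=301,L]
RewriteCond %{QUERY_STRING} ^page=([^&]+)$
RewriteRule ^index\.php$ http://example.com/%1? [R=301,L]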


8 Responses to “Beautiful URLs with mod_rewrite and PHP”

  1. shoulder bag says:

    Thanks for taking time for sharing this article, i enjoy reading this site,it was excellent and very informative. as a first time visitor to your blog I am very impressed. I found a lot of informative stuff in your article. Keep it up. Thank you.

  2. DampeS8N says:

    “$title = mysql_real_escape_string($_POST['title']);
    $sanitized_title = sanitize($_POST['title']);”

    This code contains a bug. One that could potentially lead to a security vulnerability. This is why parameter binding with MySQLi is the preferred method of securing MySQL in PHP. Because if you do it always, you can’t create bugs like the above.

  3. nemrod says:

    @DampeS8N: Very true, after escaping the title I for some reason used the original string in the example, thanks for pointing it out!

    Normally I would use a database abstraction layer to avoid any mistakes like this, which is probably a better idea than using MySQLi as well. :)
