The conception, birth, and first steps of an application named Charlie


Charlie has Cool URLs - Part 3

by Alister Jones (SomeNewKid)

Let’s presume that Charlie is serving a photographer’s website. The website comprises an extensive photo gallery, the photographer’s weblog, an about page, and a contact page. Let’s look at some of the URLs that might apply to the photo gallery.

www.example.com/photos

www.example.com/photos/landscapes
www.example.com/photos/weddings

www.example.com/photos/weddings/susan-and-mark
www.example.com/photos/weddings/susan-and-mark/kiss
www.example.com/photos/weddings/susan-and-mark/rings

All of these pages will be handled by Charlie’s PhotoGallery plugin. You and I can see that quite clearly, since the URLs are all children of the parent /photos URL. But how can Charlie know that all of these URLs are to be handled by Charlie’s PhotoGallery plugin? There are at least three ways.

The first way would be for Charlie’s database to keep a record of every single URL and the Plugin associated with that URL. That’s not as easy as it sounds, because many of the URLs might be generated on the fly. For example, if the visitor is looking at the photo of Susan’s ring, there may be a link to view a close-up of the photo. Clicking on the link might navigate to the following URL:

www.example.com/photos/weddings/susan-and-mark/rings/closeup

The PhotoGallery plugin may use lots of these on-the-fly URLs, and it would be impractical for Charlie to record every possible URL ahead of time. This is not really a viable approach.
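To make the problem concrete, here is a minimal sketch of the first approach. It is written in Python purely for illustration, and the table contents and the function name find_plugin_exact are my own assumptions, not Charlie’s actual code.

```python
# Hypothetical exact-match table: every URL must be recorded individually.
URL_TABLE = {
    "/photos": "PhotoGallery",
    "/photos/weddings": "PhotoGallery",
    "/photos/weddings/susan-and-mark": "PhotoGallery",
    "/photos/weddings/susan-and-mark/rings": "PhotoGallery",
}


def find_plugin_exact(path):
    """Return the plugin recorded for this exact path, or None."""
    return URL_TABLE.get(path)


# A URL that was recorded in advance is found.
recorded = find_plugin_exact("/photos/weddings/susan-and-mark/rings")

# An on-the-fly URL that nobody recorded is not.
on_the_fly = find_plugin_exact("/photos/weddings/susan-and-mark/rings/closeup")

print(recorded)    # PhotoGallery
print(on_the_fly)  # None
```

The close-up URL finds no plugin at all, even though you and I can see where it belongs.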

The second way for Charlie to associate URLs with Plugins is to use a wildcard system. For example, Charlie’s database could record the following single URL, and associate this record with the PhotoGallery plugin.

/photos/*

To state the obvious, any child of the /photos URL will be handled by the PhotoGallery plugin. This approach accommodates any of the on-the-fly URLs created by the PhotoGallery plugin, so this second approach is better than the first. However, it does carry a limitation. What if we don’t want every single child of /photos to be handled by the PhotoGallery plugin? We may want /photos/order to be handled by the Payment plugin, and /photos/copyright to be handled by the SimpleHtml plugin.

In terms of code, we could use regular expressions to perform our wildcard matches. By using regular expressions, the wildcard search can “exclude” certain URLs (such as /photos/order and /photos/copyright) from the wildcard test (such as /photos/*). However, that gets finicky: every new exception means editing the wildcard’s pattern, so the approach is neither flexible nor sustainable.
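Here is a rough sketch of what that might look like, again in Python for illustration only. The patterns and the function name find_plugin_regex are assumptions of mine. I have also assumed that each pattern must stand on its own rather than rely on the order of the rules, which is why the wildcard has to name its own exceptions.

```python
import re

# Hypothetical rules. Each pattern stands alone, so the /photos wildcard
# must itself exclude the two exceptions with a negative lookahead.
RULES = [
    (re.compile(r"^/photos(?!/(order|copyright)(/|$))(/.*)?$"), "PhotoGallery"),
    (re.compile(r"^/photos/order(/.*)?$"), "Payment"),
    (re.compile(r"^/photos/copyright(/.*)?$"), "SimpleHtml"),
]


def find_plugin_regex(path):
    """Return the plugin of the first rule whose pattern matches, or None."""
    for pattern, plugin in RULES:
        if pattern.match(path):
            return plugin
    return None


print(find_plugin_regex("/photos/weddings/susan-and-mark"))  # PhotoGallery
print(find_plugin_regex("/photos/order"))                    # Payment
print(find_plugin_regex("/photos/copyright"))                # SimpleHtml
```

It works, but look at that first pattern. Adding a third exception means editing it again, and a small slip in the lookahead silently sends requests to the wrong plugin.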

The third way for Charlie to associate URLs with Plugins is to use a fall-through system. First, Charlie maintains a record of parent URLs. These are not a complete list of actual URLs, as in the first option, and they are not wildcard URLs, as in the second option. They are simply parent URLs, each recorded with its associated Plugin. Like this:

/photos              - PhotoGallery plugin
/photos/order        - Payment plugin
/photos/copyright    - SimpleHtml plugin

Charlie looks at the requested URL, which may be the following:

/photos/weddings/susan-and-mark

Charlie checks whether this URL has a match in its list of parent URLs. It does not, so Charlie lops off the last part of the URL, to be left with this:

/photos/weddings

Charlie checks whether this URL has a match in its list of parent URLs. It does not, so Charlie now lops off more of the original URL, to be left with this:

/photos

Now a match will be found, and Charlie knows to hand this request off to its PhotoGallery plugin. Even more helpful is that anything that was lopped off the original URL is now considered a parameter. When the PhotoGallery plugin receives this request, it will be split like this:

document = /photos
parameter = /weddings/susan-and-mark

The PhotoGallery plugin can look at this parameter to know exactly what it needs to display.
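The lopping-off steps above can be sketched in a few lines. This is a Python illustration under my own assumptions: the names PARENT_URLS and resolve are hypothetical, and Charlie’s real implementation may differ.

```python
# Hypothetical record of parent URLs and their plugins.
PARENT_URLS = {
    "/photos": "PhotoGallery",
    "/photos/order": "Payment",
    "/photos/copyright": "SimpleHtml",
}


def resolve(path, parent_urls=PARENT_URLS):
    """Return (plugin, document, parameter), or None if nothing matches."""
    path = path.rstrip("/")
    document = path
    while document:
        plugin = parent_urls.get(document)
        if plugin is not None:
            # Whatever was lopped off becomes the parameter.
            return plugin, document, path[len(document):]
        # No match: lop off the last part of the URL and try again.
        document = document.rsplit("/", 1)[0]
    return None


print(resolve("/photos/weddings/susan-and-mark"))
# ('PhotoGallery', '/photos', '/weddings/susan-and-mark')
```

Because the check starts with the full URL and works backwards, the longest matching parent URL always wins. A URL with no recorded parent falls all the way through and returns None.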

Now let’s consider a different URL.

/photos/copyright

Charlie checks whether this URL has a match in its list of parent URLs. It does, and it is associated with the SimpleHtml plugin.

So this third approach keeps lopping off the end of the incoming URL until a parent-URL match is found. Because the check starts with the full URL, the most specific parent URL always wins. This is as flexible as the wildcard approach, yet it overcomes that approach’s limitation, which was how to exclude certain URLs from a wildcard match. This third approach would be slow if each test required a trip to the database. However, Charlie loads up a DocumentMap (like a site map) for each Domain it serves, and places this DocumentMap in its high-priority cache. These fall-through checks therefore incur little performance penalty, but provide great flexibility in associating URLs with Plugins.
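To show why the cache matters, here is a small Python sketch. Everything in it is an assumption made for illustration: the DocumentMapCache class, the load_from_database function, and the plain dictionary standing in for whatever cache Charlie actually uses.

```python
class DocumentMapCache:
    """Hypothetical per-domain cache. The loader stands in for a database query."""

    def __init__(self, loader):
        self._loader = loader
        self._maps = {}

    def get(self, domain):
        # Load the map on first use only; later requests reuse it.
        if domain not in self._maps:
            self._maps[domain] = self._loader(domain)
        return self._maps[domain]


db_trips = []  # records each simulated trip to the database


def load_from_database(domain):
    db_trips.append(domain)
    return {
        "/photos": "PhotoGallery",
        "/photos/order": "Payment",
        "/photos/copyright": "SimpleHtml",
    }


cache = DocumentMapCache(load_from_database)


def find_plugin(domain, path):
    """Fall through the cached map for this domain until a parent URL matches."""
    document_map = cache.get(domain)
    document = path.rstrip("/")
    while document:
        if document in document_map:
            return document_map[document]
        document = document.rsplit("/", 1)[0]
    return None


first = find_plugin("www.example.com", "/photos/weddings/susan-and-mark")
second = find_plugin("www.example.com", "/photos/copyright")
print(first, second, len(db_trips))  # PhotoGallery SimpleHtml 1
```

Two requests and several fall-through checks cost only one simulated trip to the database. Every check after the first load is an in-memory lookup.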

Need I state that it is the third approach that has been adopted for Charlie?

