The conception, birth, and first steps of an application named Charlie

The WebHandler Object

by Alister Jones (SomeNewKid)

In the previous weblog entry, I discussed how Charlie used static file handlers to serve existing files. For example, the CssHandler would pick up an existing .css file, maybe update its content, and then return the file.

The next step was to create dynamic file handlers. Whereas the static file handlers are used when there is an existing file to return, the dynamic file handlers are used when there is no existing file to return. Rather, the response needs to be entirely generated from scratch.

This was a relatively simple exercise, since I had a single existing dynamic handler; namely, the PageEngine that used Presenters. I took the existing PageEngine and Presenter objects, and reworked them so that they could present documents in more than just HTML. I will admit that I did a shit job with my earlier attempts to describe the process by which a page is built in Charlie. Let me try again, this time using the new WebHandler object that replaced the previous PageEngine object.

Let’s consider how Charlie handles the following two requests.

/about
/about.rss.xml

The first thing to note is that the same document is being requested (the About page), but the first request is for the document to be in HTML format, while the second request is for the document to be in RSS format.

The first thing Charlie does is hand these requests off to its WebContextBuilder object. The WebContextBuilder looks at the details of the request, and figures out which Document business object is being requested. In both cases, it is the About Document. At this stage, Charlie is unconcerned with the format requested; it is only concerned with which Document is being requested. The requested Document is then loaded up (pulling information from the database or cache) and attached to the current WebContext object.

The second thing that Charlie does is figure out what format has been requested. In these examples, the first request is for HTML format, and the second request is for RSS format. So at this point, Charlie knows which Document business object to present, and in which format to present the document.
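
To make that concrete, here is a rough sketch of how the document path and the format might be pulled apart. The ParseRequest helper below is purely illustrative (Charlie’s real WebContextBuilder does much more than this); only the two example URLs and the “default to HTML” behaviour come from the description above.

using System;

public class RequestFormatExample
{
    // Hypothetical helper: split an incoming path such as "/about.rss.xml"
    // into the document path ("/about") and the requested format ("rss").
    // A path with no format extension ("/about") defaults to "html".
    public static void ParseRequest(String path,
                                    out String documentPath,
                                    out String format)
    {
        documentPath = path;
        format = "html";

        // look for a trailing ".rss.xml", ".atom.xml", and so on
        if (path.EndsWith(".xml"))
        {
            String trimmed = path.Substring(0, path.Length - 4);
            Int32 dot = trimmed.LastIndexOf('.');
            if (dot > -1)
            {
                format = trimmed.Substring(dot + 1);       // "rss" or "atom"
                documentPath = trimmed.Substring(0, dot);  // "/about"
            }
        }
    }
}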

The next step is for Charlie to introduce a WebHandler object. The task of this WebHandler is to take the Document business object, and present that Document in the HTML or RSS or other format requested. The WebHandler starts with a blank page that derives from System.Web.UI.Page. In the illustration below, the hand represents the Handler. (That should be easy to remember.)

The first thing the WebHandler does is say to the Document business object, “What template do you want to use?” The Document business object will return the name of the template that the website’s developer has specified. The template will be a simple name such as ‘Silver’ or ‘Playful’. The WebHandler will then add an extension that represents the format in which the document is to be presented. So in the first case .html will be added, while in the second case .rss will be added. The WebHandler then adds the .ascx extension, since templates in Charlie are based on ASP.NET User Controls. The WebHandler loads up the template (such as silver.rss.ascx), and applies it to the blank page.
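
As a rough sketch, the template-loading step might look something like the code below. The LoadTemplate method, the templates folder, and the lowercase file name are illustrative assumptions on my part; only the naming convention (template name, plus format, plus .ascx) comes from the description above.

using System;
using System.Web.UI;

public class TemplateLoadingExample : Page
{
    // Hypothetical sketch: compose "Silver" + "rss" + ".ascx" into
    // "~/templates/silver.rss.ascx", load it, and add it to the blank page.
    protected Control LoadTemplate(String templateName, String format)
    {
        String virtualPath = String.Format(
            "~/templates/{0}.{1}.ascx",
            templateName.ToLower(), format.ToLower());

        // LoadControl is the standard way to load a User Control at runtime
        Control template = this.LoadControl(virtualPath);
        this.Controls.Add(template);
        return template;
    }
}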

Within that template may be any number of Placeholder controls that will define where the content is to be placed. For example, a template for an HTML webpage might include Placeholders for the header, content, and footer regions.

At this point, the page is half-constructed, and has slots where the content can go (the Placeholders). The WebHandler then goes back to the Document business object and grabs its collection of Containers. Each Container holds a business object that needs to be presented. For example, one container may hold an Article business object. Another container may hold a Photo business object. But at this point, all the WebHandler does is grab the Containers from the Document business object.

What the WebHandler does next is go through each Container and “tell” it what document format is being presented. So the first Container, which is holding an Article to be presented, knows whether the article is to be presented as HTML or as RSS or as some other format. With that knowledge, the Container says to its Presenter, “Here’s the Article to present, and we need to present it in HTML format.”

The WebHandler then says to the Presenter, “Give me an ASP.NET server control.” If you refer back to the weblog entry on the Presenter object, you’ll see that Charlie calls these server controls Views. So in the example being discussed, the Presenter of the first Container will return an ArticleHtmlView, with the Article business object attached. The second Container is presenting a photo, so its Presenter will return a PhotoHtmlView, with the Photo business object attached.

What this means is that the WebHandler will pull out a single server control, called a View, from each Container.

The final step for the WebHandler is to drop these View server controls onto the templated page. Each Container tells the WebHandler which Placeholder to use, and the position (first, second, third, or later) within that Placeholder.
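
Pulling those last few steps together, here is a bare-bones sketch of the assembly loop. The IPresenter and IContainer interfaces below (and their GetView, PlaceholderId, and Position members) are hypothetical stand-ins for Charlie’s real objects; only the overall flow comes from the description above.

using System;
using System.Collections;
using System.Web.UI;
using System.Web.UI.WebControls;

// hypothetical stand-ins for Charlie's real Presenter and Container members
public interface IPresenter
{
    Control GetView(String format);   // returns an ArticleHtmlView, PhotoRssView, etc.
}

public interface IContainer
{
    IPresenter Presenter { get; }
    String PlaceholderId { get; }     // which Placeholder the View belongs in
    Int32 Position { get; }           // first, second, third, or later
}

public class ViewAssemblyExample
{
    // one View per Container, dropped into the Placeholder (and position)
    // that the Container asks for
    public static void AssembleViews(Control template, IList containers, String format)
    {
        foreach (IContainer container in containers)
        {
            Control view = container.Presenter.GetView(format);
            PlaceHolder placeholder =
                (PlaceHolder)template.FindControl(container.PlaceholderId);
            placeholder.Controls.AddAt(container.Position, view);
        }
    }
}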

The WebHandler has now created an ASP.NET Page complete with a Template and a bunch of arranged View server controls. The normal processing of an ASP.NET Page then occurs, and each View server control renders out the appropriate markup. An ArticleHtmlView will render an Article as an HTML element, while an ArticleRssView will render the same Article as an RSS item.

Sigh. I think this weblog entry is just as shit as the earlier entry. Oh well, since I am not being paid to write this weblog, I cannot justify rewriting this entry.

The end of this little story is that by introducing a new WebHandler object, and making the existing Presenter objects accept requests for all manner of different document formats, Charlie is now able to present its documents in multiple formats. What is notable—if I may say so myself—is that this flexibility is now built into Charlie’s architecture. If a client wants his or her pages exposed as Atom feeds, there is no need to write “fudge” code. Rather, by introducing a new ArticleAtomView and a PhotoAtomView, the rest of Charlie remains unchanged (including all existing security and localization).

I have now updated the sample site to present its documents as RSS and Atom feeds. I don’t yet know whether the XML generated is valid, because this is just a proof of concept. But as a proof of concept, I feel that it illustrates Charlie’s flexibility.

by Alister Jones | Next up: The Valley of Data Access - Part 1

0 comments

----

Static File Handlers

by Alister Jones (SomeNewKid)

In the second of the four weblog entries concerning Charlie’s cool URLs, I noted that Charlie uses a number of custom file handlers. The examples provided were a CSS Handler, a JavaScript Handler, a GIF Handler, and a JPEG Handler.

If you know a little bit about HTTP responses, you will know that there are three common response types for file requests. If the file is found and returned, it is a 200 OK response. If the file could not be found, it is a 404 Not Found response. If the file is found, but it has not been modified since the browser last requested it, it is a 304 Not Modified response. So that I did not have to duplicate this logic in each of my separate file handlers, I started by creating a base StaticHandler class. Here is the code.

using System;
using System.IO;
using System.Web;
using System.Security;

namespace Charlie.Framework.Interface.Handlers
{
   public abstract class StaticHandler : IHttpHandler
   {
   
      // Members required by IHttpHandler interface
   
      public void ProcessRequest(HttpContext context)
      {
         FileInfo fileInfo = new FileInfo(context.Request.PhysicalPath);
         if (IsFileValid(fileInfo) == false)
         {
            ReturnNotFoundResponse();
         }
         else if (IsFileModified(fileInfo, context) == false)
         {
            ReturnNotModifiedResponse(context);
         }
         else
         {
            ReturnFileResponse(fileInfo, context);
         }
      }
      
      public Boolean IsReusable
      {
         get
         {
            return true;
         }
      }
      
      // File Tests
      
      private Boolean IsFileModified(FileInfo fileInfo, HttpContext context)
      {
         Boolean isModified = true;
         String modifiedSince = context.Request.Headers["If-Modified-Since"];
         if (modifiedSince != null)
         {
            try
            {
               // compare the file's last write time (UTC) against the UTC
               // value of the browser's If-Modified-Since header
               DateTime lastModified = fileInfo.LastWriteTimeUtc;
               DateTime lastReceived = Convert.ToDateTime(modifiedSince).ToUniversalTime();
               if (lastModified == lastReceived)
               {
                  isModified = false;
               }
            }
            catch
            {
            }
         }
         return isModified;
      }

      private Boolean IsFileValid(FileInfo fileInfo)
      {
         if (File.Exists(fileInfo.FullName))
         {
            return true;
         }
         return false;
      }      
      
      // Responses

      private void ReturnNotFoundResponse()
      {
         throw new HttpException(404, "File not found.");
      }

      private void ReturnNotModifiedResponse(HttpContext context)
      {
         context.Response.StatusCode = 304;
         context.Response.SuppressContent = true;
      }
      
      private void ReturnFileResponse(FileInfo fileInfo, HttpContext context)
      {
         try
         {
            HttpResponse response = context.Response;
            this.ReturnFile(fileInfo, response);
            response.Cache.SetLastModified(fileInfo.LastWriteTime);
            response.Cache.SetCacheability(this.Cacheability);
            response.Cache.SetExpires(this.CacheExpiry);
         }
         catch (SecurityException)
         {
            throw new HttpException(401, "Access to file denied.");
         }
         catch
         {
            // it is more secure not to signify the type of error
            throw new HttpException(404, "Unable to serve file.");
         }
      }
      
      // Default Cache Values

      protected virtual HttpCacheability Cacheability
      {
         get
         {
            return HttpCacheability.Public;
         }
      }

      protected virtual DateTime CacheExpiry
      {
         get
         {
            return DateTime.Now.AddDays(1);
         }
      }
      
      // Abstract Member
      
      protected abstract void ReturnFile
            (FileInfo fileInfo, HttpResponse response);
   }
}

The code is very simple, and parts of it come from Milan Negovan’s Adding Variables To Style Sheets article. There is a single abstract ReturnFile method which a concrete handler class must implement. Because the base class forces a concrete handler to implement just that one method, the handlers themselves can be very simple. Let’s see just how simple.

Here is the code for Charlie’s CSS Handler.

using System;
using System.IO;
using System.Web;

namespace Charlie.Framework.Interface.Handlers
{
   public class CssHandler : StaticHandler
   {
      protected override void ReturnFile
            (FileInfo fileInfo, HttpResponse response)
      {
         StreamReader reader = fileInfo.OpenText();
         String css = reader.ReadToEnd();
         reader.Close();
         response.Write(css);
         response.ContentType = "text/css";
      }
   }
}

Can’t get much simpler than that, can you?

Here is the code for Charlie’s JavaScript Handler.

using System;
using System.IO;
using System.Web;

namespace Charlie.Framework.Interface.Handlers
{
   public class JavaScriptHandler : StaticHandler
   {
      protected override void ReturnFile
            (FileInfo fileInfo, HttpResponse response)
      {
         StreamReader reader = fileInfo.OpenText();
         String script = reader.ReadToEnd();
         reader.Close();
         response.Write(script);
         response.ContentType = "text/javascript";
      }
   }
}

And here is the code for Charlie’s GIF Handler:

using System;
using System.IO;
using System.Web;
using System.Drawing;
using System.Drawing.Imaging;

namespace Charlie.Framework.Interface.Handlers
{
   public class GifHandler : StaticHandler
   {
      protected override void ReturnFile
            (FileInfo fileInfo, HttpResponse response)
      {
         Bitmap bitmap = new Bitmap(fileInfo.FullName);
         bitmap.Save(response.OutputStream, ImageFormat.Gif);
         bitmap.Dispose();
         response.ContentType = "image/gif";
      }
   }
}

The advanced developers in the audience will be thinking to themselves, “Well that’s fine, Alister, but the built-in StaticFileHandler would have done the same for you.” That’s true, but only because I have not yet updated the handlers to do anything special. But as a test, I updated the JPEG Handler to stamp a copyright notice on any returned JPEG image. Here is the code.

using System;
using System.IO;
using System.Web;
using System.Drawing;
using System.Drawing.Imaging;

namespace Charlie.Framework.Interface.Handlers
{
   public class JpegHandler : StaticHandler
   {
      protected override void ReturnFile
            (FileInfo fileInfo, HttpResponse response)
      {
         Bitmap bitmap = new Bitmap(fileInfo.FullName);

         // add copyright notice
         Font font = new Font("Verdana", 10);
         Brush brush = new SolidBrush(Color.Black);
         String copyright = "Copyright";
         Graphics canvas = Graphics.FromImage(bitmap);
         canvas.DrawString(copyright, font, brush, 0, 0);

         bitmap.Save(response.OutputStream, ImageFormat.Jpeg);

         // release the GDI+ resources
         canvas.Dispose();
         brush.Dispose();
         font.Dispose();
         bitmap.Dispose();

         response.ContentType = "image/jpeg";
      }
   }
}

As you will see, the concrete handler classes are about as simple as you could want. Yet, they allow Charlie to have full control over every file used in a website. For example, tokens can be placed within a CSS style sheet (such as “[[CorporateColour]]”) and then used in a simple find-and-replace operation before the CSS file is returned. Another example was provided above, where a copyright notice is stamped on the JPEG image returned.
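
As an illustration of the token idea, a token-replacing CSS handler might look something like the sketch below. This is not Charlie’s actual handler: the TokenReplacingCssHandler name and the GetCorporateColour helper are made up, and a real implementation would pull the value from the website’s configuration rather than hard-coding it.

using System;
using System.IO;
using System.Web;

namespace Charlie.Framework.Interface.Handlers
{
   public class TokenReplacingCssHandler : StaticHandler
   {
      protected override void ReturnFile
            (FileInfo fileInfo, HttpResponse response)
      {
         StreamReader reader = fileInfo.OpenText();
         String css = reader.ReadToEnd();
         reader.Close();

         // replace any tokens before returning the style sheet
         css = css.Replace("[[CorporateColour]]", GetCorporateColour());

         response.Write(css);
         response.ContentType = "text/css";
      }

      private String GetCorporateColour()
      {
         // hypothetical: in reality this value would come from the
         // website's configuration or database
         return "#336699";
      }
   }
}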

The only other thing to show is the code needed in the Web.config file that points the incoming file requests to the appropriate handler. To minimise the width of the code, I have replaced the real namespace with the word “Namespace”.

<system.web>
   <httpHandlers>
      <add verb="*" path="*.asxx" 
           type="Namespace.ViewHandler, Charlie.Framework"/>
      <add verb="GET" path="*.js" 
           type="Namespace.JavaScriptHandler, Charlie.Framework"/>
      <add verb="GET" path="*.css" 
           type="Namespace.CssHandler, Charlie.Framework"/>
      <add verb="GET" path="*.gif" 
           type="Namespace.GifHandler, Charlie.Framework"/>
      <add verb="GET" path="*.jpg" 
           type="Namespace.JpegHandler, Charlie.Framework"/>
      <add verb="GET" path="*.jpeg" 
           type="Namespace.JpegHandler, Charlie.Framework"/>
   </httpHandlers>
</system.web>

If you visit the little sample site that I have put online, keep in mind that Charlie is serving every single file used in that site—not just the two pages that are currently viewable. And if you want to do the same on your own website, you now have the code.

by Alister Jones | Next up: The WebHandler Object

2 comments

----

Charlie’s Birth

by Alister Jones (SomeNewKid)

Charlie has been born. Like any new-born baby, he is far from fully developed. Still, he is now out there in the world, ready to grow, ready to learn, ready to make friends.

You may see him at www.edition3.net.

Right now there’s not too much to see, and what there is to see is very basic. The HTML and CSS came from my first foray into CSS-based design, some two years ago. The content comes from Wikipedia. In other words, what you will see there is nonsense content, because I am still working on beneath-the-skin stuff.

So have a play with the baby, and let me know if he poops or cries.

by Alister Jones | Next up: Static File Handlers

2 comments

----

Think Big

by Alister Jones (SomeNewKid)

I made a mistake by thinking small. But I learned a lesson that I would like to share.

I was looking at the feature set for Charlie, and I came to the following item:

  • allow for different document types (webpage, PDF, Word, RSS, etc.)

It occurred to me that I should get this flexibility in place. Otherwise, I would flesh out Charlie’s ability to serve webpages, and then have to retrofit the flexibility to serve other types of responses. And if there is one thing I have learned so far, it is to get flexibility in place as early as possible.

The only article I have ever read on serving different response types is One Site, Many Faces. It is a great article, wherein the author details how you can have one HttpHandler for each response type that your website is to serve. So, you’d have one TextHandler, one RssHandler, and so on. But the article does have a notable limitation. That limitation is perfectly acceptable, since it kept the article to a reasonable length, but it would have manifested itself in Charlie. The limitation is that each handler knows exactly which business object it is to present; namely, an Article. The Article business object looks roughly like this:

public class Article
{
    public String Title;
    public String Body;
    public String Writer;
}

Because each handler knows exactly what business object it is to serve, the code to do so is very simple. Here is a snippet of how the TextHandler might look:

public class TextHandler
{
    public void PrepareResponse(Article article)
    {
        String response;
        response  = article.Title + NewLine;
        response += article.Writer + NewLine;
        response += article.Body;
        SendResponse(response);
    }
}

And here is a snippet of how the RssHandler might look:

public class RssHandler
{
    public void PrepareResponse(Article article)
    {
        String response;
        response  = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>";
        response += "<rss>";
        response += "   <title>" + article.Title + "</title>";
        response += "   <author>" + article.Writer + "</author>";
        response += "   <description>" + article.Body + "</description>";
        response += "</rss>";
        SendResponse(response);
    }
}

So, by working with a single, known business object (an Article), each of the separate Handlers is kept simple, and the website is able to serve the same article in many different ways.

The problem I faced was how to overcome this limitation of a single, known business object. To illustrate the problem, we can consider just two separate business objects, an Article and a Weblog.

public class Article            public class Weblog
{                               {
    public String Title;            public String Subject;
    public String Body;             public String Entry;
    public String Writer;           public Int32  BloggerId;
}                               }

What you will see is that the two separate business objects do not share a single property. So, the above TextHandler and RssHandler could not work with a Weblog business object. In theory I could have introduced a new set of handlers for every new business object, so that I’d end up with an ArticleTextHandler and ArticleRssHandler, and a WeblogTextHandler and WeblogRssHandler. That would work, but that’s not flexibility, that’s complexity. Instead, it would be more flexible if I introduced an interface that defined what sort of object could be presented by the TextHandler, and what sort of object could be presented by the RssHandler. Looking just at the RssHandler, here is how that interface might look.

public interface IRssItem
{
    String Title { get; }
    String Author { get; }
    String Description { get; }
}

With that interface in place, we can update the RssHandler to look like this:

public class RssHandler
{
    public void PrepareResponse(IRssItem item)
    {
        String response;
        response  = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>";
        response += "<rss>";
        response += "   <title>" + item.Title + "</title>";
        response += "   <author>" + item.Author + "</author>";
        response += "   <description>"+item.Description+"</description>";
        response += "</rss>";
        SendResponse(response);
    }
}

This interface means that the RssHandler can now work with any business object that implements the IRssItem interface. By using interfaces this way, I could give Charlie one TextHandler, one RssHandler, one AtomHandler, and so on. And these handlers could work with any business object, so long as that business object implemented the required interface. This was the sort of flexibility I was looking for.

But here is where I made a mistake in my approach to this problem. By looking at and thinking about the code I have just shown you, my thoughts were down at the “small” details. Namely, I started thinking about how I could update the business objects so that they each presented the same IRssItem interface. Here again are the business objects in question:

public class Article            public class Weblog
{                               {
    public String Title;            public String Subject;
    public String Body;             public String Entry;
    public String Writer;           public Int32  BloggerId;
}                               }

The straightforward approach would be to implement that interface directly in each business object:

public class Article            public class Weblog
    : IRssItem                      : IRssItem 
{                               {
    public String Title;            public String Subject;
    public String Body;             public String Entry;
    public String Writer;           public Int32  BloggerId;
    
    IRssItem.Title                  IRssItem.Title
      { return this.Title }           { return this.Subject }
    IRssItem.Author                 IRssItem.Author
      { return this.Writer }          { return GetName(this.BloggerId) }
    IRssItem.Description            IRssItem.Description
      { return this.Body }            { return this.Entry }
}                               }

I only had to think about that for a few seconds before I realized that that was not an option at all. Each time I introduced a new handler and interface, I’d have to go in and change every business object that wanted to be presented by that handler. That too is not flexibility, that’s complexity.

I will admit that I banged about for nearly four hours trying to think of different solutions. The reason that I struggled for so long was that I was thinking about the small details. It was only when I went for a drive (I’ve crashed my motorbike) that I started thinking big, and a solution occurred to me.

Because I was driving and could not write down the code for a business object, I visualised each business object as a box. And because the Article business object and the Weblog business object have different properties, they expose different interfaces.

I then pictured the RssHandler as being a box, which has a defined IRssItem interface for any business object that it is to display.

The problem to be solved then was how can vastly different business objects, each exposing a different interface, be made acceptable to the RssHandler which requires each business object to expose the defined IRssItem Interface? The following picture formed in my mind:

Once that picture formed in my mind, the solution became obvious. I needed an adapter from the Adapter Pattern.

By using an adapter, I never needed to change the business objects in any way whatsoever. Even better, a given business object could be made to plug into any different type of handler. Putting the Weblog business object to one side, let’s see how the Article business object can be made to work with any and every handler, including its own specialised ArticleHandler, again without having to change the business object in any way.

So, using the code examples above, what would the ArticleRssAdapter look like?

public class ArticleRssAdapter : IRssItem
{
    public ArticleRssAdapter(Article article)
    {
        this.article = article;
    }
    private Article article;
    
    String IRssItem.Title
    {
        get 
        { 
            return article.Title; 
        }
    }

    String IRssItem.Author
    {
        get 
        {
            return article.Writer; 
        }
    }

    String IRssItem.Description
    {
        get 
        { 
            return article.Body; 
        }
    }
}

You will see that this code is extremely simple. But even though it is simple, the use of adapters means that any business object can be made to work with any handler object. This is infinitely flexible.
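
To round out the example, here is a hypothetical usage snippet. The GetArticle call is a stand-in for however the Article is actually fetched; the rest uses the classes sketched above.

// fetch an Article from wherever it lives (GetArticle is a stand-in)
Article article = GetArticle();

// wrap the Article in its adapter and hand it to the generic handler
RssHandler handler = new RssHandler();
handler.PrepareResponse(new ArticleRssAdapter(article));

// the same Article could feed other handlers via other adapters,
// for example: textHandler.PrepareResponse(new ArticleTextAdapter(article));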

The end solution is so simple that I really should have seen it. But, I was thinking small, and looking at the code. I was not thinking big, and looking at the objects. So that’s the lesson for today kids: think big.

by Alister Jones | Next up: Charlie’s Birth

0 comments

----

Story Time

by Alister Jones (SomeNewKid)

I have been told by an experienced developer that my writing sucks and my explanations are incomprehensible. Judging by the lack of responses I receive to my weblog entries, it would seem that she is right. I enjoy maintaining this weblog, but I don’t want to think that I’m just babbling to myself. I need to find a writing style that is more interesting and more accessible.

To that end, I would like to point you to a reply that I have just now submitted to the ASP.NET Forums, where I use the moniker ‘SomeNewKid’. Is its story style any better? I don’t know how successful the story style would be in describing the life of Charlie, but I’m willing to give it a go.

Please, let me know how I can make the life of Charlie more interesting. If you’d rather tell me privately, please feel free to contact me. But do please tell me.

by Alister Jones | Next up: Think Big

4 comments

----

Charlie has Cool URLs - Part 4

by Alister Jones (SomeNewKid)

I lied to you earlier. It was a white lie, with no harm intended. But it was a lie nonetheless, and for that I’m sorry. If I explain, will you forgive me?

In my first and second look at Charlie’s URLs, I said that Charlie uses a sleight of hand to add the .aspx extension to any incoming URL that has no extension, and that the Web.config file has been updated so that any request with an .aspx extension gets handled by the PageBuilder class. This process is indeed what happens, but it is not an .aspx extension that Charlie uses to route page requests to its PageBuilder class. Rather, it is a custom .asxx extension, because I wanted to leave the built-in .aspx extension untouched. Let’s see why.

Toward the end of my second weblog entry on cool URLs, we saw the following diagram.

Consider that a website is presenting a weblog entry, and within that weblog entry is a photo of Pamela Anderson before she was attacked by a bicycle pump. The URL for that photo may be the following:

/weblog/2006/apr/14/a-rare-photo-indeed.jpg

If you look at the diagram above, you will see that the request will be handled by Charlie’s own JPEG Handler. The JPEG Handler will just pick up the .jpg file and return it, which is all we want it to do.

Consider now a website that is presenting a photographer’s photo gallery, and within that gallery is a photo of the exquisite Michelle Pfeiffer. The URL for that photo may be the following:

/photos/celebrities/michelle-pfeiffer/batman-premiere.jpg

If you recall that Charlie uses a fall-through approach of matching URLs with plugins, you will see that the above URL may be associated with the PhotoGallery plugin. And here is where it gets interesting.

Most plugins do not care about requests for GIF images and JPEG photos, so they will not want to handle such requests. These plugins will simply let the request go through to Charlie’s own designated handler (see the diagram above).

Some plugins, however, do care about requests for GIF images and JPEG photos. These plugins will want to handle such requests themselves, rather than allowing the request to go through to Charlie’s own default handlers. The PhotoGallery plugin, for example, may wish to present the .jpg image not by itself, but on a page with an attractive border, a title, and a copyright notice. The PhotoGallery plugin may also wish to record how many times a particular photo has been viewed. Put simply, the PhotoGallery plugin should be able to do whatever it likes with requests for photos.

So, how can we update Charlie so that a plugin can take control of a request if it wants control, but otherwise allow a request to flow through to the default handlers? If we look again at the diagram above, we see an ideal point at which we can do this.

At the point indicated, all authentication and authorization checks have been performed, but the handler has not yet been chosen. This is the point at which Charlie says to the Plugin associated with the request, “Hey buddy, if you want to handle this request yourself, tell me now, otherwise I’ll handle it myself.” The way Charlie does this is by giving the associated Plugin the current WebContext object, which allows the Plugin to do whatever it likes to the HttpContext, HttpRequest, and HttpResponse objects it holds. The PhotoGallery plugin can therefore “catch” any JPEG requests, and re-route them to its own handler. Here is a simple example:

public void FilterWebContext(WebContext webContext)
{
    if (webContext.WebRequest.Address.Extension == "jpg")
    {
        String handler = "~/plugins/photogallery/jpgHandler.aspx";
        webContext.HttpContext.RewritePath(handler);
    }
}

There’s quite a bit wrong with the above code example, but it gives the basic idea of how Charlie allows its own plugins to update the above diagram in the following way.

This explains why I wanted to leave the .aspx extension untouched. If a plugin wishes, it can redirect an incoming request to one of its own .aspx pages, in which case ASP.NET’s own PageHandlerFactory takes control of loading up the .aspx page and any code-behind file. Charlie’s own PageBuilder class is associated only with the custom .asxx extension that Charlie adds to any request that does not otherwise have an extension.

What was particularly nice here is that updating Charlie to allow plugins to “take control” of an incoming request was a further example of emergence. The existing Web System and Plugin System presented the fertile soil from which this new feature could be grown. No part of Charlie’s architecture or design had to be changed or fudged in any way. The solution emerged naturally from its existing architecture and existing design. I am not patting myself on the back here. I am sharing the lesson that I am learning again and again: a little bit of object-oriented design goes a long, long way.

This weblog entry concludes the look at Charlie’s cool URLs, and how those URLs tie in with Charlie and its plugins.

by Alister Jones | Next up: Story Time

2 comments

----

Charlie has Cool URLs - Part 3

by Alister Jones (SomeNewKid)

Let’s presume that Charlie is serving a photographer’s website. The website comprises an extensive photo gallery, the photographer’s weblog, an about page, and a contact page. Let’s look at some of the URLs that might apply to the photo gallery.

www.example.com/photos

www.example.com/photos/landscapes
www.example.com/photos/weddings

www.example.com/photos/weddings/susan-and-mark
www.example.com/photos/weddings/susan-and-mark/kiss
www.example.com/photos/weddings/susan-and-mark/rings

All of these pages will be handled by Charlie’s PhotoGallery plugin. You and I can see that quite clearly, since the URLs are all children of the parent /photos URL. But how can Charlie know that all of these URLs are to be handled by Charlie’s PhotoGallery plugin? There are at least three ways.

The first way would be for Charlie’s database to keep a record of every single URL and the Plugin associated with the URL. That’s not as easy as it sounds, because many of the URLs might be generated on-the-fly. For example, if the visitor is looking at the photo of Susan’s ring, there may be a link to view a closeup of the photo. Clicking on the link might navigate to the following URL:

www.example.com/photos/weddings/susan-and-mark/rings/closeup

The PhotoGallery plugin may use lots of these on-the-fly URLs, and it would be extremely difficult if Charlie had to record every single possible URL. This is not really a viable approach.

The second way for Charlie to associate URLs with Plugins is to use a wildcard system. For example, Charlie’s database could record the following single URL, and associate this record with the PhotoGallery plugin.

/photos/*

To state the obvious, any child of the /photos URL will be handled by the PhotoGallery plugin. This approach accommodates any of the on-the-fly URLs created by the PhotoGallery plugin, so this second approach is better than the first. However, it does carry a limitation. What if we don't want every single child of /photos to be handled by the PhotoGallery plugin? We may want /photos/order to be handled by the Payment plugin, and /photos/copyright to be handled by the SimpleHtml plugin.

In terms of code, we could use a system of regular expressions in order to perform our wildcard matches. By using regular expressions, the wildcard search can “exclude” certain URLs (such as /photos/order and /photos/copyright) from the wildcard test (such as /photos/*). However, that is getting very finicky, and is not a very flexible or sustainable approach.

The third way for Charlie to associate URLs with Plugins is to use a fall-through system. First, Charlie maintains a record of parent URLs. These are not necessarily actual URLs like in the first option, and are not wildcard URLs like in the second option, but are a record of parent URLs and the associated Plugin. Like this:

/photos              - PhotoGallery plugin
/photos/order        - Payment plugin
/photos/copyright    - SimpleHtml plugin

Charlie looks at the requested URL, which may be the following:

/photos/weddings/susan-and-mark

Charlie checks whether this URL has a match in its list of parent URLs. It does not, so Charlie lops off the last part of the URL, to be left with this:

/photos/weddings

Charlie checks whether this URL has a match in its list of parent URLs. It does not, so Charlie now lops off more of the original URL, to be left with this:

/photos

Now a match will be found, and Charlie knows to hand this request off to its PhotoGallery plugin. Even more helpful is that anything that was lopped off the original URL is now considered a parameter. When the PhotoGallery plugin receives this request, it will be split like this:

document = /photos
parameter = /weddings/susan-and-mark

The PhotoGallery plugin can look at this parameter to know exactly what it needs to display.

Now let’s consider a different URL.

/photos/copyright

Charlie checks whether this URL has a match in its list of parent URLs. It does, and it is associated with the SimpleHtml plugin.

So this third approach keeps lopping off the ends of the incoming URL until a parent-URL match is found. This is just as flexible as the second, wildcard approach, but overcomes the limitation of that second approach, which was how to exclude certain URLs from its wildcard matches. This third approach would be awkward if each test required a trip to the database. However, Charlie loads up a DocumentMap (like a site map) for each Domain it serves, and places this DocumentMap in its high-priority cache. These fall-through checks therefore incur little performance penalty, but provide great flexibility in associating URLs with Plugins.
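
For the curious, here is a bare-bones sketch of that fall-through lookup. The Hashtable below is just a stand-in for Charlie’s real DocumentMap, which holds much more than a plugin name; only the lop-and-retry logic comes from the description above.

using System;
using System.Collections;

public class FallThroughExample
{
    // map of parent URLs to plugin names, e.g. "/photos" -> "PhotoGallery"
    private Hashtable documentMap = new Hashtable();

    public void Match(String requestedUrl,
                      out String document, out String parameter, out String plugin)
    {
        String candidate = requestedUrl;
        parameter = String.Empty;

        // keep lopping the last segment off until a parent URL matches
        while (candidate.Length > 0 && !documentMap.ContainsKey(candidate))
        {
            Int32 slash = candidate.LastIndexOf('/');
            parameter = candidate.Substring(slash) + parameter;
            candidate = candidate.Substring(0, slash);
        }

        document = candidate;                       // e.g. "/photos"
        parameter = parameter;                      // e.g. "/weddings/susan-and-mark"
        plugin = (String)documentMap[candidate];    // e.g. "PhotoGallery"
    }
}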

Need I state that it is the third approach that has been adopted for Charlie?

by Alister Jones | Next up: Charlie has Cool URLs - Part 4

0 comments

----

Charlie has Cool URLs - Part 2

by Alister Jones (SomeNewKid)

In my previous weblog entry, I said that it has been a four-step process by which I’ve given Charlie cool URLs. The first step was to implement URL rewriting, and was the topic of the previous entry. The second step was to remove any extensions from URLs, and that’s what I’ll talk about here.

Let’s look at one of Apple’s cool URLs:

www.apple.com/ipod

What is significant about this URL is that it contains no extension. The page name is just ipod, and not ipod.htm or ipod.aspx. Internet Information Services isn’t too keen on such URLs. If IIS receives a request for ipod.asp, it passes the request off to ASP for processing. If IIS receives a request for ipod.aspx, it passes the request off to ASP.NET for processing. If IIS receives a request for ipod.gif, it handles the request itself by simply returning the requested GIF image. But, by default, IIS does not know what to do with requests that have no extension. (In reality, IIS “tests” certain default documents, but that’s a bit beside the point here.)

But there is a trick to get IIS to pass to ASP.NET any request for a file with no extension. The trick is to use what is known as a wildcard mapping (.*) in IIS that points to ASP.NET. This wildcard mapping says to IIS, “Whatever request you get, no matter what its extension, or even if it has no extension, pass that request on to ASP.NET.”

So with this wildcard mapping, cool URLs with no extension can be used in an ASP.NET website. But there is a catch. The wildcard mapping has told IIS to pass all requests to ASP.NET. That means the ASP.NET application will handle requests for style.css, script.js, image.gif, photo.jpg, and so on. For the most part, this presents no problem. ASP.NET will use its StaticFileHandler to simply pick up the requested file and send it back, which is precisely what IIS would have done anyway. But even requests for image.gif and photo.jpg will be subject to the same processing as for page.aspx, including authentication and authorization checks. Moreover, if your application is logging requests, you’ll be logging every request for every little file.

If you’re aware of this catch, then it also presents an opportunity. The ASP.NET application can now adjust those image or text files before they are returned, or can create them from scratch. If your application presents a photographer’s photos, the application can stamp a copyright notice on every returned JPEG file. This is an opportunity that Charlie seizes, but I’ll come back to it in a moment.

There is another gotcha about using URLs with no extension. Previously, it was IIS that didn’t know what to do with a request for a file named ipod with no extension. We solved this by giving IIS a wildcard mapping that tells it to pass all requests on to ASP.NET. But this just passes the same problem on to the ASP.NET application. What should it do with a file named simply ipod?

Rather than let ASP.NET deal with this problem, Charlie will add the .aspx extension to any incoming request that does not have an extension. So, while the visitor sees /ipod in his or her browser, Charlie changes this to be /ipod.aspx before full processing of the request occurs. In the previous weblog entry, I described how a new line in the Web.config file told ASP.NET that all requests for an .aspx page should go to the PageBuilder class. So that is the process by which the user sees cool URLs with no extensions, but ASP.NET sees the .aspx extensions that it knows and loves. Charlie just performs a little sleight of hand, adding the .aspx extension when ASP.NET is not looking, and thereby gets a cool URL such as /ipod to be handled by its PageBuilder class.
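
A minimal sketch of that sleight of hand is shown below, written as a standalone HttpModule purely for illustration. Charlie’s real plumbing lives inside its own request pipeline, and the AspxExtensionModule name is made up; only the “no extension means add .aspx” rule comes from the description above.

using System;
using System.IO;
using System.Web;

public class AspxExtensionModule : IHttpModule
{
    public void Init(HttpApplication application)
    {
        application.BeginRequest += new EventHandler(OnBeginRequest);
    }

    private void OnBeginRequest(Object sender, EventArgs e)
    {
        HttpContext context = ((HttpApplication)sender).Context;
        String path = context.Request.Path;

        // if the cool URL has no extension, quietly add .aspx so that the
        // rest of the pipeline sees an extension it knows and loves
        if (Path.GetExtension(path).Length == 0)
        {
            context.RewritePath(path + ".aspx", String.Empty,
                                context.Request.QueryString.ToString());
        }
    }

    public void Dispose()
    {
    }
}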

Getting back to the opportunity that presents itself with a wildcard mapping in IIS, Charlie must handle all requests for all file types. To save having to write a thousand words, I’ll draw a diagram that shows what happens, by default, to five separate requests to the Charlie application. Because IIS has been instructed to pass all requests through to ASP.NET, Charlie will need to handle the requests for page.aspx, style.css, script.js, image.gif, and photo.jpg. By default, the last four will be handled by ASP.NET’s built-in StaticFileHandler.

But Charlie would like to seize the opportunity that presents itself. Namely, Charlie can implement its own way of handling requests for CSS Stylesheets, JavaScript files, GIF images, and JPEG photos. That way, Charlie can elect what to do. Charlie can just pick up the requested file and return it, just as the StaticFileHandler would do. Or, Charlie can pick up the requested file and adjust it (such as adding a copyright notice) before returning it. Or, Charlie can dynamically create the file requested (such as creating a GIF graph). ASP.NET makes this easy. Here are the instructions we write in the Web.config file, followed by a diagram showing the result.

<httpHandlers>
    <add verb="*" path="*.aspx" type="PageHandler, AssemblyName"/>
    <add verb="*" path="*.css"  type="CssHandler,  AssemblyName"/>
    <add verb="*" path="*.js"   type="JSHandler,   AssemblyName"/>
    <add verb="*" path="*.gif"  type="GifHandler,  AssemblyName"/>
    <add verb="*" path="*.jpg"  type="JpegHandler, AssemblyName"/>
</httpHandlers>

As I write this, I have not yet created the extra handlers. However, my little alisterjones.com website has these extra handlers, so it will be a simple matter. I’ll write about the handlers when I add them to Charlie.

So there you have a description of how Charlie’s URLs are so cool they don't even have extensions in them, and how Charlie is able to handle all requests for all files for all websites it serves.

by Alister Jones | Next up: Charlie has Cool URLs - Part 3

1 comments

----

Charlie has Cool URLs - Part 1

by Alister Jones (SomeNewKid)

An ongoing joke in Beavis and Butthead is that, to the boys, everything is either cool or it sucks. There is no in-between. It is cool. Or it sucks.

I have the same black-and-white approach to URLs. A given URL is either cool, or it sucks. Apple has cool URLs:

www.apple.com/ipod

The URLs for Creative suck:

www.creative.com/products/product.asp?category=213&subcategory=214&product=11519

Charlie has cool URLs, if I do say so myself. It has been a four-step process to give Charlie cool URLs, and I’ll dedicate one weblog entry to each step.

The first step to giving Charlie cool URLs was to implement URL rewriting. This is the process by which the incoming cool URL (such as /ipod) is rewritten to the “real” file that handles the request (which may be /products.aspx?id=10). A common technique is described in URL Rewriting in ASP.NET, whereby all incoming page requests are rewritten to one of a handful of actual .aspx pages. The code looks like this:

String coolUrl = HttpContext.Current.Request.Path;
String targetUrl = UrlRewriter.GetHandler(coolUrl);
HttpContext.Current.RewritePath(targetUrl);

In such a URL rewriting system, the developer maintains a handful of .aspx pages, such as ShowProduct.aspx, ShowCategory.aspx, and ShowPerson.aspx. The details of what to show are passed in via QueryString variables. So, the above targetUrl string might look something like this:

ShowProduct.aspx?category=16&product=128&view=details

For most web applications, the process of shepherding all incoming requests to one of a handful of .aspx pages is all that is needed. But Charlie has a special requirement that means this common approach isn’t quite right for my project. The special requirement is that Charlie must host multiple websites from a single installation, and each website must be able to generate pages that have nothing at all in common with pages from another website. For this requirement to be met, the pages must be completely dynamic in their creation. With an .aspx page, some of it will be static (whatever is already on the .aspx page) and some of it will be dynamic. So having a handful of .aspx pages would not meet this requirement of Charlie. So, what are the alternatives? Well, there are at least two.

The first option available for a web application to dynamically create the entire page is to have a single .aspx page with nothing other than a Page directive that points to a page-building code-behind file:

<%@ Page language="C#" AutoEventWireup="false" Inherits="PageBuilder" %>

The page has nothing on it, so its entire content must be generated dynamically. If we place this .aspx in the root folder of the web application, the above URL rewriting code would be adjusted to look something like the following:

String coolUrl = HttpContext.Current.Request.Path;
String querystring = UrlRewriter.GetQueryString(coolUrl);
HttpContext.Current.RewritePath("~/default.aspx?" + querystring);

The QueryString might look something like this:

page=36&template=blue&category=16&product=128&view=details

So, whereas the technique from the article means the developer maintains a handful of separate .aspx pages, this first alternative means the developer has just a single .aspx that, frankly, does nothing—the work is done by the page-building code-behind.

The second option available for a web application to dynamically create the entire page is a variation of the first option just described. Whereas the first option used a single .aspx and a single code-behind, the second option does away with the .aspx page altogether. The .aspx page does nothing except point to the code-behind, yet there’s another way we can point to the code-behind. Namely, we can use the Web.config file to direct all .aspx page requests to that code-behind. Here is the relevant section of the Web.config file:

<httpHandlers>
    <add verb="*" 
         path="*.aspx" 
         type="PageBuilder, AssemblyName"/>
</httpHandlers>

What this means is that any request with an .aspx extension will be handled by the PageBuilder class. There is no need for any .aspx page to exist, which means that the above URL rewriting code can be modified like this:

String coolUrl = HttpContext.Current.Request.Path;
String targetUrl = UrlRewriter.GetHandler(coolUrl);
HttpContext.Current.RewritePath(targetUrl);

Yes, this is the same code as that shown first. However, previously the targetUrl had to point to an existing .aspx page. This time, the targetUrl does not need to point to any real .aspx page, and could look like this:

show-a-porsche-with-a-black-background.aspx?otherwise=ferrari

Using a virtually-empty .aspx file and using a Web.config entry both do the same thing: point to the code-behind that dynamically creates the entire page. While I could have used either option, I went with the second option of using the Web.config to point to the code-behind. First, I didn’t fancy having a do-nothing .aspx page. Second, and for reasons I’ll come to in my next weblog entry, Charlie would end up needing quite a number of these do-nothing pages. The Web.config option makes the do-nothing pages unnecessary, so that is the approach I took.

by Alister Jones | Next up: Charlie has Cool URLs - Part 2

0 comments

----

The Two Devils of SQL

by Alister Jones (SomeNewKid)

Charlie and I have a real love-hate relationship with SQL Server. Charlie loves how fast and how flexible it is. I hate how hard it is to use. I have been using ASP.NET for about three years now, and in that time I have tried to install innumerable free and commercial ASP.NET applications. I have succeeded only about ten percent of the time, with the problem always being the database. This triggered in me a deep dislike of SQL Server.

When I first started thinking about creating a website framework, I was very tempted to use XML files as the data store. I like and understand XML, whereas I did not like and did not understand SQL Server. Fortunately, one company and one person came to my aid, and made SQL Server a viable data store for Charlie.

The company that came to my aid was Microsoft. Last year Microsoft deemed me worthy of an MVP award, and with that award came a subscription to the MSDN Network. I was able to download SQL Server 2000 and, with its Enterprise Manager, finally have a user interface to the database. The interface sucks, but at least I finally had one.

The person that came to my aid was Terri Morton. Poor Terri had to endure a hundred questions—and as many expletives—while I fumbled about with installing, configuring, and finally using SQL Server. Fortunately, Terri is both smarter and more patient than I am, and stuck with me until I finally had the SQL beast under control. Thank you, Terri.

So Charlie was undertaken with SQL Server as its data store. While I can get SQL Server to do what I need, I still dislike working with the database. The easy stuff, such as writing CRUD methods, is so boring and so repetitive that it begs to be automated. The hard stuff, such as schema design and stored procedures, is so difficult for me that I know I’ll never do a good job with it. With the hard stuff, I have taken a pragmatic approach that would surely please any seasoned architect: I’m just not going to worry about it. If I can get Charlie working with a dirt-simple database design and no stored procedures, then that is good enough. If I ever get the darn thing finished, I’ll engage a database person to come in and rework the database for Charlie version 2.0. Until that time, if my dirt-simple database works, then that’s good enough.

It is the easy stuff that concerns me. Writing CRUD methods is so boring and so repetitive that I have given serious consideration to using the WilsonORMapper. This leads me to a choice between two devils. The devil I know is hand-coded CRUD methods. The devil I don’t know is Object-Relational Mapping.

Taking the advice of the aardvark—that simple is better than complicated—I have decided to stick with the devil I know, and just hand-code all CRUD methods. Sure it’s boring, sure it’s repetitive, but it is also simple and flexible. I have considered refactoring the database code, so that some of the repetitive code can be moved to either a base class or a utility class. However, I have decided that I will just keep the code simple, if repetitive. If my yet-to-be-engaged database specialist wants to introduce changes, that will be his or her prerogative.

I have however decided to give a friendly wink to the devil I don’t know. One of the nice features of Paul Wilson’s ORMapper is that it supports an IObjectHelper interface. Without this interface, object-relational mapping is performed by reflection, which is a relatively slow process. With this interface, the object-relational mapping is performed through a known indexer:

public interface IObjectHelper
{
    Object this[String memberName] { get; set; }
}

Here is a very simple business object that implements this IObjectHelper interface:

public class Person : IObjectHelper
{
    public Int32 ID
    {
        get
        {
            return this.id;
        }
    }
    private Int32 id;
    
    public String Name
    {
        get
        {
            return this.name;
        }
        set
        {
            this.name = value;
        }
    }
    private String name;
    
    public Object this[String memberName] 
    {
        get 
        {
            switch (memberName) 
            {
                case "id": return this.id;
                case "name": return this.name;
                default: throw new ArgumentException
                         ("Invalid Member", memberName);
            }
        }
        set
        {
            switch (memberName) 
            {
                case "id": this.id = (Int32)value; break;
                case "name": this.name = (String)value; break;
                default: throw new ArgumentException
                         ("Invalid Member", memberName);
            }
        }
    }
}

There are two great features here. First, the IObjectHelper interface is optional. If a business object does not implement this interface, then Paul’s WilsonORMapper will simply use reflection to populate the business object. If a business object does implement this interface, then the WilsonORMapper will use it to avoid the costs of reflection. You can read more about this in Paul’s weblog entry on O/R Mappers: Avoiding Reflection. The second great feature is that the interface provides a single point at which the Persistence layer interacts with the Business layer. This is of more benefit than is immediately apparent. To see why, have a look at a property of Charlie’s Article entity:

public String Title
{
    get
    {
        return this.title;
    }
    set
    {
        if (this.title != value)
        {
            this.title = value;
            MarkDirty();
        }
    }
}
private String title = String.Empty;

For reasons of performance and user experience, Charlie will not commit a business object to the database unless that object has actually changed (in which case its data is considered “dirty”). By using reflection, we could alter the private title field, so the object would not be marked as dirty. But reflection comes with the penalty of performance and complexity. Without reflection, our Data Access code must work through the public Title property, so the object will be marked as dirty. But, when the object has been freshly retrieved from the database, it is not dirty, so this would be an erroneous dirty flag.

The current solution in Charlie is to have the EntityManager “reset” the dirty flag on a freshly-retrieved object.

entity = this.Mapper.Retrieve(entity, criteria);
entity.MarkAfterLoad();
return entity;

This approach is a little clumsy, but it works just fine. And after all, a solution that works is a working solution. Even though this works, I have decided to replace this solution with the IObjectHelper interface. This way, my hand-coded CRUD methods have a single point of working with Charlie’s business objects. Later, if I switch to using the WilsonORMapper, the business objects will not need to change at all. That seems to be a good compromise between the devil I know and the devil I don’t.
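
To show what that single point looks like in practice, the Title member above could be exposed through the IObjectHelper indexer as sketched below. This is an illustration only, not Charlie’s actual Article class; the point is that the indexer writes straight to the private field, so a fresh load from the database never trips the dirty flag.

public Object this[String memberName]
{
    get
    {
        switch (memberName)
        {
            case "title": return this.title;
            default: throw new ArgumentException
                     ("Invalid Member", memberName);
        }
    }
    set
    {
        switch (memberName)
        {
            // write to the private field directly, so that a fresh
            // load from the database does not mark the object dirty
            case "title": this.title = (String)value; break;
            default: throw new ArgumentException
                     ("Invalid Member", memberName);
        }
    }
}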

by Alister Jones | Next up: Charlie has Cool URLs - Part 1

4 comments

----

Charlie’s Alarm Clock

by Alister Jones (SomeNewKid)

Readers, if you have not already subscribed to WilsonDotNet.com, go and do it now. For just $50, you get so much useful stuff. Even if you don’t care to look at the code, the components themselves are worth the price. But the real value comes from looking at the code of those components. You are nearly guaranteed to learn a few things that will save you many hours on your own projects. If you place any value on your own time, you will appreciate that joining Paul Wilson’s website is simply the smart thing to do—the subscription will pay for itself many times over.

Sure, the above is a sales pitch. But this weblog is an honest account of everything I have learned while designing Charlie, and the benefit of subscribing to Paul’s website is something I am learning over and over again.

I have just now added Paul’s KeepAlive code to Charlie. The WebApplication class has had the following static constructor introduced:

static WebApplication()
{
    Timer.Instance.Elapsed += new EventHandler(KeepAlive);
}

Both the Timer class on the left and the KeepAlive handler on the right are lifted straight out of Paul’s WebPortal project. Every fifteen minutes, the Timer elapses and Charlie requests one of its own pages. This page request is all that is needed to stop ASP.NET from unloading the application. So this code effectively introduces an alarm clock that will wake Charlie up before it has the chance to fall asleep.
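
For readers who have not seen Paul’s code, the idea is roughly the following. This sketch is my own paraphrase, not Paul’s actual implementation, and the URL is obviously made up.

using System;
using System.Net;

public class KeepAliveExample
{
    // called each time the timer elapses; requesting any page of the
    // site is enough to stop ASP.NET from unloading the application
    private static void KeepAlive(Object sender, EventArgs e)
    {
        try
        {
            WebRequest request = WebRequest.Create("http://www.example.com/keepalive");
            WebResponse response = request.GetResponse();
            response.Close();   // the response itself is irrelevant; the request is the alarm clock
        }
        catch
        {
            // a failed request is not fatal; try again next time
        }
    }
}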

Because the Charlie application is kept awake, this means that the Cache is not unloaded. The Cache too was enhanced with an idea or two from the Wilson WebPortal.

So in the last two days, I have added three separate ideas from the Wilson WebPortal. Because all of Paul’s components are provided with source code, I was able to open Paul’s code in one window, open Charlie’s code in another window, and virtually copy the code. The code was not literally copied, since I tweaked Paul’s code to work with Charlie’s design. But having a source-provided component to work from meant that it took me only about an hour to accelerate Charlie’s performance. It would have taken me a few hours to first come up with the same ideas, and then many more hours to research the ideas and write the code. That is my subscription paid for, right there. And that is not yet taking into account the WilsonORMapper, which I will talk about in my next weblog entry.

In these days of spam and affiliate sponsors and viral marketing and so on, it is very hard to make a strong recommendation without it coming across as insincere. Yet the simple truth is that I have no vested interest in Paul’s websites, but Charlie has benefitted so much from my subscription to his website that I must give credit where it is due. I have benefitted from my subscription, Charlie has benefitted, and you will too.

Thank you, Paul Wilson.

by Alister Jones | Next up: The Two Devils of SQL

1 comments

----

The EntityCache Object

by Alister Jones (SomeNewKid)

When I first designed the Entity System for Charlie, I gave the EntityManager properties for IsCacheable, CacheDependency, CacheItemPriority, CacheSlidingExpiration, and so on. With these properties, each concrete Manager object can specify how caching should be applied to any Entity or EntityCollection objects it returns. For example, the RoleManager can specify whether the Role objects it returns should be cached and, if so, how they should be cached.

To start with, I set the IsCacheable property to false for every Manager. By effectively disabling the cache, I could concentrate on the functionality of the business objects, leaving caching as a final optimisation step. As I explained in my earlier weblog entry, I brought forward the plan to implement caching of the business objects. I left all of the caching properties in the base EntityManager class, but moved the actual caching work to a new EntityCache class. That way, I can change the caching mechanism without affecting the EntityManager object, which doesn’t really care how caching is implemented.

Let’s take a quick look at how caching works in the context of the Entity System. Specifically, it is the job of the base EntityManager object to test whether the requested Entity already exists in cache. Here is a cut-down version of the relevant code:

public abstract class EntityManager
{
    protected Entity LoadEntity(EntityCriteria criteria)
    {
        String cacheKey = null;  // to be discussed
        Object cached = this.GetEntityFromCache(cacheKey);

        Entity entity = null;

        if (cached != null && cached is Entity)
        {
            // Return a clone of the cached entity, not the cached original.
            Object clone = ((Entity)cached).Clone();
            entity = (Entity)clone;
        }
        else
        {
            // Not in cache, so retrieve from the database and cache a copy.
            entity = this.Mapper.Retrieve(entity, criteria);
            this.AddEntityToCache(cacheKey, entity);
        }
        return entity;
    }
}

To see why the cached Entity is cloned before being returned to the calling class, please refer to my previous weblog entry.

The very first problem to be solved is this: what key value can we use to determine whether the cache contains the Entity being requested? It is not enough that we simply use the ID value of the Entity being requested. After all, an ArticleEntity with an ID value of 32 is not the same as a WeblogEntity with an ID value of 32. Further, it is not enough to use a key representing the entity type and the ID value (such as “Charlie.Articles.Article.32”), since localisation means that the cached entity may be the English version, when the incoming request is for the French version. This suggests that we could use a key that is a composite of the entity type, the culture, and the ID (such as “Charlie.Articles.Article.32.en-US”). The problem here is that the concrete ArticleManager may require more fine-grained caching, such as caching one version of the Article together with comments by visitors, and another version of the Article without comments by visitors. But the base EntityManager class cannot possibly know of all the caching variations required by the concrete manager classes. So, how can the base EntityManager class implement caching, when it cannot know of the variations required by the concrete manager classes?

Fortunately, this was a very easy problem to solve. We need only look closely at the LoadEntity method of the base EntityManager class to see that there is something that uniquely describes the Entity being requested, and that that something properly reflects all of the caching variations needed:

public abstract class EntityManager
{
    protected Entity LoadEntity(EntityCriteria criteria)
    {
        String cacheKey = criteria.ToString();
        // rest of the class
    }
}

To understand how the criteria reflects all of the caching variations needed, here is a pretend ForumCriteria class:

public class ForumCriteria
{
    public Int32   ID;
    public Boolean LoadById;
    
    public String  Name;
    public Boolean LoadByName;

    public Culture Culture;
    public Boolean LoadByCulture;

    public Boolean LoadReplies;

    public Boolean LoadAvatars;
    
    public override String ToString()
    {
        return
            "Charlie.Forums.Forum" +
            ID + LoadById +
            Name + LoadByName +
            Culture + LoadByCulture +
            LoadReplies +
            LoadAvatars;
    }
}

When the base EntityManager class calls criteria.ToString(), it will end up with a unique cache key, such as:

Charlie.Forums.Forum.32.True..False.en-US.True.True.False

This means that the base EntityManager will always cache entities in a way that precisely reflects how that entity was requested by the incoming criteria. Problem solved. We could actually use reflection instead of forcing the developer to implement a custom ToString() method. However, in the comments in my code, I have said the following:

//  Yeah yeah, we could use reflection to inspect this criteria class.
//  But, what's the point of using Cache for speed if we use slow old
//  reflection to get the cache key?
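For what it is worth, the reflection-based alternative that the comment dismisses would look something like the sketch below. It walks the public fields of the criteria object to build the key, which works, but every single cache lookup would then pay the cost of reflection.

// A sketch of the rejected reflection-based alternative, for comparison only.
// (It assumes System.Reflection and System.Text are imported.)
public static String BuildCacheKey(Object criteria)
{
    StringBuilder key = new StringBuilder(criteria.GetType().FullName);

    foreach (FieldInfo field in criteria.GetType().GetFields())
    {
        key.Append(".");
        key.Append(field.GetValue(criteria));
    }
    return key.ToString();
}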

The second problem to be solved was getting the EntityCache class to return lazy-loaded entity objects, and not fully-loaded entity objects. My last two weblog entries described the problems I faced here, and the solution provided by cloning entities.

The final problem to be solved was whether to use the intrinsic ASP.NET cache, or implement a custom caching mechanism. Initially the plan was to use the intrinsic cache, which is why the EntityManager class provides the following properties (simplified as fields):

public abstract class EntityManager
{
    protected Boolean                  IsCacheable;
    protected CacheDependency          CacheDependency;
    protected CacheItemPriority        CacheItemPriority;
    protected CacheItemRemovedCallback CacheItemRemovedCallback;
    protected DateTime                 CacheAbsoluteExpiration;
    protected TimeSpan                 CacheSlidingExpiration;
}

This allows each concrete manager (such as the RoleManager) to precisely describe whether the entity handled by this manager (the Role entity) should be cached and, if so, how it should be cached. If you consider that the CacheDependency property can be a SqlCacheDependency, you will appreciate just how flexible the intrinsic ASP.NET cache is.
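To illustrate how a concrete manager might describe its own caching, here is a hypothetical RoleManager constructor. The values are placeholders rather than Charlie’s real settings; the point is only that each manager sets these inherited members once, and the base class does the rest.

// A hypothetical example only; the values below are placeholders,
// not Charlie's real settings for the RoleManager.
public class RoleManager : EntityManager
{
    public RoleManager()
    {
        this.IsCacheable            = true;
        this.CacheItemPriority      = System.Web.Caching.CacheItemPriority.High;
        this.CacheSlidingExpiration = TimeSpan.FromMinutes(15);
    }
}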

I then considered using a custom cache, based on that used by Paul Wilson in his WebPortal project. But I eventually decided against this approach. To start with, my recent foray into caching proved that I don’t have a great understanding of reference types, and it also showed that I don’t have a solid understanding of static classes. More importantly, I felt that the flexibility of the intrinsic ASP.NET cache was too beneficial to give up. I decided that Charlie would use the built-in cache, but with two enhancements.

The first enhancement was to use the idea provided by the Wilson WebPortal, even though I would not use its implementation. Specifically, if the concrete manager specified that the CacheItemPriority was High, then the EntityCache would not only add a clone to the built-in cache, it would also add the same clone to a custom HighPriorityCache. This custom HighPriorityCache was nothing more than a static class that exposed a Hashtable. Because the cached item was being referenced by a static class, that item would not be removed from the cache. Without that reference, the item could be removed from the cache. All this really means is that when the EntityCache sees that an entity has a CacheItemPriority of High, it performs a little trick that promotes the priority from high to permanent: that item simply will not be expired from cache within the lifetime of the Charlie application.
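Here is a minimal sketch of that idea, with member names that are my shorthand rather than Charlie’s actual code. When the EntityCache sees a CacheItemPriority of High, it adds the clone to the built-in cache as usual, and also calls something like HighPriorityCache.Add with the same key and clone.

// A minimal sketch of the idea; the member names are my shorthand for
// this entry, not necessarily the names used inside Charlie.
// (It assumes System.Collections is imported.)
internal sealed class HighPriorityCache
{
    // Anything referenced from this static Hashtable stays rooted in
    // memory for the lifetime of the application.
    private static readonly Hashtable items = Hashtable.Synchronized(new Hashtable());

    private HighPriorityCache() { }

    public static void Add(String cacheKey, Object entity)
    {
        items[cacheKey] = entity;
    }
}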

The second enhancement was to come up with a way to immediately expire objects from the cache. With the intrinsic ASP.NET cache, it is not possible to forcibly expire an item from the cache. You cannot expire the item, nor can you set it to null. However, I desperately wanted to give Charlie the ability to instantly expire any cached Entity or EntityCollection. The reason I wanted to do this is because experience on other websites has shown that it is a pain in the ass when you make changes to the website, but those changes are not reflected for up to 15 minutes (or whenever the cache expires). I don’t want to subject Charlie’s website owners to that same frustrating experience.

The solution was very simple. The EntityCache class works only with the EntityManager class, which in turn works only with Entity and EntityCollection objects. So, I gave both the Entity and the EntityCollection classes an internal IsExpired property.

internal Boolean IsExpired
{
    get
    {
        return this.isExpired;
    }
    set
    {
        this.isExpired = value;
    }
}
private Boolean isExpired;

Then, when the EntityCache class receives a request to remove an item from its cache, it looks first to see whether that item is in cache. If that item is in cache, it cannot actually delete that item—the ASP.NET cache does not allow this. Instead, the EntityCache class marks the cached Entity or EntityCollection as being expired.

internal void Remove(String cacheKey)
{
    Object cached = HttpContext.Current.Cache[cacheKey];
    if (cached != null)
    {
        if (cached is Entity)
        {
            ((Entity)cached).IsExpired = true;
        }
        else if (cached is EntityCollection)
        {
            ((EntityCollection)cached).IsExpired = true;
        }
    }
}

Then, when the EntityCache receives a request for a cached Entity or EntityCollection, it will look to see whether there is a cached item and, if so, whether that item is marked as expired. If there is no cache item, or if there is a cached item but it is marked as expired, the EntityCache will return a null value.

internal Object Get(String cacheKey)
{
    Object cached = HttpContext.Current.Cache[cacheKey];
    if (cached == null)
        return null;
    if (cached is Entity)
    {
        if (((Entity)cached).IsExpired)
        {
            return null;
        }
        else
        {
            return cached;
        }
    }
    else if (cached is EntityCollection)
    {
        if (((EntityCollection)cached).IsExpired)
        {
            return null;
        }
        else
        {
            return cached;
        }
    }
    return null;
}

When the EntityManager class receives a request to Save an Entity or EntityCollection, the manager will automatically expire any cached copies. When the Entity or EntityCollection is next requested, it will be drawn from the database, which means it will always reflect the most recently committed changes. This was the problem to be solved, and it was solved with minimal code changes. (By the way, I know that the above code could be compressed. However, I feel no motivation to compress code if it would make it harder to understand what is going on. A lot of my code will look relatively verbose, but that’s fine by me. It’s fine by the compiler, too.)
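A sketch of that Save path is below. The Persist call and the EntityCache member are placeholder names, since the real members are not shown in this entry; the point is simply that the same criteria-based key used to cache the entity is used to expire it.

// A sketch of the idea only; the Persist and EntityCache member names
// are placeholders, not necessarily Charlie's real API.
protected void SaveEntity(Entity entity, EntityCriteria criteria)
{
    // Commit the changes to the database.
    this.Mapper.Persist(entity);

    // Mark any cached copy as expired, so that the next request
    // reloads the freshly-committed version from the database.
    this.EntityCache.Remove(criteria.ToString());
}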

After a few bumbling steps, Charlie’s Entity System is now pretty darn fast. It now makes use of an enhanced cache that stops high priority items from expiring, and that allows for immediate expiration of updated items. And there is no O/R mapping or reflection to slow down the Entity System.

by Alister Jones | Next up: Charlie’s Alarm Clock

0 comments

----

A Word on Lazy Loading

by Alister Jones (SomeNewKid)

In my previous weblog entry, I described my fundamental misunderstanding of how the ASP.NET Cache works. But while I finally understood that the cache works like a shadow, I needed Charlie’s cache to work like a box. The reason I needed a box is that Charlie’s business objects use lazy loading. And what, you ask, is lazy loading?

One of the fundamental business objects in Charlie is the Document entity, which represents a document on the internet. The Document entity includes a Containers property, which exposes a ContainerCollection. For example, the Homepage document might include a Header container, a Menu container, a Content container, and a Footer container. Another object can “get” these containers through the Document.Containers property. So, a fully-loaded Document object would look like this:

What you will see from the above diagram is that the Document object contains the Containers object. You will also notice that the Containers object is a much bigger object than the Document object itself. (If you took away the Containers object, the Document object would collapse to a much smaller size.) The Document object just includes simple properties such as its URL and its Title. The Containers, however, include Models, Views, Controllers, and much more.

If a particular Document object is being presented to the user, then we do want these Containers. The Containers will hold the header, content, footer, and everything else in the Document. So to display the Document, we need its Containers.

However, if we are presenting a site map page or similar, then we do not need these Containers. All we want is the Document with its URL and its Title. Loading up all of those Containers for every Document on the site map page will incur a major database hit, when we won’t be using the Containers anyway.

Lazy loading is a technique by which we load up just the main business object, and not all of its child business objects. In this example, we would load up the Document object, but leave the Containers property as null. Here is how the code looks, followed by an illustration of the half-loaded Document business object.

public class Document
{
    public ContainerCollection Containers
    {
        get
        {
            // We'll come back to this
        }
    }
    private ContainerCollection containers = null;
    
    // rest of class
}

You will see that, to start with, the private containers field is null. A given Document’s containers are not loaded when the Document is retrieved from the database. This means that any Document used is nice and light, and does not carry a heavy Containers collection. Again, if we are working with a site map or similar page, the Containers will not be used, so it is a waste to load them. However, if another class requests the Containers from a given Document, then that is the point at which the Document will have to go and fetch its containers from the database. Here is how that looks in code, and how you might envisage the process.

public class Document
{
    public ContainerCollection Containers
    {
        get
        {
            if (this.containers == null)
            {
                this.containers = 
                    ContainerManager.GetCollectionByDocumentId(this.Id);
            }
            return this.containers;
        }
    }
    private ContainerCollection containers = null;
    
    // rest of class
}
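From the caller’s point of view, the deferred database hit happens on the first read of the property. (The GetDocumentById call below anticipates the simple DocumentManager example that follows; as noted there, this is not exactly how it looks in Charlie.)

// For illustration: the document comes back light, with no containers loaded.
Document document = new DocumentManager().GetDocumentById(32);

// Only when the Containers property is first read does the database hit
// occur; subsequent reads return the already-loaded collection.
ContainerCollection containers = document.Containers;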

This process of deferred loading of child business objects is known as lazy loading. I have said that lazy loading means that Charlie wants the cache to work like a box, and not like a shadow. Why?

Let’s look at how a Document might be loaded. (This is not how it looks in Charlie. This is just a simple example.)

public class DocumentManager
{
    public Document GetDocumentById(Int32 id)
    {
        String cacheKey = "Charlie.Document." + id.ToString();
        
        // If we have the document in cache, return it
        Object cached = HttpContext.Current.Cache[cacheKey];
        if (cached != null && cached is Document)
        {
            return cached as Document;
        }
        
        // Get the document from the database
        Document document = DataAccessLayer.LoadDocumentById(id);
            
        // Store the document in cache
        HttpContext.Current.Cache.Insert(cacheKey, document);
        
        // Return the newly-loaded document
        return document;
    }
    
    // rest of class
}

To put the code into words, the DocumentManager first checks whether the requested Document has been loaded before and has been placed into the cache. If so, that cached version will be returned. If the requested Document has not been loaded recently, then it will be fetched from the database. That newly-fetched Document will be stored into cache, so that the next request for the same Document can be served from the cache rather than incurring another database hit.

Let us presume that the DocumentManager receives a request for a Document that has not been loaded recently. It requests the document from the Data Access layer. Keeping in mind that the Document uses lazy loading, it has a whopping hole where its Containers would be. Here then is a diagram of this newly-loaded Document.

The DocumentManager will store this new Document in the cache.

The original document is returned to the application. The application is presenting a single full page, and not presenting a site map or similar page. So, it will call the Document.Containers property in order to get the containers it needs to present the Document. Because the Containers property is lazy loaded, this will force the Containers to be fetched separately, and used to “fill out” the original Document. What is important to note is that because the cache works like a shadow, the cached version of the document is also filled out.

The document will be returned to Peter, who is a visitor to the website. Now, Samantha requests the same document. This time, the DocumentManager will see that it does have a cached version of the same Document, so it will return the Document from cache, rather than loading it from the database. However, because the cached version shadowed changes to the last-requested original version, the cache will return the filled-out version of the Document, and not the half-empty, lazy-loaded version of the same Document.

This is no good. We don’t want the DocumentManager to sometimes return a lazy-loaded version, and sometimes return a fully-loaded version. Peter and Samantha may have different authorization roles, or may have different user preferences, so the Document needs slightly different Containers for each user. (Or Peter may be an administrator, and has changed the Containers or their content.) To allow Peter and Samantha to view slightly different versions of the same Document, we need the DocumentManager to always return a lazy-loaded Document, and never return a fully-loaded Document. Moreover, the class that requests a Document from the DocumentManager should never have to worry about the possibility that the returned Document may be an old, filled-out version. That calling class should be able to presume that the returned Document is always a new, lazy-loaded version. For this to be true, we need the cache to work like a box, and not like a shadow.

Before Charlie, I had never used lazy-loaded business objects. Before Charlie, I had never used localised business objects. Before Charlie, I had never used fine-grained security on business objects. So, when my inexperience with secured, personalized, localised, lazy-loaded business objects was combined with my misunderstanding of the cache, all hell broke loose in Charlie. The ultimate solution was to clone an object going into the cache, and then clone an object coming out of the cache.

public class DocumentManager
{
    public Document GetDocumentById(Int32 id)
    {
        String cacheKey = "Charlie.Document." + id.ToString();
        
        // If we have the document in cache, get it
        Object cached = HttpContext.Current.Cache[cacheKey];
        if (cached != null && cached is Document)
        {
            Document cachedDocument = (Document)cached;

            // So that changes are not shadowed, 
            // return a clone, not the cached original.
            return cachedDocument.Clone();
        }
        
        // Get the document from the database
        Document original = DataAccessLayer.LoadDocumentById(id);
        
        // Clone the original document
        Document clone = original.Clone();
            
        // Store the clone in cache
        HttpContext.Current.Cache.Insert(cacheKey, clone);
        
        // Return the newly-loaded, original document
        return original;
    }
    
    // rest of class
}

By cloning an object before it is placed in the cache, and then cloning that object as it comes out of the cache, the cache works like a box and not like a shadow. And by having the cache work like a box and not like a shadow, the integrity of lazy-loaded business objects is preserved.

One final note is that cloning a business object is not as easy as calling the Clone method, since by default there is no such method. The Clone method in Charlie’s business objects is a custom method of the underlying Entity System. This custom method is derived from Expert C# Business Objects.
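If memory serves, the technique in that book is a serialise-and-deserialise round trip, which yields a deep copy of the entire object graph. Here is a sketch of that approach, under the assumption that the business objects are marked [Serializable]; I am not claiming this is Charlie’s exact code.

// A sketch only, assuming the business objects are marked [Serializable].
// (It assumes System.IO and System.Runtime.Serialization.Formatters.Binary
// are imported.)
public Entity Clone()
{
    MemoryStream buffer = new MemoryStream();
    BinaryFormatter formatter = new BinaryFormatter();

    // Serialise the object graph into memory, then deserialise it to
    // produce a completely independent copy.
    formatter.Serialize(buffer, this);
    buffer.Position = 0;
    return (Entity)formatter.Deserialize(buffer);
}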

Wow, what a long weblog entry. It is my hope that it saves someone else from experiencing the same problems that I experienced with caching lazy-loaded business objects.

by Alister Jones | Next up: The EntityCache Object

0 comments

----

The Cache is a Shadow, Not a Box

by Alister Jones (SomeNewKid)

Ahhh, the trials of being a self-taught developer. Just when you think you’re getting a handle on things, you receive fresh evidence that you don’t know jack. One week ago I saw Beginning Programming for Dummies in my city’s technical bookshop, and I allowed myself a wry smile, thinking, “I’m way beyond that.” Today, I plan to go into the city and buy that book. In the last week I have discovered that I really don’t understand the fundamentals of programming.

I have long known that application optimisation should be a final polishing step, not an initial design step. For that reason, my Entity System for Charlie included a few basic caching features, but I had disabled them. The plan was to enable the caching after the rest of Charlie had been developed, as part of a general optimisation phase. However, I brought forward the caching phase, as I wanted to make use of the custom cache code from Paul Wilson’s WebPortal project.

So I enabled the existing caching features of the Entity System and—snap!—the security stopped working. I could not figure out why. Worse, I sent myself off on a wild goose chase. My debugging efforts suggested that the problem was how my business objects were being populated. (If populated from the database, they worked. If populated from the cache, they did not work.) But I could not figure out where I had gone wrong.

Last night I was tossing and turning in my bed, worrying about Charlie. I’d like to say that a supermodel elbowed me in the side and told me to cut it out. But that had been the night before. Last night it suddenly occurred to me that perhaps I was misunderstanding the basics of the cache. Now this may seem so obvious that I should be ashamed of myself. However I had long been using the cache without problem, so it did not occur to me that I might have a gross misunderstanding of it. I mean, we stick an object into cache, and later get it out again. Who could get that wrong? Turns out, I was getting that wrong.

Let me tell you how I thought the cache worked. We’ll start by creating a simple ArrayList object. Here is the code, and a diagram illustrating how I viewed this object in my mind’s eye.

ArrayList original = new ArrayList();
original.Append("1");

Let’s put the object in cache. Again, here is the code, followed by a diagram illustrating how I envisaged the cache working.

Cache.Insert("original", original);

Clearly, I viewed the cache as being like a box into which we can place objects that we later want to retrieve. To show more clearly how I saw the box working, let’s add a few more items to the original object.

original.Append("2");
original.Append("3");

In my mind’s eye, I saw the cached object as being unaffected. Because that is how I saw the cache working, my belief was that if I retrieved the object from cache, it would be in the same state as it went into the cache.

ArrayList copy = Cache["original"] as ArrayList;

As I have suggested, I took it as a given that the cache worked like a box. When the object goes in, it remains unchanged until we take it out again. This seems so natural to me that it never occurred to me that the cache could work any other way. (But if I properly understood reference types, I might have realised my error. This is why I think I need to read Programming for Dummies after all.)

The belated insight I had last night was that the cache may not work like a box—that my basic assumption was wrong. I have just now created a test webpage in Charlie, and used its logging plugin to record what happens to an object placed in its cache. Here is what I discovered, starting again with the original ArrayList object.

ArrayList original = new ArrayList();
original.Append("1");

Again we put the original object in cache. This time, we envisage the cached object as being a shadow of the original object.

Cache.Insert("original", original);

If we see the cached copy as working like a shadow, and not like a box, we can properly predict what will happen if we then change the original object.

original.Append("2");
original.Append("3");

With this correct mental picture, we can also understand what happens when we retrieve the cached object.

ArrayList copy = Cache["original"] as ArrayList;
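To put the same discovery into code rather than a diagram: the “copy” that comes back out of the cache is not a copy at all, but a reference to the very same object.

Boolean sameInstance = Object.ReferenceEquals(original, copy);  // true
Int32 itemCount = copy.Count;                                   // 3, not 1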

Because I had the wrong mental picture of the cache, I was using it improperly in the Entity System for Charlie. The subsequent problem to be solved is that the Entity System needs the cache to work like a box, not like a shadow. Fortunately, this is easy. If you don’t want the cached object to shadow the changes to the original object, you need to cache a clone of the original object. The clone will be unaffected by any changes to the original object.

ArrayList clone = original.Clone() as ArrayList;
Cache.Insert("clone", clone);

I cannot believe that I am the only person to have mistakenly viewed the cache as working like a box. If others have made the same mistake, I hope this weblog entry may help them.

by Alister Jones | Next up: A Word on Lazy Loading

3 comments

----