Proper Rack Middleware Ordering

It occurred to me the other day that I should take a look at the middleware I use on this blog. I don't know what it was. My spidey senses just tingled.

Boy was I right. I totally had it backwards.

Rack middleware is a fantastic thing. It's like a little encapsulated rack application that you can use to filter, process, or otherwise mess with responses. There is middleware to add etags, configure caching, catch and log exceptions, deal with cookies, handle SSO, and pretty much anything else you can think of. Oh, and they work on any rack application; it is rack middleware after all. And in case you missed it, rails is a rack application. Create a new rails app and run

% rake middleware

You'll see all the middleware that is included by default.

Anyway. The thing with rack middleware is that it runs in the order you specify it, top to bottom, and then, by the nature of how it works, it sort of rewinds back out.

Okay so WTF does that mean? A basic middleware looks kind of like this:

	require 'digest/md5'

	module Rack
	  # Automatically sets the ETag header on all String bodies
	  class ETag
	    def initialize(app)
	      @app = app
	    end

	    def call(env)
	      status, headers, body = @app.call(env)

	      if !headers.has_key?('ETag')
	        parts = []
	        body.each { |part| parts << part.to_s }
	        headers['ETag'] = %("#{Digest::MD5.hexdigest(parts.join(""))}")
	        [status, headers, parts]
	      else
	        [status, headers, body]
	      end
	    end
	  end
	end


That's the etag middleware. It adds an etag value to responses. The required parts are the initialize method taking the application (which is a rails app, sinatra app, whatever), and the call method, taking an environment. Initialize sets things up, and call is what happens when a request comes in. The whole idea is you do:

@app.call(env)

In your call method, where @app could be another middleware, or the actual application, but regardless it eventually gets all the way down to the real application. As the methods return, it comes back up with a response body, headers, and status code. In the etag example, @app.call(env) is done immediately and the results processed; the etag value is set in the headers.

So let's think about this for a second. Imagine you have some setup like this:

use Rack::ETag
use Rack::ResponseTimeInjector
use Rack::Hoptoad

Does that really make sense? When you use middleware, you're telling your framework or whatever to append that middleware to the chain. So a request comes in, goes through the middleware in the order you listed it, then hits your application.
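To make "append that middleware to the chain" a bit more concrete, here is a rough sketch of the idea in plain Ruby, no rack gem required. TinyBuilder, Outer, and Inner are names I made up for illustration; this is not Rack's actual builder code, just the general shape of it: remember the classes in order, then wrap the application from the inside out.

```ruby
# Hypothetical stand-in for what `use` and `run` do; not Rack's actual code.
class TinyBuilder
  def initialize
    @middleware = []
  end

  # Remember the middleware class; nothing is instantiated yet.
  def use(klass)
    @middleware << klass
    self
  end

  # Wrap the app from the inside out, so the first `use` ends up outermost.
  def run(app)
    @middleware.reverse.inject(app) { |inner, klass| klass.new(inner) }
  end
end

# Two made-up middleware classes that record their name on the way in.
class Outer
  def initialize(app)
    @app = app
  end

  def call(env)
    env[:seen] << "Outer"
    @app.call(env)
  end
end

class Inner
  def initialize(app)
    @app = app
  end

  def call(env)
    env[:seen] << "Inner"
    @app.call(env)
  end
end

# The "real application": any object that responds to call(env).
app = lambda do |env|
  env[:seen] << "app"
  [200, { "Content-Type" => "text/plain" }, ["hello"]]
end

stack = TinyBuilder.new.use(Outer).use(Inner).run(app)
env = { seen: [] }
status, _headers, _body = stack.call(env)
puts env[:seen].join(" -> ") # prints: Outer -> Inner -> app
```

The first thing you `use` ends up as the outermost wrapper, which is why it sees the request first.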

In this case:

We come into Etag...
...which calls the ResponseTimeInjector middleware...
...which calls the Hoptoad middleware to catch exceptions...
...which calls your application...
...which returns to the Hoptoad middleware...
...which returns to the ResponseTimeInjector middleware...
...which inserts the response time into the body...
...which then returns (with the modified body) to the Etag middleware...
...which calculates the etag value and puts it in the headers...
...which returns and lets rack send the response back.

Whew! Lots of steps there, but this might make more sense:
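If you'd rather watch it happen, a small trace does the job. This is only a sketch: Trace is a hypothetical middleware I wrote for the purpose, the labels just mirror the names in the example above, and the stack is wired by hand in plain Ruby instead of with `use`.

```ruby
# Hypothetical tracing middleware: logs when a request enters and when the
# response passes back out. It does not touch the response itself.
class Trace
  def initialize(app, label, log)
    @app = app
    @label = label
    @log = log
  end

  def call(env)
    @log << "in  #{@label}"
    response = @app.call(env)
    @log << "out #{@label}"
    response
  end
end

log = []

# Stand-in for the real application.
app = lambda do |_env|
  log << "app"
  [200, {}, ["hello"]]
end

# Wired by hand, innermost first. This corresponds to listing
# Etag, ResponseTimeInjector, Hoptoad from top to bottom.
hoptoad  = Trace.new(app, "Hoptoad", log)
injector = Trace.new(hoptoad, "ResponseTimeInjector", log)
stack    = Trace.new(injector, "Etag", log)

stack.call({})
puts log
```

The log goes in through Etag, ResponseTimeInjector, and Hoptoad, hits the app, and then comes back out in exactly the reverse order.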

Okay so what's the problem? The etag is calculated after the response time is injected, so that's fine (imagine if the etag middleware was at the bottom). What about poor Hoptoad? What if there is an exception thrown in the ResponseTimeInjector or Etag middleware? Hoptoad isn't going to catch it! The Hoptoad middleware doesn't modify anything in the response, so it needs to be up higher; it needs to be first.

use Rack::Hoptoad
use Rack::ETag
use Rack::ResponseTimeInjector
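Here is a small sketch of why the exception catcher has to go first. Catcher and Buggy are hypothetical classes standing in for Hoptoad and for any middleware with a bug in it; I'm assuming the catcher simply rescues whatever is raised below it, which is all that matters for the ordering argument.

```ruby
# Hypothetical exception catcher standing in for Hoptoad: records the error
# and returns a 500 instead of letting it propagate.
class Catcher
  attr_reader :caught

  def initialize(app)
    @app = app
    @caught = []
  end

  def call(env)
    @app.call(env)
  rescue StandardError => e
    @caught << e.message
    [500, {}, ["oops"]]
  end
end

# Hypothetical middleware that blows up while processing the response.
class Buggy
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
    raise "bug in middleware"
  end
end

app = ->(_env) { [200, {}, ["hello"]] }

# Catcher listed first (outermost): the middleware bug is caught.
catcher_first = Catcher.new(Buggy.new(app))
good_status, = catcher_first.call({})

# Catcher listed last (innermost): the bug is raised above it and escapes.
catcher_last = Catcher.new(app)
escaped = begin
  Buggy.new(catcher_last).call({})
  false
rescue RuntimeError
  true
end

puts good_status
puts escaped
```

With the catcher on the outside, the request comes back as a 500 and the error is recorded. With the catcher on the inside, it never sees the exception at all.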

Diagram time:

That's better! This is basically the problem I had, except worse. I don't know what I was thinking, but my middleware was all out of order: before and after.

See that? It's gross! I had Etag near the bottom, and my exception logger was all the way at the bottom. The only one that was in a remotely right place was CanonicalHost.

The really terrible part about the original was that the body was being modified by 4 different middleware classes after the etag middleware had run and returned, so the etag no longer matched the body that was actually sent.
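You can reproduce the stale etag in a few lines. This is a sketch under some assumptions: SimpleETag is a cut-down version of the etag middleware shown earlier, and Footer is a hypothetical middleware standing in for anything that changes the body.

```ruby
require 'digest/md5'

# Simplified ETag middleware, modeled on the one shown earlier.
class SimpleETag
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    parts = []
    body.each { |part| parts << part.to_s }
    headers['ETag'] = %("#{Digest::MD5.hexdigest(parts.join)}")
    [status, headers, parts]
  end
end

# Hypothetical body-modifying middleware: appends a footer.
class Footer
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    parts = []
    body.each { |part| parts << part.to_s }
    [status, headers, parts + ["<!-- footer -->"]]
  end
end

# True when the ETag header matches the body that is actually sent.
def etag_matches?(response)
  _status, headers, body = response
  headers['ETag'] == %("#{Digest::MD5.hexdigest(body.join)}")
end

app = ->(_env) { [200, {}, ["hello"]] }

# ETag outermost: it hashes the final body, footer included.
etag_outside = SimpleETag.new(Footer.new(app)).call({})
# ETag innermost: Footer changes the body after the hash was taken.
etag_inside = Footer.new(SimpleETag.new(app)).call({})

puts etag_matches?(etag_outside) # prints: true
puts etag_matches?(etag_inside)  # prints: false
```

Same two middleware, same application; the only difference is which one wraps the other.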

So if you are a rack middleware nerd already, you probably knew this stuff and stopped reading a while ago, or you are laughing at me. Otherwise, you might think twice about how your middleware is organized. Maybe go take a look at your middleware stack anyway and see if anything is out of order.

Now go use some rack middleware! Ba dum tiss! (See that, see what I did there? In code you use middleware, and at a higher level as a developer you use middleware, so ... ah never mind)