Chapter 2. Configuration

Table of Contents

1. Handlers
1.1. Script Handlers
1.2. RHTML Handlers
1.3. Framework Handlers
1.4. Access Handlers
1.5. Summary
2. Variables and Contexts
2.1. Configuration Contexts
2.2. Configuration Variables
2.3. Environmental Variables
3. Conclusion

ModRuby is designed to be extremely flexible. Its main function besides embedding Ruby within Apache is to provide a configurable interface with which to invoke Ruby code from Apache in a variety of ways — from executing the most bare-bones CGI scripts to hooking in entire web frameworks.

Connecting ModRuby to your code is all done via Apache configuration directives. There are endless ways you can do this, but it’s all ultimately pretty simple. We will cover the necessary Apache configuration mechanics and work through many different scenarios which should give you a good feel for how everything works and what you can do.

1. Handlers

Everything is done in the context of handlers. Apache has many different kinds of handlers for different stages of the request handling process, but the most basic and common by far is the “content generator.” That’s where you, well, generate content, which is basically what everyone thinks of as server-side web programming anyway. Apart from the access handlers covered in Section 1.4, all handlers in ModRuby are, in Apache parlance, “content generators.” We will call these “module handlers.” These handlers are written in C, reside in the ModRuby module, and can be invoked from the Apache configuration as canonical Apache handlers. When you load the mod_ruby.so module in Apache, these handlers become available for you to use.

To be useful, a module handler must be linked up to a Ruby method of some kind. We refer to these Ruby methods as just “Ruby handlers” — they are what handles the request. There are three parameters that must be specified to link a module handler to a Ruby method:

  • Module: defines the Ruby module specified via the Ruby require directive

  • Class: specifies the class name within the respective module. This can be qualified with one or more module namespace prefixes (e.g. Juji::Fruit)

  • Method: specifies the method name within the class. This method takes exactly one argument which is used to pass in the Apache request object for the specific request being handled.

With these, the ModRuby module can load the Ruby module, instantiate an object of the given class, and then call the specified method, passing in the current request, which is an instance of the Apache::Request class. From that point on, it is up to the Ruby method to service the request (e.g. generate content). That’s the whole process. So configuration is thus the process of connecting module handlers (C functions in the mod_ruby Apache module) to Ruby methods in your files, and this is all done by specifying these connections in the Apache configuration file(s).
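
The load, instantiate, and call sequence described above can be approximated in plain Ruby. This is only a sketch of the idea, not ModRuby's actual C implementation: the dispatch helper and the FakeRequest stand-in are hypothetical names, and the require step is skipped here because the example class is defined inline.

```ruby
# Stand-in for the Apache request object (hypothetical; under ModRuby the
# handler receives an Apache::Request instance instead).
FakeRequest = Struct.new(:uri)

# Example handler class, namespaced as in the Juji::Fruit example.
module Juji
  class Fruit
    def handle(req)
      "handled #{req.uri}"
    end
  end
end

# Hypothetical helper mirroring the three parameters: module, class, method.
def dispatch(module_path, class_name, method_name, request)
  require module_path if module_path            # Module: loaded via require
  klass = Object.const_get(class_name)          # Class: may be namespaced
  klass.new.public_send(method_name, request)   # Method: receives the request
end

result = dispatch(nil, "Juji::Fruit", "handle", FakeRequest.new("/fruit.dip"))
puts result
```

Passing nil as the module path simply skips the require; with a real handler file you would pass its require path instead.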

As with any Apache configuration, you can define all the file types, locations, directories and various other conditions under which your Ruby handler(s) should be called. There are almost limitless possibilities for how you can route requests to handlers. The whole point of all the different module handlers is simply to route the request to the right Ruby handler, and from that point on, it’s all Ruby. Module handlers are just Apache-configurable connectors that give you fine-grained control over (1) what handlers are invoked, (2) under what circumstances, and (3) in what specific configuration contexts.

In the Apache configuration file, there are two independent phases you must address in order to connect a web request to your Ruby handler:

  1. You specify the specific condition(s) in which the module handler is to be invoked. This connects the Apache request with a specific ModRuby handler.

  2. You associate the ModRuby handler with a specific Ruby method that is to service the request. This completes the route from Apache request to Ruby code.

That said, there are two classes of module handlers: script handlers and framework handlers. Script handlers exist specifically for processing RHTML and optimized Ruby (CGI) scripts. Framework handlers, on the other hand, are completely open-ended: they let you specify precisely the conditions under which they run, and they allow you to add custom configuration variables for each handler that are passed into the handler’s Ruby environment. They are designed to connect to bigger, more complex environments (e.g. frameworks).

1.1. Script Handlers

The script handlers are targeted at processing RHTML and Ruby CGI scripts. They consist of the ruby-rhtml-handler and ruby-script-handler configuration values, respectively. To use them, you would do something like the following in your Apache configuration:

<IfModule ruby_module>

AddHandler ruby-rhtml-handler .rhtml
AddHandler ruby-script-handler .rb .ruby

# Or perhaps
<Files ~ "\.(rhtml)$">
    SetHandler ruby-rhtml-handler
</Files>

# And maybe
<Location /ruby-cgi>
    SetHandler ruby-script-handler
</Location>

</IfModule>

You have to understand a little about Apache configuration file syntax for this to be crystal clear. Let’s take just the first two lines (the remaining examples of Apache directives will be covered later). They use Apache's AddHandler directive to tell Apache to call the ruby-rhtml-handler handler when it sees a file with the extension .rhtml and, similarly, to call the ruby-script-handler handler when it sees a file with the extension .rb or .ruby. Both of these handlers are just C functions in the mod_ruby.so module. Basically, these directives cause Apache to pass control to these C functions for requests that ask for files with .rhtml, .rb or .ruby extensions. This takes care of phase 1.

Now we have to connect these module handlers to Ruby handlers (phase 2). With script handlers, we do this using two ModRuby directives: RubyDefaultHandlerModule and RubyDefaultHandlerClass. They specify the Ruby module and the class within that module that contains the handlers. ModRuby includes a default module that implements both handlers, called modruby/handler. Its implementation is as follows:

module ModRuby

# This implements the generic request handlers, specifically the (Ruby) script
# and RHTML handlers.

class Handler

  # RHTML script handler
  def rhtml(req)
    Runner.new(req).runRhtml()
  end

  # Ruby script handler
  def script(req)
    Runner.new(req).runScript()
  end
end

end # module ModRuby

There is one method for the RHTML handler and one method for the script handler. The Runner class is just a cleanroom environment in which to run code. The environment (global namespace) in which the code runs is completely erased when the Runner object is destroyed, and all objects and memory allocated by the handler are freed. Thus each script runs in a clean, self-contained environment which is completely disposed of when the Ruby handler finishes.

So, returning to the configuration example (using only the first two lines we covered) we make the full connection — phase 1 and phase 2 — with the following:

<IfModule ruby_module>

# Phase 1 -- Apache to ModRuby
AddHandler ruby-rhtml-handler .rhtml
AddHandler ruby-script-handler .rb .ruby

# Phase 2 -- ModRuby to Ruby
RubyDefaultHandlerModule 'modruby'
RubyDefaultHandlerClass  'ModRuby::Handler'

</IfModule>

Notice in phase 2 that we have only specified the module and class. But what about the methods? Well, the script handlers are hard-coded to use fixed method names. That is, the ruby-rhtml-handler always calls a method named rhtml() and likewise the ruby-script-handler always calls a method named script(). So you can specify any module (RubyDefaultHandlerModule) and class (RubyDefaultHandlerClass) you want, but within that class, the script handlers will look for a method named rhtml() or script() (depending on the handler).

Furthermore, since they both use the same module and class specification, if you want to override the default RHTML or script handlers, your override class has to provide both methods. However, there is nothing to stop you from delegating the method you don’t want to implement to ModRuby::Handler. For example, say you wanted to write your own script() handler, but wanted to use the default ModRuby RHTML handler. You could create your module/handler as follows:

require 'modruby/handler'

module MyModule

class Handler

  # Ruby script handler
  def script(req)

    # Your script handler implementation here

  end

  # Use the default ModRuby RHTML script handler
  def rhtml(req)
    handler = ModRuby::Handler.new()
    return handler.rhtml(req)
  end

end

end # module MyModule

Assuming you place it in a file called mymodule.rb within the Ruby load path, you would then update your Apache configuration to point to it as follows:

<IfModule ruby_module>

# Phase 1 -- Apache to ModRuby
AddHandler ruby-rhtml-handler .rhtml
AddHandler ruby-script-handler .rb .ruby

# Phase 2 -- ModRuby to Ruby
RubyDefaultHandlerModule 'mymodule'
RubyDefaultHandlerClass  'MyModule::Handler'

</IfModule>

Now all RHTML and CGI script requests will be handled by your module.

1.2. RHTML Handlers

ModRuby uses an RHTML framework that works exactly like eRuby to parse RHTML files. ModRuby originally used eRuby for RHTML processing, but there were problems getting it to work with Ruby 1.9.1 when it came out. This led to the development of an eRuby clone — the ModRuby RHTML parser — a flex-based scanner which is compiled directly into mod_ruby as a native Ruby C extension. It exactly follows the rules of eRuby. Thus all of the rules in eRuby apply to creating RHTML documents in ModRuby. Just to be complete, let's cover the gamut of RHTML (which may take at most a paragraph).

Note

ModRuby includes an alternate RHTML delimiter syntax. In addition to angle brackets, braces can be used instead. This form supports embedding RHTML within XHTML, XML, etc. where the angle brackets can cause problems.

Let’s begin with the obligatory example: helloworld.rhtml, which is as follows:

Hello World. This is <%=ModRuby.name%>

If you don’t already have the page loaded, load it and look at the result. Simple enough.

RHTML allows you to embed Ruby code in text and run it in the order that it appears. There are three basic constructs for embedding code. First there is embedding an entire chunk of code, as follows:

Here is some text
<%
puts "Inside these delimiters, here is Ruby code"
%>

Outside it's just plain text again.

Then there is inline printing:

Printing text <%="in line"%>

Whenever RHTML sees the <%= opening tag, it evaluates the Ruby expression that follows, up to the closing %>, and prints the result. You can think of the equals sign as shorthand for a puts command. The third construct is really a variation of the first: you can spread a block over several segments of code and text as follows:

<%array.each do |something|%>
This entire text block will be printed with the current
value of <%=something%> for each iteration of the loop.
<%end%>

And that's about it — RHTML in a nutshell.
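
Because the ModRuby parser is described as following eRuby's rules, the constructs above can be tried outside Apache with ERB, the eRuby implementation in Ruby's standard library. This is a sketch under that assumption: it uses ERB rather than ModRuby's own parser, and the fruits array is made up for illustration.

```ruby
require 'erb'

# Data made available to the template through the current binding.
fruits = %w[apple cherry]

# An RHTML-style template combining a loop spread over several segments
# with inline printing.
template = <<~RHTML
  <% fruits.each do |fruit| %>
  Fruit: <%= fruit %>
  <% end %>
  Total: <%= fruits.size %>
RHTML

# Evaluate the template; code segments run in the order they appear.
output = ERB.new(template).result(binding)
puts output
```

The text between the loop's opening and closing segments is emitted once per iteration, which is the behavior the third construct relies on.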

1.3. Framework Handlers

Framework handlers are more flexible than script handlers and have extended features. There is no default implementation for framework handlers. Basically, ModRuby passes your Ruby method an Apache request object and you’re on your own.

Framework handlers are implemented using the ruby-handler C function in the ModRuby module. One way to declare a framework handler is by using the AddHandler directive like above. Say for instance we wanted to create a handler for “sheepdip” files (which we designate as having a .dip extension). We could start by doing the following:

<IfModule ruby_module>

AddHandler ruby-handler .dip

</IfModule>

There’s phase 1. Going this route, we now use the DefaultHandler variables for phase 2 — connecting the framework handler to a Ruby method. We specify the module/class/handler using the RubyDefaultHandlerModule, RubyDefaultHandlerClass and the (new) RubyDefaultHandlerMethod directive, respectively. Notice already one difference between framework handlers and script handlers: you can specify the Ruby method name via the RubyDefaultHandlerMethod directive. Whereas the method name is fixed in script handlers, it’s parameterized in framework handlers.

Say our fictional sheepdip files use our fictional sheepdip/handler module, which contains our Sheepdip::Handler class which contains a handle() method. One way to use it would be to set our configuration as follows:

<IfModule ruby_module>

# Phase 1 -- Apache to ModRuby
AddHandler ruby-handler .dip

# Phase 2 -- ModRuby to Ruby
RubyDefaultHandlerModule sheepdip/handler
RubyDefaultHandlerClass  Sheepdip::Handler
RubyDefaultHandlerMethod handle

</IfModule>
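
The configuration above assumes a Ruby file that defines Sheepdip::Handler with a handle() method. A minimal sketch of what that class might look like follows, exercised here with a stub instead of a live request. The StubRequest class and its uri and puts methods are assumptions made for this illustration; they are not taken from the Apache::Request interface.

```ruby
# Stand-in for the request object. The names `uri` and `puts` are assumed
# for this sketch only and are not the documented Apache::Request API.
class StubRequest
  attr_reader :uri, :body

  def initialize(uri)
    @uri = uri
    @body = +""   # unfrozen buffer collecting the generated content
  end

  def puts(text)
    @body << text << "\n"
  end
end

module Sheepdip
  class Handler
    # Matches "RubyDefaultHandlerMethod handle": one argument, the request.
    def handle(req)
      req.puts("Dipping #{req.uri}")
    end
  end
end

req = StubRequest.new("/flock/dolly.dip")
Sheepdip::Handler.new.handle(req)
print req.body
```

Keeping the handler free of anything but the request argument makes it easy to test outside Apache in this way.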

To summarize, the three default handler directives are as follows:

  • RubyDefaultHandlerModule: specifies the Ruby handler module to use via the Ruby require directive

  • RubyDefaultHandlerClass: specifies the Ruby handler class name within the respective module

  • RubyDefaultHandlerMethod: specifies the Ruby handler method name within the respective class

But you say “yeah, but isn’t setting the RubyDefaultHandlerModule and RubyDefaultHandlerClass parameters going to redefine how script handlers run as well?” Yes, it will. Unless you also include rhtml() and script() methods in your new default module/class, which is kind of silly and painful, it will break the default behavior used for servicing script handlers. That’s why there is another (better) way to go about connecting framework handlers.

You can skirt the DefaultHandler directives entirely and set up your own independent handler with the more general RubyHandler directives. These directives are as follows:

  • RubyHandlerDeclare: Declares a new framework handler.

  • RubyHandlerModule: Defines the module for a given framework handler.

  • RubyHandlerClass: Defines the class for a given framework handler.

  • RubyHandlerMethod: Defines the handler method for a given framework handler.

These work similarly to the default handler directives, but they take as an argument the name of the framework handler to define. They are not defaults; they are specifics. To use this approach, we would change our configuration to something like the following:

<IfModule ruby_module>

# Phase 0 -- Define framework handler
RubyHandlerDeclare SHEEPDIP
RubyHandlerModule  SHEEPDIP sheepdip/handler
RubyHandlerClass   SHEEPDIP Sheepdip::Handler
RubyHandlerMethod  SHEEPDIP handle

# Phase 1 -- Apache to ModRuby
<Files ~ "\.(dip)$">
    SetHandler ruby-handler

    # Phase 2 -- ModRuby to Ruby
    RubyHandler SHEEPDIP
</Files>

</IfModule>

In "Phase 0", the SHEEPDIP argument is just a unique key used to declare and reference a given framework handler. When you declare a framework handler, the ModRuby module creates an entry in an internal hash table of framework handlers, using the name as the key. Thus you can define as many framework handlers as you like, the only constraint being that they must all have unique names. The framework handler’s entry (value) in the hash table is itself a hash table, and can thus store an unlimited number of key/value pairs just for that handler. So what you have is a hashtable of hashtables.

The outermost hashtable is the internal ModRuby framework handlers table, which is filled with each framework handler you define. Suppose two are defined: SHEEPDIP and JUJIFRUIT. Each defined framework handler in turn has its own name (e.g. SHEEPDIP) and hashtable in which its respective Ruby module, class and handler method variables are stored. But the framework handler’s hashtable can also hold an unlimited number of other key/value pairs as well, which you can use to customize and/or configure your framework handler. In this example, there is a custom GEM_PATH variable set in the SHEEPDIP handler’s table, and a PASSWORD variable in JUJIFRUIT’s. These are added using the RubyHandlerConfig directive, which is covered later. For now, all you need to know is that the RubyHandler directives are stored there and provide all the information needed to invoke the handler.
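
The table of tables can be pictured as a nested Ruby hash. This is a conceptual model only: the real table lives inside the C module, and the key layout, the JUJIFRUIT module and class names, and the GEM_PATH and PASSWORD values below are placeholders invented for illustration.

```ruby
# Hypothetical model of the internal handlers table: a hash of hashes keyed
# by handler name. All values are illustrative placeholders.
HANDLERS = {
  "SHEEPDIP" => {
    "RubyHandlerModule" => "sheepdip/handler",
    "RubyHandlerClass"  => "Sheepdip::Handler",
    "RubyHandlerMethod" => "handle",
    "GEM_PATH"          => "/example/gems"       # custom variable
  },
  "JUJIFRUIT" => {
    "RubyHandlerModule" => "juji/fruit",
    "RubyHandlerClass"  => "Juji::Fruit",
    "RubyHandlerMethod" => "handle",
    "PASSWORD"          => "example-password"    # custom variable
  }
}.freeze

# Look up the module/class/method triple for a named handler.
def handler_target(name)
  entry = HANDLERS.fetch(name)
  entry.values_at("RubyHandlerModule", "RubyHandlerClass", "RubyHandlerMethod")
end

p handler_target("SHEEPDIP")
```

Because each handler owns its inner hash, custom variables such as GEM_PATH stay private to that handler and never collide with another handler's settings.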

So with the first four lines (phase 0) we have defined our SHEEPDIP framework handler. Next we need to tell Apache to associate .dip files with the ruby-handler (phase 1), and from there connect it (ruby-handler) to our SHEEPDIP handler (phase 2). We do things a little differently here. We are not using the AddHandler directive. We could, but here we take a different approach just to illustrate the flexibility we have with Apache’s various configuration directives. As we are interested only in files with the extension .dip, we use a Files directive to match this extension. Within the block we use SetHandler, which unconditionally sets the Apache handler to ruby-handler, rather than AddHandler outside. This just keeps us from having to repeat ourselves, as AddHandler requires us to again specify the extension, which we’ve already done in the Files directive. This approach is just a little cleaner. Either way, we are just associating the specific file extension .dip with ruby-handler; that’s the point here, and you can do that however you like.[1] Next is phase 2, where we weld the ruby-handler to the SHEEPDIP framework handler using the RubyHandler directive. The RubyHandler directive is a block-level directive which unconditionally sets the specified framework handler to use in that scope.

So based on the contents of the Files block, when Apache sees a file with extension .dip, it will invoke ruby-handler. When ruby-handler executes, it in turn will see the RubyHandler directive set internally to use the SHEEPDIP Ruby handler. Knowing this, when ruby-handler processes the request, it will pull the SHEEPDIP handler entry from the internal handlers table, extract the RubyHandlerModule, RubyHandlerClass and RubyHandlerMethod entries stored there and use them to invoke the Ruby module/class/method, effectively routing the request to our Sheepdip handler. From there, it’s all up to our Ruby code in the Sheepdip::Handler::handle() method. The Ruby handler will have access to the SHEEPDIP handler table, which includes the custom configuration variables (i.e. the GEM_PATH variable in this example). We will cover how that works in the next section.

1.4. Access Handlers

An access handler implements the Apache ap_hook_check_access_ex() handler. This is a special handler that inspects the request headers or request body and makes an authentication and authorization decision. This handler is run before other handlers and is a nice way to separate authentication code from content.

If the user passes the checks, nothing happens and the RubyHandler or any other handler is run. This is also compatible with any other content handlers like mod_cgi, mod_dir, mod_autoindex, etc.

If the user fails your authentication/authorization check, the error response can be an HTTP 401 Unauthorized response to request Basic auth or OAuth2 bearer tokens, or something simpler like a redirect to an authentication system.

Here is an Apache config example for setting up a Framework Handler that is used as an Access Handler:

RubyHandlerDeclare ACCESS_TEST
RubyHandlerModule  ACCESS_TEST "/var/www/html/access_test.rb"
RubyHandlerClass   ACCESS_TEST AccessTest::Handler
RubyHandlerMethod  ACCESS_TEST check_access

<Directory "/var/www/cgi-bin">
  RubyAccessHandler ACCESS_TEST
</Directory>
<Directory "/var/www/html">
  RubyAccessHandler ACCESS_TEST
</Directory>

RubyHandlerDeclare DECLINED
RubyHandlerModule  DECLINED "/var/www/html/declined.rb"
RubyHandlerClass   DECLINED Declined::Handler
RubyHandlerMethod  DECLINED check_access

<Location "/assets">
  RubyAccessHandler DECLINED
</Location>

Some pseudocode that might be used as access_test.rb:

module AccessTest
  class Handler
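    def check_access(request)
      # authorized? and @username are placeholders for your own credential check
      if authorized?
        request.set_user(@username)
      else
        # Reject the request and ask the client for Basic credentials
        request.setStatus(401)
        request.headers_out['WWW-Authenticate'] = "Basic"
      end
    end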
  end
end
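
To make the pseudocode concrete, here is one way the check could be filled in and exercised outside Apache. It is a sketch under several assumptions: StubAccessRequest is a hypothetical stand-in whose set_user, setStatus and headers_out follow the pseudocode above, while headers_in, the reader methods, and the in-memory USERS table are invented for this example and are not taken from ModRuby.

```ruby
# Stub request for exercising an access handler outside Apache. `set_user`,
# `setStatus` and `headers_out` mirror the pseudocode; `headers_in` and the
# readers are assumptions made for this sketch.
class StubAccessRequest
  attr_reader :headers_in, :headers_out, :status, :user

  def initialize(headers_in = {})
    @headers_in = headers_in
    @headers_out = {}
  end

  def set_user(name)
    @user = name
  end

  def setStatus(code)
    @status = code
  end
end

module AccessTest
  class Handler
    # Hypothetical credential store, for illustration only.
    USERS = { "alice" => "secret" }.freeze

    def check_access(request)
      username = authorized_user(request)
      if username
        request.set_user(username)
      else
        request.setStatus(401)
        request.headers_out['WWW-Authenticate'] = 'Basic realm="example"'
      end
    end

    private

    # Decode "Authorization: Basic <base64 of user:password>" and verify it.
    def authorized_user(request)
      scheme, encoded = request.headers_in['Authorization'].to_s.split(' ', 2)
      return nil unless scheme == 'Basic' && encoded
      user, password = encoded.unpack1('m').split(':', 2)
      password && USERS[user] == password ? user : nil
    end
  end
end

# A request with valid credentials, and one with none at all.
good = StubAccessRequest.new({ 'Authorization' => 'Basic ' + ['alice:secret'].pack('m0') })
AccessTest::Handler.new.check_access(good)

bad = StubAccessRequest.new
AccessTest::Handler.new.check_access(bad)
```

The first request ends up with its user set and no status change; the second receives a 401 status and a WWW-Authenticate header.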

In some cases, you might want to exclude some paths from authentication, like assets. Since you can’t remove a RubyAccessHandler, we can set a new one that does nothing but return. It’s lightweight, and doesn’t add the extra overhead of authentication.

module Declined
  class Handler
    def check_access(request)
    end
  end
end

1.5. Summary

There are two classes of module handlers: script handlers and framework handlers. There are two subclasses of script handlers: RHTML and CGI. Script handlers exist for convenience. Their purpose, along with that of the DefaultHandler directives, is to make it as easy as possible to get something going. They require very little knowledge or work to get Ruby code running within Apache.

Framework handlers offer greater control, specificity, and features than script handlers, but require a little more effort and understanding. They slipstream into Apache’s native configuration directives such as Directory, Location, and Files, giving you tremendous flexibility and fine-grained control over what handlers fire and under what conditions, with the option of adding custom configuration settings with each handler. We will soon see that they even enable you to build upon, override and merge directives for different directories, locations and files.

To set up a framework handler, you have to follow three basic steps:

  • Declare and define one or more framework handlers using RubyHandlerDeclare, RubyHandlerModule, RubyHandlerClass and RubyHandlerMethod directives.

  • Tell Apache to call the ruby-handler handler using <Files>, <Location> or <Directory> contexts with either the AddHandler or SetHandler directive. This is phase 1.

  • Connect the ruby-handler to the specific framework handler using the RubyHandler directive. This is phase 2.

As a result, you can create an unlimited number of different handlers for different files, directories, locations, extensions and contexts.



[1] For more information on AddHandler versus SetHandler, see the Apache documentation.