2. The Apache Request Class

The Apache request is the largest data structure in the Apache module API and encompasses every aspect of the HTTP request, including all other associated data structures such as the connection, process, and server structures. It is the primary means by which to interact with Apache.

This data structure originates in the Apache API as the request_rec structure (documented here). It is encapsulated, and somewhat extended, in C++ form in mod_rsp by the apache::Request class (request.cpp). This, in turn, is encapsulated in Ruby by the Apache::Ruby class, defined in the Ruby extension ruby_request.cpp.

2.1. Method Documentation

Table B.1, “Apache Request Methods” contains the exhaustive list of methods, in alphabetical order. The documentation of each method follows.

Table B.1. Apache Request Methods

Type Name Args
method allow_options() 0
method allow_overrides() 0
method allowed() 0
method allowed=() 0
method args() 0
method assbackwards() 0
method auth_type() 0
method sent_bodyct() 0
method bytes_sent() 0
method chunked() 0
method clength() 0
method connection() 0
method content() 0
method content_encoding() 0
method content_languages() 0
method content_type() 0
method default_type() 0
method discard_request_body() 0
method document_root() 0
method err_headers_out() 0
method filename() 0
method finalize_request_protocol() 0
method finfo() 0
method get_remote_logname() 0
method get_server_name() 0
method handler() 0
method header_only() 0
method headers_in() 0
method headers_out() 0
method hostname() 0
method internal_redirect() 1
method internal_redirect_handler() 1
method is_initial_req() 0
method log() 5
method m_user() 0
method main() 0
method make_content_type() 0
method make_etag() 1
method meets_conditions() 0
method method() 0
method method_number() 0
method mtime() 0
method next() 0
method no_cache() 0
method no_local_copy() 0
method note_auth_failure() 0
method note_basic_auth_failure() 0
method note_digest_auth_failure() 0
method notes() 0
method out() 1
method params() 0
method parsed_uri() 0
method path_info() 0
method prev() 0
method print() 1
method proto_num() 0
method protocol() 0
method proxyreq() 0
method puts() 1
method queries() 0
method range() 0
method rationalize_mtime() 1
method read() 1
method readline() 0
method read_body() 0
method read_chunked() 0
method read_length() 0
method remaining() 0
method request_time() 0
method rflush() 0
method rputs() 1
method send_file() 1
method send_error_response() 1
method server() 0
method set_content_length() 0
method set_content_type() 1
method set_etag() 0
method set_keepalive() 0
method set_max_content_length() 1
method set_status() 1
method setup_client_block() 1
method should_client_block() 0
method some_auth_required() 0
method status() 0
method status_line() 0
method subprocess_env() 0
method the_request() 0
method unparsed_uri() 0
method uri() 0
method vlist_validator() 0
method write() 2


allow_options()

allow_options() returns a bitmap specifying all of the options set for this request (e.g. indexes, includes, symlinks, execcgi). This corresponds to the Options directive for a given directory in the Apache configuration file, which controls which server features are available in a particular directory. The bitmap returned is the set of options governing the directory encompassing the current request.

The possible options returned, which are defined in http_core.h of the Apache source, are as follows:

#define OPT_NONE         0
#define OPT_INDEXES      1
#define OPT_INCLUDES     2
#define OPT_SYM_LINKS    4
#define OPT_EXECCGI      8
#define OPT_UNSET       16
#define OPT_INCNOEXEC   32
#define OPT_SYM_OWNER   64
#define OPT_MULTI      128
#define OPT_ALL        OPT_INDEXES|OPT_INCLUDES|OPT_SYM_LINKS|OPT_EXECCGI

More information on this setting is available in the Apache Core Features documentation.
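
As a rough sketch of how such a bitmap might be inspected from Ruby, the following decodes a value into option names. The constants mirror the C definitions above but are declared locally; whether the Ruby extension exports them under any name is an assumption, and option_names is a hypothetical helper.

```ruby
# Option bits copied from the C definitions above. Declaring them locally is
# an assumption; the Ruby extension may or may not export equivalents.
OPT_BITS = {
  indexes:   1,
  includes:  2,
  sym_links: 4,
  execcgi:   8,
  unset:     16,
  incnoexec: 32,
  sym_owner: 64,
  multi:     128
}.freeze

# Hypothetical helper: list the name of every option bit set in the bitmap.
def option_names(bitmap)
  OPT_BITS.select { |_name, bit| (bitmap & bit) != 0 }.keys
end

# A directory configured with "Options Indexes ExecCGI" would correspond to
# the bits 1 | 8.
names = option_names(1 | 8)
puts names.inspect
```

In a handler, the argument would come from @request.allow_options() rather than a literal.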

allow_overrides()

allow_overrides() returns a bitmap describing the values in the governing AllowOverride setting. This bitmap is composed of the same set of options defined in the section called “allow_options()”.

From the Apache documentation:

When the server finds an .htaccess file (as specified by AccessFileName) it needs to know which directives declared in that file can override earlier configuration directives.

When this directive is set to None, then .htaccess files are completely ignored. In this case, the server will not even attempt to read .htaccess files in the filesystem.

When this directive is set to All, then any directive which has the .htaccess Context is allowed in .htaccess files.

More information on this setting is available in the Apache Core Features documentation.

allowed()

allowed() returns a bitvector of the allowed methods.

From the Apache source documentation:

A handler must ensure that the request method is one that it is capable of handling. Generally modules should DECLINE (HTTP response code) any request methods they do not handle. Prior to aborting the handler in this case, the handler should set allowed to the list of methods that it is willing to handle. This bitvector is used to construct the Allow header required for OPTIONS requests as well as HTTP_METHOD_NOT_ALLOWED and HTTP_NOT_IMPLEMENTED status codes.

Since the default handler deals with OPTIONS, all modules can usually decline to deal with it. TRACE is always allowed, modules don't need to set it explicitly.

Since the default_handler will always handle a GET, a module which does not implement GET should probably return HTTP_METHOD_NOT_ALLOWED. Unfortunately this means that a Script GET handler can't be installed by mod_actions.

The bit vector map, defined in httpd.h, is as follows:

#define M_GET                   0       /** RFC 2616: HTTP */
#define M_PUT                   1       /*  :             */
#define M_POST                  2
#define M_DELETE                3
#define M_CONNECT               4
#define M_OPTIONS               5
#define M_TRACE                 6       /** RFC 2616: HTTP */
#define M_PATCH                 7       /** no rfc(!)  ### remove this one? */
#define M_PROPFIND              8       /** RFC 2518: WebDAV */
#define M_PROPPATCH             9       /*  :               */
#define M_MKCOL                 10
#define M_COPY                  11
#define M_MOVE                  12
#define M_LOCK                  13
#define M_UNLOCK                14      /** RFC 2518: WebDAV */
#define M_VERSION_CONTROL       15      /** RFC 3253: WebDAV Versioning */
#define M_CHECKOUT              16      /*  :                          */
#define M_UNCHECKOUT            17
#define M_CHECKIN               18
#define M_UPDATE                19
#define M_LABEL                 20
#define M_REPORT                21
#define M_MKWORKSPACE           22
#define M_MKACTIVITY            23
#define M_BASELINE_CONTROL      24
#define M_MERGE                 25
#define M_INVALID               26      /** RFC 3253: WebDAV Versioning */

To check whether a given method is supported, you use bitwise operations. Each constant is a bit position rather than a mask, so it must be shifted into place first. For example, to check whether the PUT method is allowed, you would do the following:

req->allowed & (1 << M_PUT)

allowed=()

allowed=() takes an integer value as a bitmask representation of all allowed methods and sets it as the handler's bitvector of allowed methods.

For example, to set the allowed methods to just GET, PUT, OPTIONS and TRACE, you would do the following:

req->allowed = (1 << M_GET) | (1 << M_PUT) | (1 << M_OPTIONS) | (1 << M_TRACE)

See the section called “allowed()” for more information.
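
The bit arithmetic can be tried out in plain Ruby. This is a minimal sketch: the method numbers are copied from the httpd.h listing in the section on allowed(), and the helpers allowed_mask and method_allowed? are hypothetical names, not part of the documented API.

```ruby
# Method numbers copied from the httpd.h excerpt (only a few are needed).
# Defining them locally is an assumption made for this sketch.
M_GET     = 0
M_PUT     = 1
M_POST    = 2
M_OPTIONS = 5
M_TRACE   = 6

# Hypothetical helper: build a bitvector from method numbers. Each number is
# a bit position, so it is shifted into place before being OR-ed in.
def allowed_mask(*method_numbers)
  method_numbers.reduce(0) { |mask, number| mask | (1 << number) }
end

# Hypothetical helper: test whether the bit for one method is set.
def method_allowed?(mask, method_number)
  (mask & (1 << method_number)) != 0
end

mask = allowed_mask(M_GET, M_PUT, M_OPTIONS, M_TRACE)
puts mask                            # 1 + 2 + 32 + 64
puts method_allowed?(mask, M_PUT)
puts method_allowed?(mask, M_POST)
```

The resulting integer is the kind of value allowed=() expects. OR-ing the raw constants without shifting would give a different and unintended mask, since M_GET is 0.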

args()

args() returns the raw query arguments extracted from the URL. You can then parse them with the Ruby CGI class.

In addition to args() you can use the convenience queries() method instead, which decodes, parses and returns the query in the form of an APR table. See the section called “queries()” for more information.
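
For illustration, here is one way a raw query string might be decoded by hand. The text mentions the Ruby CGI class; this sketch swaps in URI.decode_www_form from the standard uri library instead, and the sample string is made up.

```ruby
require 'uri'

# A raw query string of the kind args() would return (sample value).
raw_args = 'name=Ada+Lovelace&lang=ruby&tag=a%26b'

# decode_www_form splits on "&" and "=", and decodes "+" and %XX escapes.
pairs = URI.decode_www_form(raw_args)

pairs.each do |key, value|
  puts "#{key}=#{value}"
end
```

The result is an array of key/value pairs, so repeated keys are preserved in order.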

assbackwards()

assbackwards() indicates whether this is an HTTP/0.9 "simple" request (e.g. GET /foo with no headers).

auth_type()

auth_type() returns the authentication type (the values "Basic" or "Digest") if an authentication check was made.

More information on this setting is available in the Apache Core Features documentation.

sent_bodyct()

sent_bodyct() returns whether the byte count in the stream is for the body. It indicates whether the byte count returned by bytes_sent() refers to the size of the body: the value 1 means it does, 0 means it does not. Exactly what the count refers to in the case of 0 is unclear, as there seems to be no documentation anywhere and very few uses of it in the source. It may simply indicate whether there is a body at all. For example, where the client request asks for headers only (HEAD), sent_bodyct() would presumably be 0.

bytes_sent()

bytes_sent() returns the size in bytes of the body sent back to the client. This value will be zero until some actual data is flushed (sent) to the client (i.e. until you have called rflush()).

chunked()

chunked() returns whether the handler is sending the response using chunked encoding. By default, the response does not use chunked encoding. Chunked encoding is useful if you wish to return the response to the client in pieces. To employ it, you need only call rflush(), which will send all buffered data back to the client in a chunk. Each subsequent call will send another chunk. When the request completes, any remaining data will be sent as the last chunk.
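
The write-then-flush pattern can be sketched as follows. ChunkedRequestStub is a hypothetical stand-in that lets the example run outside Apache; it only records what each flush would have sent and does not reproduce real chunked framing.

```ruby
# Hypothetical stand-in for the request object. It buffers output and
# records each flush as one piece, mimicking the behaviour described above.
class ChunkedRequestStub
  attr_reader :chunks

  def initialize
    @buffer = +''
    @chunks = []
  end

  # Buffers text without sending it.
  def rputs(text)
    @buffer << text
  end

  # Hands the buffered data off as one piece and returns its size in bytes.
  def rflush
    sent = @buffer.bytesize
    @chunks << @buffer.dup
    @buffer.clear
    sent
  end
end

request = ChunkedRequestStub.new

# Each iteration produces part of the response and flushes it immediately,
# so the client would receive three separate pieces.
3.times do |step|
  request.rputs "step #{step + 1} done"
  request.rflush
end

puts request.chunks.length
```

This pattern suits long-running handlers that want to report progress before the whole response is ready.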

clength()

clength() returns the actual length in bytes of the response body. During the handler phase this will always be zero. The reason is that the RSP handler works as the content generator: it creates all of the content.

Furthermore, this value is not set until the RSP handler completes, whereupon the content eventually passes through Apache's content length filter (ap_content_length_filter()), which then computes and sets the content length, in turn setting the Content-Length header. Therefore, this value is for all intents and purposes useless, as you can't set it and it will always be zero during the handler phase.

connection()

connection() returns the Connection object associated with this request. See Section 4, “The Apache Connection Class” for more information.

content()

content() provides a means to get at the request content, specifically for processing data in a POST request. There are two ways to use it. The first is simply to call it, and it will return a string containing the entire request body.

The second is to provide a block with a single argument. In this approach, content() will funnel out the request body by yielding 1024-byte chunks to the block, as in the following example:

# Open in binary mode; the block form closes the file even on error.
File.open('/tmp/file.dat', 'wb') do |f|
  @request.content() do |chunk|
    f.write(chunk)
  end
end

This is of course a trivial example. In the case of a file upload you would have to parse the multipart MIME format (RFC 2388) in place, locate the file part, and save that to the file (this is left as an exercise for the reader). In any case, this latter approach makes processing large file uploads much more efficient, as the file can be read and written to disk in increments, rather than having to first be held in memory. For information on handling multi-part file uploads, see the section called “read()”.

content_encoding()

content_encoding() returns the content encoding.

More information on this setting is available in the Apache MIME Module documentation.

content_languages()

content_languages() returns an APR Array of strings representing content languages.

More information on this setting is available in the Apache MIME Module documentation.

content_type()

content_type() returns the content type of the request. This will always be "text/html" for .rhtml and .rsp files (assuming the administrator properly configured the MIME types in Apache).

More information on this setting is available in the Apache MIME Module documentation.

default_type()

default_type() returns the default content type from the configuration, or text/plain if none is set.

discard_request_body()

discard_request_body() reads and discards any body sent with the request.

From the Apache source documentation:

In HTTP/1.1, any method can have a body. However, most GET handlers wouldn't know what to do with a request body if they received one. This helper routine tests for and reads any message body in the request, simply discarding whatever it receives. We need to do this because failing to read the request body would cause it to be interpreted as the next request on a persistent connection.

document_root()

document_root() returns the document root from the configuration (not necessarily the active one for this request).

Warning

This is included for backward compatibility, but is not always accurate (e.g. if mod_userdir is enabled). It should not be used.

err_headers_out()

err_headers_out() returns an APR table containing the MIME headers for the response that are printed even on error.

While you might be wondering whether these headers are worth anything, as it turns out, they are. Apache requires all headers be sent to err_headers_out rather than headers_out if the response code is anything other than 200. This fact is critical to redirects.
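
A redirect built on this rule might look like the following sketch. RedirectRequestStub and the redirect helper are hypothetical, and plain Hashes stand in for the APR tables; whether the real tables accept the same []= syntax is an assumption.

```ruby
# Hypothetical stand-in for the request object so the sketch runs outside
# Apache. Plain Hashes take the place of the APR tables.
class RedirectRequestStub
  attr_reader :headers_out, :err_headers_out, :status

  def initialize
    @headers_out = {}
    @err_headers_out = {}
    @status = 200
  end

  def set_status(code)
    @status = code
  end
end

# Hypothetical helper: issue a redirect. Because the status is not 200, the
# Location header goes into err_headers_out rather than headers_out.
def redirect(request, url)
  request.err_headers_out['Location'] = url
  request.set_status(302)
end

request = RedirectRequestStub.new
redirect(request, 'http://www.example.com/login')
puts request.status
puts request.err_headers_out['Location']
```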

filename()

filename() returns the filename on disk corresponding to this response.

finalize_request_protocol()

finalize_request_protocol() is cryptic, undocumented, and unimportant. The Apache source says it is "Called at completion of sending the response. It sends the terminating protocol information." What that means, I don't know. If you ever figure out what it does, go right ahead and use it.

finfo()

finfo() returns an APR FileInfo object for the requested file if it exists, or zero if there is no such file. See Section 8, “The APR FileInfo Class” for more information.

get_remote_logname()

get_remote_logname() returns the login name of the remote user. Undefined if it cannot be determined.

get_server_name()

get_server_name() returns the server name from the request.

handler()

handler() returns the name of the handler assigned to this request.

header_only()

header_only() returns 1 if this is a HEAD request, as opposed to GET.

headers_in()

headers_in() returns an APR table containing the MIME headers from the request.

headers_out()

headers_out() returns an APR table containing the MIME headers for the response.

hostname()

hostname() returns the host name, as set by the full URI or the Host header.

internal_redirect()

internal_redirect() initiates an internal redirect to another URI in this server.

Definition

def internal_redirect(uri)

  • uri: the new uri to redirect to.

internal_redirect_handler()

internal_redirect_handler() is designed for things like actions or CGI scripts, when using AddHandler and you want to preserve the content type across an internal redirect.

Definition

def internal_redirect_handler(uri)

  • uri: the URI to replace the current request with.

is_initial_req()

is_initial_req() returns 1 if this is the main request or 0 if it is a subrequest.

log()

log() logs a message to the Apache logfile.

Definition

def log(level, message)

  • level: the log level, which can be one of the following:

    #define APLOG_EMERG     0       /* system is unusable */
    #define APLOG_ALERT     1       /* action must be taken immediately */
    #define APLOG_CRIT      2       /* critical conditions */
    #define APLOG_ERR       3       /* error conditions */
    #define APLOG_WARNING   4       /* warning conditions */
    #define APLOG_NOTICE    5       /* normal but significant condition */
    #define APLOG_INFO      6       /* informational */
    #define APLOG_DEBUG     7       /* debug-level messages */
    

  • message: the log message text.
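
A call to log() might be exercised as in the sketch below. The level constants are copied from the listing above and declared locally, which is an assumption since the Ruby extension may already provide them. LogRequestStub is a hypothetical stand-in that collects entries in memory instead of writing to the Apache log file.

```ruby
# Log levels copied from the C definitions above. Declaring them locally is
# an assumption; the Ruby extension may already provide them.
APLOG_ERR    = 3
APLOG_NOTICE = 5

# Hypothetical stand-in for the request object; it collects log entries in
# an array instead of writing to the Apache log file.
class LogRequestStub
  attr_reader :entries

  def initialize
    @entries = []
  end

  def log(level, message)
    @entries << [level, message]
  end
end

request = LogRequestStub.new
request.log(APLOG_NOTICE, 'handler started')

begin
  Integer('not a number')
rescue ArgumentError => e
  # Report the failure at error level, including the exception text.
  request.log(APLOG_ERR, "bad input: #{e.message}")
end

puts request.entries.length
```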

m_user()

m_user() returns the user name, if an authentication check was made, otherwise it returns nil.

main()

main() returns the Request object corresponding to the main request. This is helpful if the current request is a subrequest and needs to obtain information about the original request.

make_content_type()

make_content_type() is a wrapper around the Apache function ap_make_content_type(). From the Apache source documentation:

Build the content-type that should be sent to the client from the content-type specified. The following rules are followed:

  • If type is NULL, type is set to ap_default_type(req).

  • If charset adding is disabled, stop processing and return type.

  • Then, if there are no parameters on type, add the default charset and return type.

make_etag()

make_etag() is a wrapper around the Apache function ap_make_etag(), which constructs an entity tag from the resource information. If it's a real file, build in some of the file characteristics. For more information on ETags, see the Apache Core documentation.

Definition

def make_etag(force_weak)

  • force_weak: Force the entity tag to be weak — it could be modified again in as short an interval.

meets_conditions()

meets_conditions() is a wrapper around the Apache function ap_meets_conditions(), which implements the conditional GET rules of the HTTP/1.1 specification. It inspects the client headers and determines whether the response fulfills the requirements specified. It returns OK if the response fulfills the conditional GET rules, and some other status code otherwise.

method()

method() returns the request method (e.g. GET, HEAD, POST, etc.).

method_number()

method_number() returns the method number of the request (e.g. M_GET, M_POST, etc.). These method constants are documented in Table A.1, “Apache HTTP Method Constants”.

mtime()

mtime() returns the last modified time of the requested resource.

next()

next() returns the redirected request if this is an external redirect.

no_cache()

no_cache() returns 1 if this response cannot be cached, 0 otherwise.

no_local_copy()

no_local_copy() returns 1 if there is no local copy of this response, 0 otherwise.

note_auth_failure()

note_auth_failure() sets up the output headers so that the client knows how to authenticate itself the next time, if an authentication request failed. This function works for both basic and digest authentication.

note_basic_auth_failure()

note_basic_auth_failure() sets up the output headers so that the client knows how to authenticate itself the next time, if an authentication request failed. This function works only for basic authentication.

note_digest_auth_failure()

note_digest_auth_failure() sets up the output headers so that the client knows how to authenticate itself the next time, if an authentication request failed. This function works only for digest authentication.

notes()

notes() returns an APR table containing notes to pass from one module to another.

out()

out() writes a string out via the Apache ap_rputs(). This is an alias for the rputs() method.

Definition

def out(text)

  • text: text to write out

params()

params() returns an APR table containing all of the form data posted, provided that the request method is a POST and the content type is application/x-www-form-urlencoded (not multipart/form-data — which is handled a different way). This method does a lot of work under the hood to get this data. It mediates the transfer, loads, decodes, and organizes the form data into the APR table that is returned.

The APR table returned is cached, so this process is only performed on the first call. Subsequent calls will return the same table. If this function is called on a request which is not a POST, it will return nil.

Note

By default this function currently limits the amount of data that can be sent to 20 MB. Larger values can be accommodated by setting them using the Request::set_max_content_length() method.

parsed_uri()

parsed_uri() is currently not implemented. In the future, it will return a Ruby object that wraps the apr_uri_t structure.

path_info()

path_info() returns the path info extracted from this request.

prev()

prev() returns the previous request object if this is an internal redirect.

print()

print() writes a string out via the Apache ap_rputs(). This is an alias for the rputs() method.

Definition

def print(text)

  • text: text to write out.

proto_num()

proto_num() returns the protocol version number (e.g. HTTP/1.1 = 1001).
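
The 1.1 = 1001 example is consistent with the encoding major * 1000 + minor, which is what Apache's HTTP_VERSION macro computes. The sketch below reproduces that arithmetic; protocol_number is a hypothetical helper, not part of the documented API.

```ruby
# Hypothetical helper: turn a protocol string such as "HTTP/1.1" into the
# numeric form described above (major * 1000 + minor).
def protocol_number(protocol)
  match = %r{\AHTTP/(\d+)\.(\d+)\z}.match(protocol)
  raise ArgumentError, "unrecognized protocol: #{protocol}" unless match

  match[1].to_i * 1000 + match[2].to_i
end

puts protocol_number('HTTP/1.1')
puts protocol_number('HTTP/1.0')
```

Comparing these integers is a simple way to branch on protocol version without parsing the string returned by protocol() each time.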

protocol()

protocol() returns the protocol string, as given to us, or HTTP/0.9.

proxyreq()

proxyreq() returns the proxy request type (calculated during post_read_request and translate_name). Possible values are PROXYREQ_NONE, PROXYREQ_PROXY, PROXYREQ_REVERSE, and PROXYREQ_RESPONSE.

puts()

puts() writes a string out via the Apache ap_rputs(). This is an alias for the rputs() method.

Definition

def puts(text)

  • text: text to write out.

queries()

queries() decodes, parses and returns the query arguments in the URL as an APR table. This is a convenience function provided in order to alleviate the need of using the Ruby CGI class. Additionally, it provides the standard interface of a true APR table. It is implemented in C.

@request.puts "Queries: "
queries = @request.queries()

queries.each do |key, value|
  @request.puts "  #{key}=#{value}"
end

range()

range() returns the HTTP range header value.

rationalize_mtime()

rationalize_mtime() returns the latest rational time from a request/mtime pair. mtime is returned unless it's in the future, in which case we return the current time.

Definition

def rationalize_mtime(mtime)

  • mtime: The last modified time.
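
The rule itself is small enough to sketch in plain Ruby. This is only an illustration of the behaviour described above, not the wrapped Apache implementation; rationalized_mtime is a hypothetical helper and uses Time objects for readability.

```ruby
# Hypothetical helper mirroring the rule described above: keep mtime unless
# it lies in the future, in which case fall back to the current time.
def rationalized_mtime(mtime, now = Time.now)
  mtime > now ? now : mtime
end

now    = Time.at(1_000_000)
past   = now - 60
future = now + 60

puts rationalized_mtime(past, now) == past
puts rationalized_mtime(future, now) == now
```

Clamping in this way keeps a file with a bad timestamp from producing a Last-Modified date later than the response itself.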

read()

read() returns a buffer containing content of the request body. It takes a single argument specifying the number of bytes to read. It returns a buffer containing that many bytes, or whatever is left. If there is no content remaining, it returns nil.

read(), along with readline(), is used in conjunction with the RSP::Rfc2388 parser class to process multipart forms and file uploads. There are few, if any, other reasons to use them by themselves. They basically provide the minimal requirements for the parser class to operate on the Request object as if it were a file. The following example illustrates its use:

  def processForm()

    @req.log(APLOG_NOTICE, 'file upload')

    @req.set_content_type('text/html')

    # Get the multipart boundary
    boundary = @req.boundary()

    # Create the parser
    parser = RSP::Rfc2388::Parser.new()

    # Parse the form. The first argument is a file-like object (supports read()
    # and readline()). The second argument is the multi-part form boundary that
    # divides the content parts.
    parser.parse(@req,  boundary) do |part|

      # Look for a file argument. If exists, then we have a file upload.
      if part.fileName != nil
        
        # Open the file

        # Normally, you wouldn't want to save the file using the posted filename
        # as this can be error prone (with IE and all). Typically we would use
        # the form name (part.name). But for testing it is okay, since we
        # control the file we are parsing.

        # File.open avoids Kernel#open's special handling of names that start
        # with "|"; binary mode keeps the uploaded bytes intact.
        upload_file = File.open(part.fileName, 'wb')
        
        # Read all of the data out of the content part. This ultimately operates
        # on the underlying file object (@req). By default, the content is read
        # out in 4k blocks.
        bytes = 0
        part.read() do |content|
          bytes += content.size
          upload_file.write(content)
        end
        
        # Close
        upload_file.close()
      else
        
        # Else this is just simply content part of some kind and we will just
        # store the data in the part's internal buffer.
        part.read()
      end
      
      # Print what we did
      if part.fileName != nil
        @req.puts "file %-20s: %s" % [part.fileName, part.contentType]
      else
        @req.puts "%-25s: %s" % [part.name, part.data]
      end
    end    
  end

Note that all calls to read() and readline() are abstracted away in the parser class. This is because parsing is a pain, and the parser class manages all of that for you. The end result is that the parser feeds parsed content sections into the block supplied to it, in the form of Part objects. These objects, like content sections, have headers and data. The headers are already parsed. The data is not yet read, allowing you to read it in discrete parts, using the same block/callback mechanism. This approach allows file uploads (and multi-part forms in general) to be handled sequentially, as a continuous stream, rather than all at once. This avoids having to store the entire form in memory in order to parse/process it.

The bottom line is that all the parsing details are abstracted away, and your code only has to deal with processing the incoming data. The additional benefit is that this approach can handle arbitrarily large files without any increase in memory requirements, as it does not have to store the entire form in memory.

Definition

def read(bytes)

  • bytes: the number of bytes of content to read.
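
The contract described for read() (up to the requested number of bytes, whatever is left at the end, then nil) is the same one Ruby's own StringIO#read follows, so a StringIO can stand in for the request object in a small sketch. The drain helper is a hypothetical name.

```ruby
require 'stringio'

# Hypothetical helper: drain a file-like object in fixed-size pieces using
# only read(bytes), stopping when read returns nil.
def drain(source, block_size)
  pieces = []
  while (piece = source.read(block_size))
    pieces << piece
  end
  pieces
end

# StringIO stands in for the request object here.
body = StringIO.new('abcdefghij')
pieces = drain(body, 4)
puts pieces.inspect
```

The final piece is shorter than the block size, and the loop ends on the nil that follows it.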

readline()

readline() reads a single line of text from the content of the request body. It looks for a CRLF (\r\n) as the line terminator. If this is not found, it will return the entire request content.
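
The terminator handling can be approximated with StringIO#gets and an explicit separator. This sketch makes two assumptions the text does not confirm: that the returned line keeps its trailing CRLF, and that when no CRLF is found only the unread remainder is returned. read_crlf_line is a hypothetical helper.

```ruby
require 'stringio'

# Hypothetical helper approximating the behaviour described above: return
# the next line up to and including CRLF, or everything left if no CRLF is
# found. Returns nil once the input is exhausted.
def read_crlf_line(source)
  source.gets("\r\n")
end

body = StringIO.new("first line\r\nsecond line\r\ntrailing data")

first  = read_crlf_line(body)
second = read_crlf_line(body)
rest   = read_crlf_line(body)

puts first.inspect
puts second.inspect
puts rest.inspect
```

A bare \n inside a line does not end it in this sketch, which matters for multipart bodies where headers are CRLF-delimited.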

read_body()

read_body() returns the policy used for reading the request body (e.g. REQUEST_CHUNKED_ERROR).

read_chunked()

read_chunked() returns 1 if the request body is being read using chunked transfer-coding.

read_length()

read_length() returns the number of bytes that have been read from the request body.

remaining()

remaining() returns the number of bytes left to read from the request body.

request_time()

request_time() returns the time when the request started.

rflush()

rflush() calls ap_rflush(), which flushes all of the data for the current request to the client. Returns the number of bytes sent.

rputs()

rputs() writes a string out via the Apache ap_rputs().

Definition

def rputs(text)

  • text: text to write out.

send_file()

send_file() sends the full contents of a file to the client (using chunked encoding). Where the platform supports it, send_file() uses optimized OS system calls to send the file contents through the kernel, minimizing file copy overhead. For example, on Linux and BSD Apache will use the sendfile(2) system call to transfer the file contents, resulting in a "zero-copy" transfer, meaning that copying of the file data is avoided.

Definition

def send_file(path)

  • path: full path of the file to send. If a relative path is used, it will be relative to the current working directory, which differs based on the context in which the request is being handled. Within the RSP environment, the current working directory is the directory containing the RHTML or RSP file being processed. Otherwise it is the default working directory set by Apache.

send_error_response()

send_error_response() sends an error code back to the client.

Definition

def send_error_response(status)

  • status: error code to send.

server()

server() returns the server object associated with this request. For more information, see Section 3, “The Apache Server Class”.

set_content_length()

set_content_length() calls ap_set_content_length(). This sets the content length for the request.

Definition

def set_content_length(length)

  • length: number of bytes in content.

set_content_type()

set_content_type() sets the content type for this request.

Definition

def set_content_type(type)

  • type: string containing content type.

set_etag()

set_etag() sets the ETag outgoing header. For more information on ETags, see the Apache Core documentation.

set_keepalive()

set_keepalive() sets the keepalive status for this request.

set_max_content_length()

set_max_content_length() sets the maximum amount of form data, in bytes, that the module will accept from the client. The default is 20 MB. To make this unlimited, set this value to -1. If the POST form data exceeds this amount, then params() will set an error and return nil.

Definition

def set_max_content_length(bytes)

  • bytes: Maximum POST form data in bytes.

set_status()

set_status() sets the HTTP status code to return (e.g. 200, 302, 404, etc.).

Definition

def set_status(status)

  • status: HTTP status code to return.

setup_client_block()

setup_client_block() sets up the client to allow Apache to read the request body. This is usually used for processing form data in a POST. If that is your goal, you need not call this method yourself; use the params() method, which does all of the work for you.

Definition

def setup_client_block(read_policy)

  • read_policy: How the server should interpret a chunked transfer-encoding. This can be one of the following:

        REQUEST_NO_BODY          Send 413 error if message has any body
        REQUEST_CHUNKED_ERROR    Send 411 error if body without Content-Length
        REQUEST_CHUNKED_DECHUNK  If chunked, remove the chunks for me.
    

should_client_block()

should_client_block() determines if the client has sent any data. This also sends a 100 Continue response to HTTP/1.1 clients, so it should not be called until the module is ready to read content. This is often used in the processing of form data. You most likely will never need to use this method.

some_auth_required()

some_auth_required() determines if any authentication is required for the current request. It returns 1 if authentication is required, 0 otherwise.

status()

status() returns the current HTTP status code.

status_line()

status_line() returns the current HTTP status line (text).

subprocess_env()

subprocess_env() returns an APR table of environment variables passed to the subprocess, if one exists.

the_request()

the_request() returns the first line of actual HTTP request content.

unparsed_uri()

unparsed_uri() returns the URI without any parsing performed.

uri()

uri() returns just the path portion of the URI.

vlist_validator()

vlist_validator() returns the variant list validator (if negotiated).

write()

write() writes binary data directly to the socket via the Apache ap_rwrite(). This is useful when you want to send images or binary files directly from a controller method, such as a captcha image generated with RMagick.

Definition

def write(data, size)

  • data: The data to write out.

  • size: The size of the data in bytes.
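
A call to write() might look like the following sketch. WriteRequestStub is a hypothetical stand-in so the example can run outside Apache, and the eight-byte PNG signature takes the place of real image data, which in practice would come from an image library.

```ruby
# Hypothetical stand-in for the request object; it records what would have
# been written to the client.
class WriteRequestStub
  attr_reader :content_type, :body

  def initialize
    @body = String.new # empty binary (ASCII-8BIT) buffer
  end

  def set_content_type(type)
    @content_type = type
  end

  # Appends the first size bytes of data to the recorded body.
  def write(data, size)
    @body << data.byteslice(0, size)
    size
  end
end

# The eight-byte PNG signature stands in for real image data.
image = "\x89PNG\r\n\x1A\n".b

request = WriteRequestStub.new
request.set_content_type('image/png')

# Pass bytesize rather than length: the size argument counts bytes.
request.write(image, image.bytesize)
puts request.body.bytesize
```

Using bytesize matters for any string whose encoding is not binary, where length counts characters instead of bytes.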