The Apache request is the largest data structure in the Apache module API and encompasses every aspect of the HTTP request, including all other associated data structures such as the connection, process, and server structures. It is the primary means by which to interact with Apache.
This data structure originates in the Apache API as the
request_rec structure (documented here). It
is encapsulated, and somewhat extended, in C++ form in mod_rsp
by the apache::Request
class
(request.cpp
). This, in turn, is then encapsulated in Ruby
by the Apache::Ruby
class, defined in the Ruby extension
ruby_request.cpp
.
Table B.1, “Apache Request Methods” contains the exhaustive list of methods, in alphabetical order. The documentation of each method follows.
Table B.1. Apache Request Methods
allow_options()
returns a bitmap specifying all of
the options set for this request (e.g. indexes, includes, sym links,
execcgi). This corresponds to the Options
directive for a
given directory in the Apache configuration file, which controls which server
features are available in a particular directory. The options bitmap in this
case returns the set of options governing the directory encompassing the
current request.
The possible options returned, which are defined in
http_core.h
of the Apache source, are as follows:
#define OPT_NONE 0
#define OPT_INDEXES 1
#define OPT_INCLUDES 2
#define OPT_SYM_LINKS 4
#define OPT_EXECCGI 8
#define OPT_UNSET 16
#define OPT_INCNOEXEC 32
#define OPT_SYM_OWNER 64
#define OPT_MULTI 128
#define OPT_ALL (OPT_INDEXES|OPT_INCLUDES|OPT_SYM_LINKS|OPT_EXECCGI)
More information on this setting is available in the Apache Core Features documentation.
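As a rough illustration of decoding this bitmap from Ruby, the sketch below mirrors the constants listed above and reports which flags are set in a given value. The `OPT_FLAGS` hash and the `option_names` helper are hypothetical names introduced here, and the sketch assumes the bitmap arrives as a plain Integer.

```ruby
# Option bits mirrored from the #define list above (OPT_NONE and OPT_ALL omitted).
OPT_FLAGS = {
  "OPT_INDEXES"   => 1,
  "OPT_INCLUDES"  => 2,
  "OPT_SYM_LINKS" => 4,
  "OPT_EXECCGI"   => 8,
  "OPT_UNSET"     => 16,
  "OPT_INCNOEXEC" => 32,
  "OPT_SYM_OWNER" => 64,
  "OPT_MULTI"     => 128
}.freeze

# Hypothetical helper: returns the names of all flags set in the bitmap.
def option_names(bitmap)
  OPT_FLAGS.select { |_name, bit| (bitmap & bit) != 0 }.keys
end
```

With a request object in hand, this could be called as `option_names(@request.allow_options())`, assuming that method returns an Integer.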
allow_overrides()
returns a bitmap describing the
values in the governing AllowOverride
setting. This bitmap
is composed of the same set of options defined in the section called “allow_options()”.
From the Apache documentation:
When the server finds an .htaccess
file
(as specified by AccessFileName
) it needs to know which
directives declared in that file can override earlier configuration
directives.
When this directive is set to None
, then
.htaccess
files are completely ignored. In this case, the
server will not even attempt to read .htaccess files in the filesystem.
When this directive is set to All
, then any directive
which has the .htaccess
Context
is allowed in .htaccess
files.
More information on this setting is available in the Apache Core Features documentation.
allowed()
returns a bitvector of the allowed methods.
From the Apache source documentation:
A handler must ensure that the request method is one that it is capable of
handling. Generally modules should DECLINE
(HTTP response
code) any request methods they do not handle. Prior to aborting the handler in
this case, the handler should set allowed
to the list of
methods that it is willing to handle. This bitvector is used to construct the
Allow
header required for OPTIONS
requests
as well as HTTP_METHOD_NOT_ALLOWED
and
HTTP_NOT_IMPLEMENTED
status codes.
Since the default handler deals with OPTIONS
, all
modules can usually decline to deal with it. TRACE
is always
allowed, modules don't need to set it explicitly.
Since the default_handler will always handle a GET
, a
module which does not implement GET
should probably return HTTP_METHOD_NOT_ALLOWED
. Unfortunately
this means that a Script GET
handler can't be installed by
mod_actions
.
The bit vector map, defined in httpd.h
, is as follows:
#define M_GET 0 /** RFC 2616: HTTP */
#define M_PUT 1 /* : */
#define M_POST 2
#define M_DELETE 3
#define M_CONNECT 4
#define M_OPTIONS 5
#define M_TRACE 6 /** RFC 2616: HTTP */
#define M_PATCH 7 /** no rfc(!) ### remove this one? */
#define M_PROPFIND 8 /** RFC 2518: WebDAV */
#define M_PROPPATCH 9 /* : */
#define M_MKCOL 10
#define M_COPY 11
#define M_MOVE 12
#define M_LOCK 13
#define M_UNLOCK 14 /** RFC 2518: WebDAV */
#define M_VERSION_CONTROL 15 /** RFC 3253: WebDAV Versioning */
#define M_CHECKOUT 16 /* : */
#define M_UNCHECKOUT 17
#define M_CHECKIN 18
#define M_UPDATE 19
#define M_LABEL 20
#define M_REPORT 21
#define M_MKWORKSPACE 22
#define M_MKACTIVITY 23
#define M_BASELINE_CONTROL 24
#define M_MERGE 25
#define M_INVALID 26 /** RFC 3253: WebDAV Versioning */
To check whether a given method is supported, you use bitwise
operations. For example, to check whether the PUT
command is
allowed, you would do the following:
req->allowed & (1 << M_PUT)
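The same test can be sketched in Ruby, assuming the value returned by `allowed()` is a plain Integer. The constants below copy a few of the method numbers from the list above, and `method_allowed?` is a hypothetical helper name, not part of the API.

```ruby
# Method numbers mirrored from httpd.h (only a few are shown).
M_GET     = 0
M_PUT     = 1
M_POST    = 2
M_DELETE  = 3
M_OPTIONS = 5
M_TRACE   = 6

# Hypothetical helper: true if the bit for method_number is set in the bitvector.
def method_allowed?(allowed, method_number)
  (allowed & (1 << method_number)) != 0
end
```

Note that the method number is a bit position, not a mask, so it must be shifted before it is combined with the bitvector.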
allowed=()
takes an integer value as a bitmask
representation of all allowed methods and sets it as the handler's bitvector of
the allowed methods.
For example, to set the allowed methods to just GET, PUT, OPTIONS and TRACE, you would do the following:
req->allowed = (1 << M_GET) | (1 << M_PUT) | (1 << M_OPTIONS) | (1 << M_TRACE)
See the section called “allowed()” for more information.
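A minimal Ruby sketch of building such a bitvector follows. It assumes the setter accepts a plain Integer; `allowed_mask` is a hypothetical helper, and the constants copy the method numbers from the section called “allowed()”.

```ruby
# Method numbers mirrored from httpd.h (only those used here).
M_GET     = 0
M_PUT     = 1
M_OPTIONS = 5
M_TRACE   = 6

# Hypothetical helper: OR together one bit per method number.
def allowed_mask(*method_numbers)
  method_numbers.reduce(0) { |mask, number| mask | (1 << number) }
end

mask = allowed_mask(M_GET, M_PUT, M_OPTIONS, M_TRACE)
```

Combining the raw method numbers without shifting would not work: `M_GET | M_PUT` is just `0 | 1`, which marks only GET as allowed.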
args()
returns the query args extracted from the URL. You
can then parse them with the Ruby CGI
class.
As an alternative to args(), you can use the convenience queries() method, which decodes, parses and returns the query in the form of an APR table. See the section called “queries()” for more information.
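As a sketch of parsing the raw query string by hand, the example below uses `URI.decode_www_form` from the Ruby standard library rather than the CGI class mentioned above, since it needs nothing beyond `uri`. The sample string is made up for illustration and stands in for whatever `args()` returns.

```ruby
require "uri"

# A sample raw query string, standing in for the value returned by args().
raw_args = "name=Jo+Ann&lang=ruby&lang=c"

# decode_www_form returns an array of [key, value] pairs, already URL-decoded.
pairs = URI.decode_www_form(raw_args)

# Group repeated keys so each key maps to an array of its values.
query = pairs.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(key, value), acc|
  acc[key] << value
end
```

Grouping into arrays keeps repeated parameters (such as `lang` here) instead of silently dropping all but one.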
assbackwards()
sets the handler to return an HTTP/0.9
"simple" request (e.g. GET /foo
with no headers).
auth_type()
If an authentication check was made, this
gets set to the auth type (the values "Basic"
or
"Digest"
).
More information on this setting is available in the Apache Core Features documentation.
sent_bodyct()
returns whether the byte count in
stream is for body. It's not clear if anybody really knows what this is for. It
indicates whether the byte count returned by
bytes_sent()
refers to the size of the body. The value
1 means it does, 0 means it does not. Exactly what it refers to in the case of 0
is anybody's guess, as there seems to be absolutely zero documentation anywhere
and very few uses of it in the source. It may simply indicate whether there is a body at all. For example, when the client requests headers only (HEAD), sent_bodyct() would be 0.
bytes_sent()
returns the size in bytes of the body. This is the number of bytes in the body sent back to the client. This value will be zero until some actual data is flushed (sent) to the client (i.e. it stays zero as long as you have not called rflush()).
chunked()
returns whether the handler is sending the response using chunked encoding. By default, the response will not use chunked encoding. Chunked encoding is useful if you wish to return the response to the client in pieces. To employ it, you need only call rflush(), which will send all buffered data back to the client in a chunk. Each subsequent call will send another chunk. When the request completes, any remaining data will be sent as the last chunk.
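The write-then-flush pattern can be sketched with a stand-in object. `FakeRequest` below is hypothetical and only models the buffering described above; it does not talk to Apache, and its `rflush` return value follows the description of rflush() elsewhere in this reference.

```ruby
# Hypothetical stand-in for the request object; it only models buffering.
class FakeRequest
  attr_reader :chunks

  def initialize
    @buffer = +""
    @chunks = []
  end

  # Append text to the pending output buffer.
  def puts(text)
    @buffer << text
  end

  # Emit everything buffered so far as one chunk; returns the bytes sent.
  def rflush
    sent = @buffer.bytesize
    @chunks << @buffer unless @buffer.empty?
    @buffer = +""
    sent
  end
end

request = FakeRequest.new
request.puts "step 1 done\n"
request.rflush   # first chunk goes out
request.puts "step 2 done\n"
request.rflush   # second chunk goes out
```

In a real handler the same sequence of calls would be made on the request object itself, with each flush delivering the output produced so far.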
clength()
returns the actual length in bytes of the
response body. This will always be zero. The reason is that the RSP handler
works as the content generator — it creates all of the
content.
Furthermore, this value is not set until the RSP handler completes,
whereupon the content eventually passes through Apache's content length filter
(ap_content_length_filter()
), which then computes and sets
the content length, in turn setting the Content-Length
header. Therefore, this value is for all intents and purposes useless as you
can't set it, and it will always be zero during the handler phase.
connection()
returns the
Connection
object associated with this request. See Section 4, “The Apache Connection Class” for more information.
content()
provides a means to get at the request
content, specifically for processing data in a POST request. There are two ways to
use it. The first is simply to call it, and it will return a string containing
the entire request body.
The second is to provide a block with a single argument. In this approach,
content()
will then funnel out the request body by
yielding 1024-byte chunks to the block, as in the following example:
f = open('/tmp/file.dat','wb')
@request.content() do |chunk|
  f.write(chunk)
end
f.close()
This is of course a trivial example. In the case of a file upload you would have to parse the multipart MIME format (RFC 2388) in place, locate the file part, and save that to the file (this is left as an exercise for the reader). In any case, this latter approach makes processing large file uploads much more efficient, as the file can be read and written to disk in increments, rather than having to first be held in memory. For information on handling multi-part file uploads, see the section called “read()”.
content_encoding()
returns the content encoding.
More information on this setting is available in the Apache MIME Module documentation.
content_languages()
returns an APR Array of strings
representing content languages.
More information on this setting is available in the Apache MIME Module documentation.
content_type()
returns the content type of the
request. This will always be "text/html" for .rhtml
and
.rsp
files (assuming the administrator properly configured
the MIME types in Apache).
More information on this setting is available in the Apache MIME Module documentation.
default_type()
returns the default content type from
the configuration, or text/plain
if none is set.
discard_request_body()
returns whether the request
body will be discarded.
From the Apache source documentation:
In HTTP/1.1, any method can have a body. However, most
GET
handlers wouldn't know what to do with a request body if
they received one. This helper routine tests for and reads any message body in
the request, simply discarding whatever it receives. We need to do this because
failing to read the request body would cause it to be interpreted as the next
request on a persistent connection.
document_root()
returns the document root from the
configuration (not necessarily the active one for this request). This method is kept for backward compatibility, but is not always accurate (e.g. if mod_userdir is enabled). It should not be used.
err_headers_out()
returns an APR table containing the
MIME headers for the response, which are printed even on error.
While you might be wondering whether these headers are worth anything, as
it turns out, they are. Apache requires all headers be sent to
err_headers_out
rather than headers_out
if
the response code is anything other than 200. This fact is critical to
redirects.
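To make the redirect case concrete, here is a minimal sketch using stand-in objects. `FakeResponse` and `redirect` are hypothetical names, and a plain Hash plays the role of the APR table; whether the real table supports `[]=` in the same way is an assumption.

```ruby
# Hypothetical stand-in for the request; a Hash plays the role of the APR table.
FakeResponse = Struct.new(:status, :err_headers_out)

# Hypothetical helper: record a redirect by setting Location in err_headers_out,
# since the status will not be 200.
def redirect(response, location, status = 302)
  response.err_headers_out["Location"] = location
  response.status = status
  response
end

response = redirect(FakeResponse.new(200, {}), "/login")
```

The point of the sketch is only where the header goes: because the status is 302 rather than 200, the Location header is placed in `err_headers_out`.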
filename()
returns the filename on disk corresponding
to this response.
finalize_request_protocol()
is cryptic, undocumented, and unimportant. The Apache source says it is "Called at completion of sending the response. It sends the terminating protocol information." What that means, I don't know. If you ever figure out what it does, go right ahead and use it.
finfo()
returns an APR FileInfo object for the request file (if it exists), or zero if there is no such file. See Section 8, “The APR FileInfo Class” for more information.
get_remote_logname()
returns the login name of the
remote user. Undefined if it cannot be determined.
get_server_name()
returns the server name from
the request.
handler()
returns the name of the handler assigned
to this request.
header_only()
returns 1 if this is a
HEAD
request, as opposed to GET
.
headers_in()
returns an APR table containing the MIME
headers from the request.
headers_out()
returns an APR table containing the
MIME headers for the response.
hostname()
returns the host name, as set by the full URI or the Host: header.
internal_redirect()
initiates an internal redirect to
another URI in this server
def internal_redirect(uri)
uri
: the new uri to redirect
to.
internal_redirect_handler()
is designed for things
like actions or CGI scripts, when using AddHandler
and you
want to preserve the content type across an internal redirect.
def internal_redirect_handler(uri)
uri
: the URI to replace the current
request with.
is_initial_req()
returns 1 if this is the main
request or 0 if it is a subrequest.
log()
logs a message to the Apache logfile.
def log(level, message)
level
: the log level, which can be one of
the following:
#define APLOG_EMERG 0 /* system is unusable */
#define APLOG_ALERT 1 /* action must be taken immediately */
#define APLOG_CRIT 2 /* critical conditions */
#define APLOG_ERR 3 /* error conditions */
#define APLOG_WARNING 4 /* warning conditions */
#define APLOG_NOTICE 5 /* normal but significant condition */
#define APLOG_INFO 6 /* informational */
#define APLOG_DEBUG 7 /* debug-level messages */
message
: the log message text.
m_user()
returns the user name, if an authentication
check was made, otherwise it returns nil
.
main()
returns the Request
object corresponding to the main request. This is helpful if the current request
is a subrequest and needs to obtain information about the original request.
make_content_type()
is a wrapper around the Apache
function ap_make_content_type()
. From the Apache source
documentation:
Build the content-type that should be sent to the client from the content-type specified. The following rules are followed:
If type is NULL
, type is set to
ap_default_type(req)
.
If charset adding is disabled, stop processing and return type.
Then, if there are no parameters on type, add the default charset. Return type.
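The rules listed above can be sketched in Ruby as follows. This is an illustration of the listed rules only, not Apache's actual implementation: the keyword parameters are hypothetical, a `nil` default charset stands for "charset adding is disabled", and "has parameters" is approximated by the presence of a semicolon.

```ruby
# Hypothetical sketch of the rules listed above; not Apache's actual code.
# default_charset = nil models "charset adding is disabled".
def make_content_type(type, default_type:, default_charset: nil)
  type = default_type if type.nil?
  return type if default_charset.nil?
  return type if type.include?(";") # type already carries parameters
  "#{type}; charset=#{default_charset}"
end
```

For instance, a `nil` type with a default type of `text/plain` and a default charset of `utf-8` yields `text/plain; charset=utf-8`, while a type that already names a charset is returned unchanged.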
make_etag()
is a wrapper around the Apache function
ap_make_etag()
, which constructs an entity tag from the
resource information. If it's a real file, build in some of the file
characteristics. For more information on ETags, see the Apache Core
documentation.
def make_etag(force_weak)
force_weak
: Force the entity tag to be
weak — it could be modified again in as short an interval.
meets_conditions()
is a wrapper around the Apache
function ap_meets_conditions()
. It implements the conditional GET
rules of the HTTP/1.1 specification. It inspects the client headers and determines
if the response fulfills the requirements specified. It returns
OK
if the response fulfills the conditional
GET
rules, some other status code otherwise.
method()
returns the request method
(e.g. GET
, HEAD
, POST
,
etc.)
method_number()
returns the method number of the
request (e.g. M_GET
, M_POST
, etc.). These
method constants are documented in Table A.1, “Apache HTTP Method Constants”.
mtime()
returns the last modified time of the
requested resource.
next()
returns the redirected request if this is an
external redirect.
no_cache()
returns 1 if this response cannot be
cached, 0 otherwise.
no_local_copy()
returns 1 if there is no local copy
of this response, 0 otherwise.
note_auth_failure()
sets up the output headers so
that the client knows how to authenticate itself the next time, if an
authentication request failed. This function works for both basic and digest
authentication.
note_basic_auth_failure()
sets up the output headers
so that the client knows how to authenticate itself the next time, if an
authentication request failed. This function works only for basic
authentication.
note_digest_auth_failure()
sets up the output headers
so that the client knows how to authenticate itself the next time, if an
authentication request failed. This function works only for digest
authentication.
notes()
returns an APR table containing notes to
pass from one module to another.
out()
writes a string out via the Apache
ap_rputs()
. This is an alias for the
rputs()
method.
def out(text)
text
: text to write out
params()
returns an APR table containing all of the
form data posted, provided that the request method is a POST
and the content type is application/x-www-form-urlencoded
(not multipart/form-data
— which is handled a different
way). This method does a lot of work under the hood to get this data. It
mediates the transfer, loads, decodes, and organizes the form data into the APR
table that is returned.
The APR table returned is cached, so this process is only performed on the
first call. Subsequent calls will return the same table. If this function is
called on a request which is not a POST
, it will return
nil
.
By default this function currently limits the amount of data that can be sent to 20 MB. Larger values can be accommodated by setting them using the Request::set_max_content_length() method.
parsed_uri()
This function is currently not
implemented. In the future, it will return a Ruby object that wraps the
apr_uri_t structure.
path_info()
returns the path info extracted from this
request.
prev()
returns the previous request object if this is
an internal redirect.
print()
writes a string out via the Apache
ap_rputs()
. This is an alias for the
rputs()
method.
def print(text)
text
: text to write out.
proto_num()
returns the protocol version number
(e.g. 1.1 = 1001).
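The example above suggests the encoding major * 1000 + minor, which also matches my understanding of Apache's HTTP_VERSION macro. A small Ruby sketch of converting in both directions follows; `proto_num` and `proto_version` are hypothetical helper names.

```ruby
# Encode major.minor as major * 1000 + minor, matching the 1.1 = 1001 example.
def proto_num(major, minor)
  major * 1000 + minor
end

# Split a protocol number back into [major, minor].
def proto_version(number)
  number.divmod(1000)
end
```

Under this encoding HTTP/1.0 is 1000 and HTTP/0.9 is 9, so version comparisons reduce to integer comparisons.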
protocol()
returns the protocol string, as given to
us, or HTTP/0.9.
proxyreq()
returns the proxy request type (calculated during the post_read_request and translate_name phases). Possible values are PROXYREQ_NONE, PROXYREQ_PROXY, PROXYREQ_REVERSE and PROXYREQ_RESPONSE.
puts()
writes a string out via the Apache
ap_rputs()
. This is an alias for the
rputs()
method.
def puts(text)
text
: text to write out.
queries()
decodes, parses and returns the query
arguments in the URL as an APR table. This is a convenience function provided in
order to alleviate the need of using the Ruby CGI class. Additionally, it
provides the standard interface of a true APR table. It is implemented in C.
@request.puts "Queries: "
queries = @request.queries()
queries.each do |key, value|
  @request.puts " #{key}=#{value}"
end
range()
returns the HTTP range header value.
rationalize_mtime()
returns the latest rational time
from a request/mtime pair. mtime
is returned unless it's in
the future, in which case we return the current time.
mtime
: The last modified
time.
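The rule described above amounts to clamping the modification time to the present. A minimal Ruby sketch, assuming both values are `Time` objects (the optional `now` parameter is added here only to make the behaviour easy to check):

```ruby
# Sketch of the rule described above: never report a time in the future.
def rationalize_mtime(mtime, now = Time.now)
  mtime > now ? now : mtime
end
```

A modification time in the past is returned unchanged; one in the future is replaced by the current time.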
read()
returns a buffer containing content of the
request body. It takes a single argument specifying the number of bytes to
read. It returns a buffer containing that many bytes, or whatever is left. If
there is no content remaining, it returns nil
.
read()
, along with readline(), is used in conjunction with the RSP::Rfc2388
parser class to process multipart forms and file uploads. There are few, if
any, other reasons to use them by themselves. They basically provide the minimal
any other reasons to use them by themselves. They basically provide the minimal
requirements for the parser class to operate on the
Request
object as if it were a file. The following
example illustrates its use:
def processForm()
  @req.log(APLOG_NOTICE, 'file upload')
  @req.set_content_type('text/html')

  # Get the multipart boundary
  boundary = @req.boundary()

  # Create the parser
  parser = RSP::Rfc2388::Parser.new()

  # Parse the form. The first argument is a file-like object (supports read()
  # and readline()). The second argument is the multi-part form boundary that
  # divides the content parts.
  parser.parse(@req, boundary) do |part|
    # Look for a file argument. If exists, then we have a file upload.
    if part.fileName != nil
      # Open the file
      # Normally, you wouldn't want to save the file using the posted filename
      # as this can be error prone (with IE and all). Typically we would use
      # the form name (part.name). But for testing it is okay we control the
      # file we are parsing.
      upload_file = open("#{part.fileName}", 'w')

      # Read all of the data out of the content part. This ultimately operates
      # on the underlying file object (@req). By default, the content is read
      # out in 4k blocks.
      bytes = 0
      part.read() do |content|
        bytes += content.size
        upload_file.write(content)
      end

      # Close
      upload_file.close()
    else
      # Else this is just simply content part of some kind and we will just
      # store the data in the part's internal buffer.
      part.read()
    end

    # Print what we did
    if part.fileName != nil
      @req.puts "file %-20s: %s" % [part.fileName, part.contentType]
    else
      @req.puts "%-25s: %s" % [part.name, part.data]
    end
  end
end
Note that all calls to read()
and
readline()
are abstracted away in the parser class. This is
because parsing is a pain, and the parser class manages all of that for you. The
end result is that the parser feeds parsed content sections into the block
supplied it, in the form of Part
objects. These objects,
like content sections, have headers and data. The headers are already
parsed. The data is not yet read, allowing you to read it in discrete parts,
using the same block/callback mechanism. This approach allows file uploads (and
multi-part forms in general) to be handled sequentially, as a continuous stream,
rather than all at once. This avoids having to store the entire form in memory in
order to parse/process it.
The bottom line is that all the parsing details are abstracted away, and your code only has to deal with processing the incoming data. The additional benefit is that this approach can handle arbitrarily large files without any increase in memory requirements, as it does not have to store the entire form in memory.
def read(bytes)
bytes
: the number of bytes of content to
read.
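The read-until-nil loop this implies can be sketched with a `StringIO` standing in for the request object. The sample data and the chunk size of 4 are arbitrary; the only behaviour relied on is the one described above, where each call returns up to the requested number of bytes and `nil` once nothing is left.

```ruby
require "stringio"

# StringIO stands in for the request; both are assumed to return nil at the end.
source = StringIO.new("abcdefghij")

chunks = []
while (chunk = source.read(4))
  chunks << chunk
end
```

The final chunk is shorter than requested because only two bytes remained, which is the "or whatever is left" case.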
readline()
reads a single line of text from the
content of the request body. It looks for a CRLF (\r\n
) as
the line terminator. If this is not found, it will return the entire request
content.
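The CRLF behaviour can be illustrated with `StringIO#gets`, which accepts an explicit line separator. This is only a stand-in: whether mod_rsp's readline() includes the terminator in the returned string is not stated here, so the sketch strips it explicitly.

```ruby
require "stringio"

# StringIO stands in for the request body.
body = StringIO.new("first line\r\nsecond line\r\ntrailing data")

lines = []
while (line = body.gets("\r\n"))
  lines << line.chomp("\r\n")
end
```

The last read returns the trailing data even though no CRLF follows it, mirroring the "return the entire remaining content" behaviour described above.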
read_body()
returns the method used for reading the
request body (e.g. REQUEST_CHUNKED_ERROR
).
read_chunked()
returns 1 if we are reading a chunked
transfer-coding.
read_length()
returns the number of bytes that have
been read from the request body.
remaining()
returns the number of bytes
left to read from the request body.
request_time()
returns the time when the request
started.
rflush()
calls ap_rflush()
,
which flushes all of the data for the current request to the client. Returns the
number of bytes sent.
rputs()
writes a string out via the Apache
ap_rputs()
.
def rputs(text)
text
: text to write out.
send_file()
sends the full contents of a file to the client (using chunked encoding). Where the platform supports it, send_file() uses optimized OS system calls to send the file contents through the kernel, minimizing file copy overhead. For example, on Linux and BSD Apache will use the sendfile(2) system call to transfer the file contents, resulting in a "zero-copy" transfer, meaning that copying of the file data is avoided.
def send_file(path)
path
: full path of the file to send. If a relative path is used, it will be relative to the current working directory, which differs based on the context in which the request is being handled. Within the RSP environment, the current working directory is the directory containing the RHTML or RSP file being processed. Otherwise it is the default working directory set by Apache.
send_error_response()
sends an error code back
to the client.
def send_error_response(status)
status
: error code to send.
server()
returns the server object associated with
this request. For more information, see Section 3, “The Apache Server Class”.
set_content_length()
calls
ap_set_content_length()
. This sets the content length for
the request.
def set_content_length(length)
length
: number of bytes in content.
set_content_type()
Sets the content type for this request.
def set_content_type(type)
type
: string containing content type.
set_etag()
sets the ETag outgoing header. For more
information on ETags, see the Apache Core
documentation.
set_keepalive()
sets the keepalive status for this
request.
set_max_content_length()
sets the maximum amount of form data, in bytes, that the module will accept from the client. The default is 20 MB. To make this unlimited, set this value to -1. If the POST form data exceeds this amount, then params() will set an error and return nil.
def set_max_content_length(bytes)
bytes
: Maximum POST form data in bytes.
set_status()
sets the HTTP status code to return (e.g. 200, 302, 404, etc.).
def set_status(status)
status
: HTTP status code to return.
setup_client_block()
sets up the client to allow Apache to read the request body. This is usually used for processing form data in a POST request. If that is all you need, don't worry about this method; use the params() method, which does all of the work for you.
def setup_client_block(read_policy)
read_policy
: How the server should
interpret a chunked transfer-encoding. This can be one of the
following:
REQUEST_NO_BODY          Send 413 error if message has any body
REQUEST_CHUNKED_ERROR    Send 411 error if body without Content-Length
REQUEST_CHUNKED_DECHUNK  If chunked, remove the chunks for me.
should_client_block()
determines if the client has
sent any data. This also sends a 100 Continue response to HTTP/1.1 clients, so
this method should not be called until the module is ready to read content. This is
often used in the processing of form data. You most likely will never need to
use this method.
some_auth_required()
determines if any authentication
is required for the current request. It returns 1 if authentication is required,
0 otherwise.
status()
returns the current HTTP status code.
status_line()
returns the current HTTP status line
(text).
subprocess_env()
returns an APR table of environment
variables passed to the subprocess, if one exists.
the_request()
returns the first line of actual HTTP
request content.
unparsed_uri()
returns the URI without any parsing
performed.
uri()
returns just the path portion of the
URI.
vlist_validator()
returns the variant list validator
(if negotiated).
write()
writes binary data directly to the socket via
the Apache ap_rwrite()
. This is useful for when you want to
send images or binary files directly from a controller method, such as a
captcha image generated with RMagick.
def write(data, size)
data
: The data to write out.
size
: The size of the data in bytes.
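Because the size is given in bytes, a Ruby caller would normally pass `String#bytesize` rather than `String#size`, which counts characters. The sketch below shows the difference using a hypothetical `FakeWriter` class that merely records what would be written; it is not the real request object.

```ruby
# Hypothetical stand-in that records what would be written to the socket.
class FakeWriter
  attr_reader :written

  def initialize
    @written = "".b
  end

  # Mirrors the write(data, size) signature: append the first size bytes.
  def write(data, size)
    @written << data.byteslice(0, size)
    size
  end
end

text = "caf\u00e9"   # 4 characters, 5 bytes in UTF-8
data = text.b        # binary view of the same bytes
writer = FakeWriter.new
writer.write(data, data.bytesize)
```

Passing `text.size` here would have truncated the output by one byte, which is the kind of mistake that corrupts images and other binary payloads.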