Standard CGI
Overview
Standard CGI refers to the original Common Gateway
Interface that was first implemented on web servers.
OmniHTTPd has the ability to run Standard CGI
scripts/programs that are compiled for Win32. Binary
executables compiled for UNIX systems will not work.
Standard CGI has gained wide acceptance because it is
supported by almost all available web servers. It has
some performance problems, however, because process
launches are much more resource intensive in the Win32
environment than in traditional UNIX systems.
Operation
Before the script is launched, the server creates two
pipes; one for input and one for output. The pipes are
attached to the stdin and stdout of the new CGI program
before it is launched. Any output that the program sends
through the stdout (output) pipe is parsed and sent back
to the client.
When a client requests a designated Standard CGI
resource, the server does the following things:
- It constructs an environment that contains
information about the request and about the user
- It attempts to launch the resource either as an
executable or as an argument to an interpreter,
such as Perl
- If the POST method is used, information is sent
to the new process through the standard input
stream
- Information is read and parsed from the standard
output stream
- When the executable terminates, the parsed
information is sent back to the client
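The steps above can be sketched from the server's side in Python. This is only an illustration of the pipe-based exchange, not OmniHTTPd's actual implementation: the function name run_cgi and the stand-in child program are hypothetical, and REQUEST_METHOD is the standard CGI variable name rather than one listed in this document.

```python
import os
import re
import subprocess
import sys


def run_cgi(argv, cgi_env, body=b""):
    """Launch a CGI program; return (headers, payload) parsed from its stdout."""
    # Start from the parent's environment so the child can still locate its
    # runtime, then add the request variables (a simplification).
    env = dict(os.environ)
    env.update(cgi_env)
    env["CONTENT_LENGTH"] = str(len(body))

    # subprocess attaches pipes to the child's stdin and stdout before it
    # starts, writes the POST body to stdin, and collects stdout.
    proc = subprocess.run(argv, input=body, stdout=subprocess.PIPE, env=env, check=True)

    # The header ends at the first blank line; everything after is the body.
    head, payload = re.split(rb"\r?\n\r?\n", proc.stdout, maxsplit=1)
    headers = {}
    for line in head.decode("ascii").splitlines():
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, payload


# Hypothetical stand-in for a CGI program: echoes QUERY_STRING and the body.
CHILD = (
    "import os, sys\n"
    "n = int(os.environ['CONTENT_LENGTH'])\n"
    "data = sys.stdin.read(n)\n"
    "sys.stdout.write('Content-type: text/plain\\n\\n')\n"
    "sys.stdout.write(os.environ['QUERY_STRING'] + '|' + data)\n"
)

headers, payload = run_cgi(
    [sys.executable, "-c", CHILD],
    {"QUERY_STRING": "q=archie", "REQUEST_METHOD": "POST"},
    body=b"name=value",
)
print(headers["content-type"], payload)
```

A real server would build the child's environment from the request instead of copying its own, and would relay the parsed header and body to the client.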
Environment Variables
OmniHTTPd provides the following environment variables
to CGI scripts:
QUERY_STRING
QUERY_STRING is defined as anything which
follows the first ? in the URL. This information
could be added by the HTML form or it could also
be manually embedded in an HTML anchor which
references the script. This string will usually
be an information query, e.g. what the user wants
to search for in the archie databases, or perhaps
the encoded results of your feedback GET form. This
string is encoded by changing spaces to + and
encoding special characters with the standard %xx
URL hexadecimal encoding. If the
QUERY_STRING does not contain '=' or '&', it
will also be placed in the command line with +
changed to spaces.
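As a rough illustration of the encoding and of the command-line rule, the following Python sketch decodes a query string. The helper name query_to_argv is hypothetical, and decoding %xx escapes inside each command-line word follows the general CGI convention; whether OmniHTTPd does exactly this is an assumption.

```python
from urllib.parse import unquote, unquote_plus


def query_to_argv(query_string):
    """Return command-line words for a query, or [] if it looks like form data."""
    if not query_string or "=" in query_string or "&" in query_string:
        return []
    # '+' separates the words; %xx escapes are decoded within each word.
    return [unquote(word) for word in query_string.split("+")]


print(query_to_argv("archie+client"))       # a plain search query
print(query_to_argv("name=J+Doe&x=1"))      # form data stays off the command line
print(unquote_plus("feedback%21+thanks"))   # '+' to space, %21 to '!'
```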
PATH_INFO
CGI allows for extra information to be
embedded in the URL for your gateway which can be
used to transmit extra context-specific
information to the scripts. This information is
usually made available as "extra"
information after the path of your gateway in the
URL. This information is not encoded by the
server in any way. The most useful example of PATH_INFO
is transmitting file locations to the CGI
program. To illustrate this, let's say I have a
CGI program on my server called /cgi-bin/foobar
that can process files residing in the
DocumentRoot of the server. I need to be able to
tell foobar which file to process. By appending
extra path information to the end of the URL,
foobar will know the location of the document
relative to the DocumentRoot via the PATH_INFO
environment variable, or the actual path to the
document via the PATH_TRANSLATED
environment variable which the server generates
for you.
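A minimal sketch of how a program like foobar might read these two variables, written in Python. The function name target_document and the example paths are hypothetical and only stand in for a request such as /cgi-bin/foobar/docs/report.txt.

```python
import os


def target_document(environ=os.environ):
    """Return (virtual_path, filesystem_path) taken from the extra path info."""
    virtual = environ.get("PATH_INFO", "")
    physical = environ.get("PATH_TRANSLATED", "")
    return virtual, physical


# Simulated environment; the document root shown here is made up.
fake_env = {
    "PATH_INFO": "/docs/report.txt",
    "PATH_TRANSLATED": r"C:\httpd\htdocs\docs\report.txt",
}
virtual, physical = target_document(fake_env)
print(virtual)
print(physical)
```

Inside a real CGI program, calling target_document() with no argument reads the variables the server placed in the process environment.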
Transmitting Data Back to the Client
The most common error in beginners' CGI programs is
not properly formatting the output so the server can
understand it.
CGI programs can return a myriad of document types.
They can send back an image to the client, an HTML
document, a plaintext document, or perhaps even an audio
clip. They can also return references to other documents.
The client must know what kind of document you're sending
it so it can present it accordingly. In order for the
client to know this, your CGI program must tell the
server what type of document it is returning.
In order to tell the server what kind of document you
are sending back, whether it be a full document or
a reference to one, CGI requires you to place a
short header on your output. This header is ASCII text,
consisting of lines separated by either linefeeds or
carriage returns (or both) followed by a single blank
line. The output body then follows in its native
format. The output can be one of the following:
- A full document with a corresponding MIME
type
- In this case, you must tell the server what kind
of document you will be outputting via a MIME
type. Common MIME types are things such as text/html
for HTML, and text/plain for straight
ASCII text.
For example, to send back HTML to
the client, your output should read:
Content-type: text/html

<HTML><HEAD>
<TITLE>output of HTML from CGI script</TITLE>
</HEAD><BODY>
<H1>Sample output</H1>
What do you think of <STRONG>this?</STRONG>
</BODY></HTML>
- A reference to another document
- Instead of outputting the document, you can just
tell the browser where to get the new one, or
have the server automatically output the new one
for you.
For example, say you want to reference
a file on your Gopher server. In this case, you
should know the full URL of what you want to
reference and output something like:
Content-type: text/html
Location: gopher://httprules.foobar.org/0

<HTML><HEAD>
<TITLE>Sorry...it moved</TITLE>
</HEAD><BODY>
<H1>Go to gopher instead</H1>
Now available at
<A HREF="gopher://httprules.foobar.org/0">a new location</A>
on our gopher server.
</BODY></HTML>
However, today's browsers are smart enough to
automatically throw you to the new document
without your ever seeing the above. If you get
lazy and don't want to output the above HTML,
NCSA HTTPd will output a default one for you to
support older browsers.
If you want to reference another file (not
protected by access authentication) on your own
server, you don't have to do nearly as much work.
Just output a partial (virtual) URL, such as the
following:
Location: /dir1/dir2/myfile.html
The server will act as if the client had not
requested your script, but instead requested http://yourserver/dir1/dir2/myfile.html.
It will take care of most everything, such as
looking up the file type and sending the
appropriate headers. Just be sure that you output
the blank line that ends the header.
If you do want to reference a document that is
protected by access authentication, you will need
to have a full URL in the Location: header,
since the client and the server need to
re-transact to establish that you have access to the
referenced document.
Advanced usage: If you would like to output headers
such as Expires or Content-encoding, you can if your
server is compatible with CGI/1.1. Just output them along
with Location or Content-type and they will be sent back
to the client.
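The output format described in this section can be sketched in Python as follows. The helper name cgi_response is hypothetical; the point is only that every response is a set of header lines, one blank line, and then the body (which may be empty for a Location reference).

```python
import sys


def cgi_response(headers, body=""):
    """Build CGI output: header lines, one blank line, then the body."""
    lines = [f"{name}: {value}" for name, value in headers]
    return "\n".join(lines) + "\n\n" + body


# A full document with its MIME type.
page = cgi_response(
    [("Content-type", "text/html")],
    "<HTML><BODY><H1>Sample output</H1></BODY></HTML>\n",
)

# A reference to another document on the same server (no body needed).
redirect = cgi_response([("Location", "/dir1/dir2/myfile.html")])

# A CGI program sends exactly one response on stdout.
sys.stdout.write(page)
```

Extra headers such as Expires can be passed as additional (name, value) pairs in the same list.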
Retrieving the Form Data
As you now know, there are two methods which can be
used to access your forms. These methods are GET
and POST. Depending on which method you used,
you will receive the encoded results of the form in a
different way.
- The GET method
- If your form has METHOD="GET" in its
FORM tag, your CGI program will receive the
encoded form input in the environment variable QUERY_STRING.
- The POST method
- If your form has METHOD="POST" in its
FORM tag, your CGI program will receive the
encoded form input on stdin. The server will NOT
send you an EOF at the end of the data; instead,
you should use the environment variable
CONTENT_LENGTH to determine how much data you
should read from stdin.
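The two cases can be handled in one place, as in this Python sketch. The function name read_form_data is hypothetical, and REQUEST_METHOD is the standard CGI variable that identifies the method; it is assumed here to be set by the server.

```python
import io


def read_form_data(environ, stdin):
    """Return the still-encoded form data for a GET or POST request."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        # Read exactly CONTENT_LENGTH bytes; do not wait for EOF.
        return stdin.read(length).decode("ascii")
    return environ.get("QUERY_STRING", "")


# Simulated requests. A real script would pass os.environ and sys.stdin.buffer.
get_data = read_form_data(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": "a=1&b=2"},
    io.BytesIO(),
)
post_data = read_form_data(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "7"},
    io.BytesIO(b"a=1&b=2 (bytes past CONTENT_LENGTH are ignored)"),
)
print(get_data, post_data)
```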
Decoding the Form Data
When you write a form, each of your input items has a
NAME attribute. When the user places data in these items in the
form, that information is encoded into the form data. The
data the user gives each input item is
called the value.
Form data is a stream of name=value pairs separated by
the & character. Each name=value pair is URL
encoded, i.e. spaces are changed into plus signs and some
characters are encoded into hexadecimal.
The basic procedure is to split the data by the
ampersands. Then, for each name=value pair you get from
this, you should URL decode the name, and then the value,
and then do what you like with them.
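The procedure just described might look like this in Python. The function name decode_form is hypothetical; it is a hand-written sketch of the split-then-decode steps, and the standard library's urllib.parse.parse_qsl offers a ready-made equivalent.

```python
from urllib.parse import unquote_plus


def decode_form(data):
    """Split on '&', then on the first '=', and URL-decode name and value."""
    pairs = []
    for item in data.split("&"):
        if not item:
            continue  # skip empty segments such as a trailing '&'
        name, _, value = item.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


print(decode_form("name=John+Doe&comment=50%25+done%21"))
```

Decoding happens only after the split, so an encoded & or = inside a value (%26, %3D) cannot be mistaken for a separator.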
Notes
Scripts must be placed in their respective directories
so that the server can determine how to correctly execute
the script. These directories are set in the server
properties. Do not alias to these directories, as an alias
definition will override a CGI directory definition.
If you get errors when launching standard CGI scripts
or not all environment variables are present, you may be
running out of DOS environment space. To fix this, add
the following lines to your SYSTEM.INI:
[NonWindowsApp]
CommandEnvSize=8192
Copyright © 1998 Omnicron Technologies Corporation
Portions Copyright © NCSA Documentation