The default configuration directory layout
We will now run through the entire configuration that you get bundled with Nginx by default. Some of it is a good example from which you will start writing your own. Some of it is just a sign of Nginx's age. Again, we use the original tarball for the 1.9.12 version that is available on the official Nginx website.
This is a list of files inside the conf
folder of the Nginx source archive:
The nginx.conf
is the main file, the one everything starts with. All other files are either included from nginx.conf
or not used at all. Actually, nginx.conf
is the only configuration file that is required by the Nginx code (and you can override even that by using the -c
command-line switch). We will discuss its content a little bit later.
A pair of fastcgi.conf
and fastcgi_params
files contain almost the same list of simple commands configuring the Nginx FastCGI client. FastCGI, being an interface to run web applications behind Nginx, is not turned on by default. These two files are provided as examples (one of them is even included with the include
command from a commented section of the nginx.conf
file).
Three files with enigmatic names koi-utf
, koi-win
, and win-utf
are character maps to convert between different ways to encode Cyrillic characters in electronic documents. And Cyrillic is, of course, the script used for Russian and several other languages. In the old days of the first Internet hosts in Russia, there was a dispute on which way to encode Russian letters in documents. You can read about different Cyrillic charsets/encodings at http://czyborra.com/charsets/cyrillic.html. Several of them got popular, and web servers had to include functionality of converting documents on the fly in the case that a client browser requested a different encoding from what was used by the server. There was also a whole fork of Apache Web Server that had this functionality built in. Nginx had to do the same to stand a chance against Apache. And now, more than 10 years later, we still have these re-encoding files that are deeply obsolete as the global World Wide Web continues to move towards UTF-8 as the one universal encoding for all human languages. You won't ever use these koi-utf
, koi-win
, and win-utf
files unless you support a very old website for Russian-speaking visitors.
The file named mime.types
is used by default. You can see that it is included from the main nginx.conf
, and you had better leave it that way. "MIME types" is a registry of different types of information in files.
They have their origin in some of the email standards (hence, the MIME name) but are used everywhere, including the Web. Let's look inside mime.types
:
Because it is included from nginx.conf
, it should have a proper Nginx configuration language syntax. That's right, it contains a single multiline directive types
, which is not a context (as described in the previous section). Its block is a list of pairs, each being a mapping from one MIME type to a list of file extensions. This mapping is used to mark static files served by Nginx as having a particular MIME (or content) type. According to the quoted segment, the files common.css
and new.css
will get the type text/css
, whereas index.shtml
will be text/html
, and so on and so forth; it is really easy.
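As a rough illustration of the syntax just described, the beginning of mime.types looks approximately like the following. This is an abridged sketch; the exact entries and their order may differ between Nginx versions.

```nginx
# Abridged sketch of mime.types; entries and order vary between versions.
types {
    # MIME type    one or more file extensions
    text/html      html htm shtml;
    text/css       css;
    text/xml       xml;
    image/gif      gif;
    image/jpeg     jpeg jpg;
}
```

Each entry is terminated by a semicolon, and the whole list lives inside the braces of the single types directive.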
A quick example of modifying the MIME types registry
Sometimes, you will add things to this registry. Let's try to do this now, deliberately making a simple mistake along the way to demonstrate the workflow to find and fix it.
Your website will host calendars for your colleagues. A calendar is a file in the iCalendar format generated by a third-party application and saved to a file with .ics
extension. There is nothing about ics
in the default mime.types
, and because of this, your Nginx instance will serve these files with the default application/octet-stream
MIME type, which basically means "it is a bunch of octets (bytes) and I don't have the faintest idea of what they mean". Suppose that the new calendar application your colleagues use requires proper iCalendar-typed HTTP responses. This means that you have to add this text/calendar
type into your mime.types
file.
You open mime.types
in your editor and add this line to the very end (not in the middle, not to the start, but the end is important for the sake of this experiment) of the file:
You then run nginx -t
because you are a good Nginx administrator:
Bam. Nginx is smart enough to tell you what you need to fix; this line does not look like either a simple or a multiline directive. Let's add the semicolon:
Now this is more obscure. What you should do here is understand that this line is not a separate standalone directive. It is a part of the big types
multiline (the rare, non-context one) directive; therefore, it should be moved into the block.
Change the tail of the mime.types
from this:
The preceding code should look as follows:
It is done by swapping the last two meaningful lines:
Congratulations, you just enabled a new business process for your company involving mobile workforce.
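For reference, a corrected tail of mime.types might look like the sketch below. The entry shown before the new line is only an assumption about what your file ends with; the point is that the new mapping sits inside the types block and ends with a semicolon.

```nginx
# Tail of mime.types after the fix (earlier entries omitted).
types {
    # ... existing entries ...
    video/x-msvideo    avi;    # assumed last original entry; yours may differ

    # New entry: inside the block, terminated by a semicolon.
    text/calendar      ics;
}
```

After saving the file, run nginx -t again and reload Nginx only when the test passes.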
The last two default configuration files are scgi_params
and uwsgi_params
. Those two are the counterparts for the fastcgi_params
, setting up two alternative methods of running web applications on your web servers (SCGI and uwsgi, respectively, as you guessed). You will use them if and when your application developers bring you applications written with these interfaces in mind.
Now, let's dig deeper into the main configuration file nginx.conf
. In its default form that you see inside the tarball, it is rather empty and useless. At the same time, it is always what you use as a starting point when writing your own configuration, and it can also be used as a demonstration of some common troubles that people inflict on themselves. Going over each directive is not needed, so only those that are good to demonstrate a technique or a common place of errors will be included in this section:
The user directive specifies the name of the UNIX user that Nginx worker processes will run as. Commenting out pieces of configuration is a common documentation technique. It shows the default values, and removing the comment character is safe. Nginx will complain if you try to run it as a nonexistent user. As a general rule, you should either trust your package vendor and not change the default or use an account with the least permissions possible.
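A minimal sketch of an uncommented user directive follows. The account name www-data is a hypothetical example and not something the default file prescribes; the account has to exist on your system.

```nginx
# Main (top-level) context of nginx.conf.
# "www-data" is an assumed unprivileged account; an optional group
# name may follow the user name.
user  www-data;
```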
The commented error_log and pid lines specify some default filenames. The three error_log
directives are an example of yet another technique: providing several variants as comments so that you can uncomment the one you prefer. These three differ by the level of detail that goes into the error log. There is a whole chapter about logs as those are definitely the first and foremost debugging and troubleshooting tool available for any Nginx administrator.
The pid
directive allows you to change the filename where the PID of the main Nginx process will be stored. You rarely have to change this.
Note that these directives use relative paths in these examples, but this is not required. They could also use absolute paths (starting with /
). The error_log
directive provides two other ways of logging besides simple files, which you will see later.
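As a hedged illustration of the absolute-path variant, the two directives could be written as follows. The paths are assumptions typical of packaged installations, not values taken from the tarball.

```nginx
# Main context. Both paths are illustrative assumptions.
error_log  /var/log/nginx/error.log  notice;   # second parameter is the log level
pid        /var/run/nginx.pid;
```

Relative paths, by contrast, are resolved against the Nginx prefix directory.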
The next block is our first context, and a confusing one. events
is not used to narrow the scope of directives inside it. Most of those directives cannot be used in any other context except events
. This is used as a logical grouping mechanism for many parameters that configure the way Nginx responds to activity on the network. These are very general words, but they fit the purpose. Think of events
as a fancy historical way of marking a group of parameters that are close to one another.
The worker_connections
directive specifies the maximum number of network connections each worker process can have open at the same time. It may be a source of strange mistakes. You should remember that this limit includes both the client connections between Nginx and your users' browsers and the server
connections that Nginx will have to open for your backend web application code (unless you only serve static files).
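The arithmetic behind that warning can be sketched with the value used in the sample configuration. The figure of roughly 512 clients assumes that every request is proxied to a backend.

```nginx
events {
    # Per-worker limit that covers client AND backend connections.
    # A proxied request holds two connections (client + backend), so a
    # worker with this setting serves roughly 512 proxied clients at once.
    worker_connections  1024;
}
```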
Here we go, http
marks the start of a huge context that usually spans several files (via nested includes) and groups all the configuration parameters that concern the web part of Nginx. You might feel that this sounds a lot like events
, but it is actually a very valid context requiring a separate directive because Nginx can work not only as an HTTP server but also serve some other protocols, for example, IMAP and POP3. It is an infrequent use case, to put it mildly, and we won't spend our time on it, but it shows a very legitimate reason to have a special scope for all HTTP options.
You probably know what the first two directives inside http
do. Never change the default MIME type. Many web clients use this particular type as an indication of a file that needs to be saved on the client computer as an opaque blob of data, and it is a good idea for all the unknown files.
The log_format and access_log directives specify logging of all requests, both successful and unsuccessful, for the purposes of tracing and statistics. The first directive creates a log format and the second initiates logging to a specific file according to the mentioned format. It is a very powerful mechanism that gets special attention later in this book. Then we have the following code:
The first and the second of these turn on certain networking features of the HTTP support. sendfile
is a syscall that allows copying of bytes from a file to a socket by the OS kernel itself, sometimes using "zero copy" semantics. It is always safe to leave it on unless you have very little memory—there were reports that sometimes sendfile
boxes may work unreliably on servers with little memory. tcp_nopush
is an option that makes sense only in the presence of sendfile on
. It allows you to optimize the number of network packets that a sendfile-d
file gets sent in. keepalive
is a feature of modern HTTP: the browser (or any other client) may choose not to close a connection to a server right away but to keep it open just in case there will be a need to talk to the same server again very soon. For many modern web pages, consisting of hundreds of objects, this could help a lot, especially on HTTPS, where the cost of opening a new connection is higher. keepalive
timeout is the period, in seconds, during which Nginx will keep an idle connection to a client open. Tweaking the default value of 75 will rarely affect performance. You can try if you know something special about your clients, but usually people either leave the default timeout or turn the keepalive
off altogether by setting the timeout to zero.
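The three settings discussed above can be sketched together as follows. The timeout value of 65 is just an example.

```nginx
http {
    sendfile           on;   # let the kernel copy file data to the socket
    tcp_nopush         on;   # takes effect only together with sendfile on
    keepalive_timeout  65;   # seconds; 0 disables keep-alive connections
}
```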
There are a (big) number of compression algorithms much better than the DEFLATE of the traditional gzip, but gzip is the most widely available among servers and clients on the web, providing good enough compression for texts with very little cost. gzip on
will turn on automatic compression of data on the fly between Nginx and its clients, that is, those which announce support for gzipped server responses, of course. There are still browsers in the wild that do not support gzip properly. See the description of the gzip_disable
directive in the Nginx documentation at http://nginx.org/en/docs/http/ngx_http_gzip_module.html#gzip_disable. It might be a source of problems, but only if you have some really odd users either with weird special-case client software or from the past.
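A minimal sketch of the two directives together is shown below. Whether you need gzip_disable at all depends on your audience, so treat the second line as optional.

```nginx
http {
    gzip          on;        # compress responses for clients that accept gzip
    # "msie6" is a special mask documented for gzip_disable; it switches
    # compression off for very old Internet Explorer versions.
    gzip_disable  "msie6";
}
```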
Now we have another multiline context directive inside http
. It is a famous server
directive that configures a single web server object with a hostname and a TCP port to listen on. Those two are the top-most directives inside this server
. The first, listen
, has a much more complex syntax than just a port number, and we will not describe it here. The second one has a simple syntax, but some complex rules of matching that are also better described in the online documentation. It will be sufficient to say that these two provide a way of choosing the right server to process an incoming HTTP request. The most useful is the server_name
in its simplest form; it just contains a hostname in the form of a DNS domain name, which is matched against the name that the browser sent in the Host:
header which, in turn, is just the host name part of the URL.
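In its simplest form, the pair of directives might look like this sketch. The hostnames are placeholders.

```nginx
http {
    server {
        listen       80;                            # TCP port to accept connections on
        server_name  example.com www.example.com;   # compared with the Host: header
    }
}
```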
The charset directive is a way to indicate the charset (encoding) of the documents you serve to the browsers. It is set by default to the special value off
and not the good old koi8-r
from RFC1489. Nowadays, your best bet is specifying utf-8
here or just leaving it off. If you specify a charset that does not correspond to the actual charset of your documents, you will run into trouble.
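A sketch of the UTF-8 variant follows. It assumes that your documents really are stored as UTF-8.

```nginx
server {
    # Adds "charset=utf-8" to the Content-Type response header field.
    charset  utf-8;
}
```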
The commented access_log line is an interesting example of using a directive inside a narrowing context. Remember that we already discussed access_log
one level higher, inside the http
directive. This one will turn on logging of requests to this particular server only. It is a good habit to include the name of the server in the name of its access log. So, replace host
with something very similar to your server_name
.
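The relationship between the two logging levels can be sketched as follows. The format is a shortened variant of the one shipped in the sample file, and the hostname and filenames are placeholders.

```nginx
http {
    # Named log format; the quoted pieces are concatenated into one string.
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent"';

    # Default access log for every server that does not define its own.
    access_log  logs/access.log  main;

    server {
        server_name  www.example.com;
        # Requests to this server are logged here instead.
        access_log  logs/www.example.com.access.log  main;
    }
}
```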
Next, we again see a multiline directive introducing a context, this time for a number of URLs on this particular server. location /
will match all the requests unless there is a more specific location on the same level. The rules to choose the correct location block to process an incoming request are quite complex, but simple cases could be described with simple words.
The index
directive specifies the way to process URLs that correspond to a local folder. In this case, Nginx seeks the first existing file from the list in this directive. Serving either an index.html
or index.htm
for such URLs is a very old convention; you shouldn't break it unless you know what you are doing.
By the way, index.htm
without the last l
is an artifact of the old Microsoft filesystems that allowed three or fewer characters in the filename extension. Nginx never worked on Microsoft systems with such limitations, but files ending in htm
instead of html
still linger around.
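Put together, the catch-all location with its index list might look like the following sketch. The root value html is the relative directory used by the sample configuration.

```nginx
server {
    location / {
        root   html;                    # directory the files are served from
        index  index.html index.htm;    # tried in order for folder URLs
    }
}
```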
The error_page directives set up the way errors are reported to the user. You, as the webmaster, will most certainly rely on your logs, but in case something happens, your users should not be left in the dark. The error_page
directive installs a handler for an HTTP error based on the famous HTTP status codes. The first example (commented) tells Nginx that in case it encounters a 404 (not found) error, it should not send its built-in error page but instead perform an internal redirect to the /404.html
location, render the results, and present them in the response to the original user request. The response still carries the 404 status code unless you override it explicitly.
By the way, one of the most embarrassing mistakes you could make with Apache web server is to provide a 404 handler that raises another 404 error. Remember these?
Nginx will not show this type of detail to users, but they will still see some very ugly error messages:
The location = /50x.html
looks suspiciously similar to the one we discussed earlier. The only important difference is the =
character that means "exact match". The whole matching algorithm is a complete topic in itself, but here you should definitely remember that =
means "process requests for this and only this URL, do not treat it as a prefix that could match longer URLs".
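The error handlers and the exact-match location can be sketched together like this. The page names follow the ones mentioned in the text, and root html is assumed to hold them.

```nginx
server {
    error_page  404              /404.html;
    error_page  500 502 503 504  /50x.html;

    # "=" means exact match: only /50x.html itself, never /50x.html/more.
    location = /50x.html {
        root  html;
    }
}
```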
Next comes a big commented chunk of options, all about the same thing: processing PHP scripts using two different strategies. Nginx, as you know, does not try to be everything, and it especially tries to never be an application server. The first location
directive sets up proxying to another local PHP server, probably Apache with mod_php
.
Note
Pay attention to the ~
character in location
. It turns on the regular expression engine for the matching of the URLs, hence the escaped .
and the $
at the end. Nginx regular expressions use the common syntax originating from the first grep and ed programs written in the late 1960s and early 1970s. They are implemented with the PCRE library. See the PCRE documentation for the comprehensive description of the language at http://www.pcre.org/original/doc/html/pcrepattern.html.
The second block talks to a FastCGI server running locally on port 9000
instead of using HTTP proxying. It is a slightly more modern way of running PHP, but it also requires a lot of parameters (see the included file) compared with very simple and humble HTTP.
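The two strategies can be sketched side by side. Only one of the two blocks can be active at a time because both use the same regular expression, so the proxying variant is left commented here. The backend addresses are assumptions, and this sketch includes fastcgi.conf for brevity, which is not necessarily the file the sample configuration includes.

```nginx
server {
    # Strategy 1: hand PHP requests over HTTP to another local web server.
    #location ~ \.php$ {
    #    proxy_pass  http://127.0.0.1;
    #}

    # Strategy 2: talk FastCGI to a PHP process listening on port 9000.
    location ~ \.php$ {
        root           html;
        fastcgi_pass   127.0.0.1:9000;
        fastcgi_index  index.php;
        include        fastcgi.conf;    # bundled FastCGI parameter list
    }
}
```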
The last part of the server block under discussion introduces Access Control Lists (ACLs), in a location
with a regular expression. The note in the comment is a curious one. There is a tradition of "bolting" Nginx onto an existing Apache installation so that Nginx would serve all the static files itself while proxying more complex, dynamic URLs to the upstream Apache. This kind of setup is definitely not recommended, but you have probably seen or even inherited one. Nginx itself does not support the local .htaccess
files but has to protect those files left over from Apache because they could contain sensitive information.
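A minimal sketch of such a protective location follows. It refuses every request whose URL contains a path component starting with .ht.

```nginx
server {
    # Matches /.htaccess, /dir/.htpasswd, and similar leftovers.
    location ~ /\.ht {
        deny  all;    # respond with 403 Forbidden
    }
}
```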
And the final server multiline directive is an example of a secure server serving HTTPS:
Besides a bunch of simple ssl_
directives in the middle, the important thing to note is listen 443 ssl
, which enables HTTPS (basically, HTTPS is HTTP over SSL/TLS, conventionally on TCP port 443
). We talk about HTTPS in Chapter 3, Troubleshooting Functionality of this book.
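For orientation, an HTTPS server block along these lines might look like the sketch below. The hostname and the certificate and key filenames are placeholders that you would replace with your own.

```nginx
server {
    listen       443 ssl;        # the ssl parameter turns on HTTPS for this port
    server_name  example.com;

    ssl_certificate      cert.pem;    # placeholder certificate file
    ssl_certificate_key  cert.key;    # placeholder private key file

    ssl_session_cache    shared:SSL:1m;
    ssl_session_timeout  5m;

    location / {
        root   html;
        index  index.html index.htm;
    }
}
```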