Testing Nginx configuration
There is a very handy tool in the Nginx kit: a syntax checker for the configuration files. It is built into the main Nginx executable and is invoked with the -t command-line switch as follows:
...
% nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
...
The command nginx -t tries to check your configuration quite thoroughly. For example, it will check all the included files and try to access all the auxiliary files, such as logs or PID files, to warn you about their nonexistence or insufficient permissions. You will become a better Nginx administrator if you acquire the habit of running nginx -t frequently.
The default configuration directory layout
We will now run through the entire configuration that you get bundled with Nginx by default. Some of it is a good example from which you will start writing your own. Some of it is just a sign of Nginx's age. Again, we use the original tarball for the 1.9.12 version that is available on the official Nginx website.
This is a list of files inside the conf folder of the Nginx source archive:
...
% ls
fastcgi.conf    koi-utf  mime.types  scgi_params   win-utf
fastcgi_params  koi-win  nginx.conf  uwsgi_params
...
The nginx.conf file is the main file, the one everything starts with. All other files are either included from nginx.conf or not used at all. Actually, nginx.conf is the only configuration file that is required by the Nginx code (and you can override even that by using the -c command-line switch). We will discuss its content a little bit later.
The pair of files fastcgi.conf and fastcgi_params contain almost the same list of simple directives that configure the Nginx FastCGI client. FastCGI, being an interface to run web applications behind Nginx, is not turned on by default. These two files are provided as examples (one of them is even included with the include directive from a commented section of the nginx.conf file).
Three files with enigmatic names, koi-utf, koi-win, and win-utf, are character maps for converting between different ways of encoding Cyrillic characters in electronic documents. Cyrillic is, of course, the script used for Russian and several other languages. In the old days of the first Internet hosts in Russia, there was a dispute about which way to encode Russian letters in documents. You can read about different Cyrillic charsets/encodings at http://czyborra.com/charsets/cyrillic.html. Several of them became popular, and web servers had to be able to convert documents on the fly when a client browser requested an encoding different from the one used by the server. There was also a whole fork of Apache Web Server that had this functionality built in. Nginx had to do the same to stand a chance against Apache. Now, more than 10 years later, we still have these re-encoding files, which are deeply obsolete as the World Wide Web continues to move towards UTF-8 as the one universal encoding for all human languages. You won't ever use the koi-utf, koi-win, and win-utf files unless you support a very old website for Russian-speaking visitors.
The file named mime.types is used by default. You can see that it is included from the main nginx.conf, and you had better leave it that way. "MIME types" is a registry of different types of information in files. They have their origin in some of the email standards (hence the MIME name) but are used everywhere, including the Web. Let's look inside mime.types:
...
types {
    text/html    html htm shtml;
    text/css     css;
    text/xml     xml;
    image/gif    gif;
...
Because it is included from nginx.conf, it must follow the Nginx configuration language syntax. Indeed, it contains a single multiline directive, types, which is not a context (as described in the previous section). Its block is a list of pairs, each mapping one MIME type to a list of file extensions. This mapping is used to mark static files served by Nginx as having a particular MIME (or content) type. According to the quoted segment, the files common.css and new.css will get the type text/css, whereas index.shtml will be text/html, and so on and so forth; it is really easy.
A quick example of modifying the MIME types registry
Sometimes, you will add things to this registry. Let's try to do this now and demonstrate an introduction of a simple mistake and the workflow to find and fix it.
Your website will host calendars for your colleagues. A calendar is a file in the iCalendar format generated by a third-party application and saved to a file with the .ics extension. There is nothing about ics in the default mime.types, and because of this, your Nginx instance will serve these files with the default application/octet-stream MIME type, which basically means "it is a bunch of octets (bytes) and I don't have the faintest idea of what they mean". Suppose that the new calendar application your colleagues use requires proper iCalendar-typed HTTP responses. This means that you have to add the text/calendar type to your mime.types file.
You open mime.types in your editor and add this line to the very end of the file (not the middle, not the start; the position matters for this experiment):
...
text/calendar ics
...
You then run nginx -t because you are a good Nginx administrator:
...
nginx: [emerg] unexpected end of file, expecting ";" or "}" in /etc/nginx/mime.types:91
nginx: configuration file /etc/nginx/nginx.conf test failed
...
Bam. Nginx is smart enough to tell you what you need to fix; this line does not look like either a simple or a multiline directive. Let's add the semicolon:
...
text/calendar ics;
...
Running nginx -t again now gives a different error:
...
nginx: [emerg] unknown directive "text/calendar" in /etc/nginx/mime.types:90
nginx: configuration file /etc/nginx/nginx.conf test failed
...
Now this is more obscure. What you should understand here is that this line is not a separate standalone directive. It is a part of the big types multiline directive (the rare, non-context one); therefore, it should be moved into the block.
Change the tail of mime.types from this:
...
}
text/calendar ics;
...
to this:
...
text/calendar ics;
}
...
In other words, swap the last two meaningful lines. Running nginx -t once more now succeeds:
...
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
...
Congratulations, you just enabled a new business process for your company involving mobile workforce.
The last two default configuration files are scgi_params and uwsgi_params. Those two are the counterparts of fastcgi_params, setting up two alternative methods of running web applications on your web servers (SCGI and uwsgi, respectively, as you guessed). You will use them if and when your application developers bring you applications written with these interfaces in mind.
Default nginx.conf
Now, let's dig deeper into the main configuration file, nginx.conf. In its default form, as you see it inside the tarball, it is rather empty and useless. At the same time, it is always what you use as a starting point when writing your own configuration, and it can also serve as a demonstration of some common troubles that people inflict on themselves. Going over each directive is not needed, so this section includes only those that demonstrate a technique or a common source of errors:
...
#user nobody;
...
This directive specifies the name of the UNIX user that Nginx worker processes will run as. Commenting out pieces of configuration is a common documentation technique: it shows the default value, so removing the comment character is safe. Nginx will complain if you try to run it as a nonexistent user. As a general rule, you should either trust your package vendor and not change the default or use an account with the least permissions possible.
...
#error_log logs/error.log;
#error_log logs/error.log notice;
#error_log logs/error.log info;

#pid logs/nginx.pid;
...
These lines specify some default filenames. The three error_log directives are an example of yet another technique: providing several variants as comments so that you can uncomment the one you prefer. These three differ in the level of detail that goes into the error log. There is a whole chapter about logs, as those are definitely the first and foremost debugging and troubleshooting tool available to any Nginx administrator.
The pid directive allows you to change the name of the file where the PID of the main Nginx process is stored. You rarely have to change this.
Note that these directives use relative paths in these examples, but this is not required. They could also use absolute paths (starting with /). The error_log directive provides two other ways of logging besides simple files, which you will see later.
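As a small sketch of the paragraph above, the following fragment uses absolute paths and one non-file log target. The paths and the syslog socket are assumptions for illustration only (they depend on your operating system and packaging). The syslog: and stderr targets are the ones listed for error_log in the Nginx documentation, and syslog: needs Nginx 1.7.1 or later.

```nginx
# Main (top-level) context of nginx.conf.
# All paths below are illustrative assumptions; adjust them to your system.

# Absolute path, logging messages of level "warn" and above.
error_log /var/log/nginx/error.log warn;

# Additionally send the same messages to the local syslog daemon
# (assumed socket path). "error_log stderr;" is another non-file target.
error_log syslog:server=unix:/dev/log warn;

# Absolute path for the PID file of the main process.
pid /var/run/nginx.pid;
```

Run nginx -t after a change like this; it will report a log or PID path that Nginx cannot open.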
...
events {
    worker_connections 1024;
}
...
This is our first context, and a confusing one. events is not used to narrow the scope of the directives inside it; most of those directives cannot be used in any context other than events. Instead, it serves as a logical grouping mechanism for the many parameters that configure the way Nginx responds to activity on the network. These are very general words, but they fit the purpose. Think of events as a fancy historical way of marking a group of parameters that are close to one another.
The worker_connections directive specifies the maximum number of network connections that each worker process may have open simultaneously. It may be a source of strange mistakes. Remember that this limit includes both the client connections between Nginx and your users' browsers and the server connections that Nginx has to open to your backend web application code (unless you only serve static files).
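To make the arithmetic behind this limit concrete, here is a rough sketch. The value of worker_processes is an assumption chosen for the example, and the estimates in the comments ignore other limits such as open file descriptors.

```nginx
# Main context: number of worker processes (4 is an assumed value).
worker_processes 4;

events {
    # Each worker may hold at most 1024 connections in total,
    # counting both client-side and backend-side connections.
    worker_connections 1024;
}

# Rough capacity estimate under these assumptions:
#   static files only:     4 * 1024     = 4096 simultaneous clients
#   every request proxied: each client also needs a backend connection,
#                          so about 4 * 1024 / 2 = 2048 simultaneous clients
```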
The http directive
...
http {
    include mime.types;
    default_type application/octet-stream;
...
Here we go: http marks the start of a huge context that usually spans several files (via nested includes) and groups all the configuration parameters that concern the web part of Nginx. You might feel that this sounds a lot like events, but it is actually a very valid context requiring a separate directive because Nginx can work not only as an HTTP server but also serve some other protocols, for example, IMAP and POP3. It is an infrequent use case, to put it mildly, and we won't spend our time on it, but it shows a very legitimate reason to have a special scope for all HTTP options.
You probably know what the first two directives inside http do. Never change the default MIME type. Many web clients use this particular type as an indication of a file that needs to be saved on the client computer as an opaque blob of data, and it is a good idea for all the unknown files.
...
#log_format main '$remote_addr - $remote_user [$time_local] "$request" '
#                '$status $body_bytes_sent "$http_referer" '
#                '"$http_user_agent" "$http_x_forwarded_for"';

#access_log logs/access.log main;
...
These two directives specify logging of all requests, both successful and unsuccessful, for the purposes of tracing and statistics. The first directive creates a log format, and the second initiates logging to a specific file according to that format. It is a very powerful mechanism that gets special attention later in this book. Then we have the following code:
...
sendfile on;
#tcp_nopush on;

#keepalive_timeout 0;
keepalive_timeout 65;

#gzip on;
...
The first and the second of these turn on certain networking features of the HTTP support. sendfile is a syscall that allows the OS kernel itself to copy bytes from a file to a socket, sometimes using "zero copy" semantics. It is always safe to leave it on unless you have very little memory; there were reports that sendfile may sometimes work unreliably on servers with little memory. tcp_nopush is an option that makes sense only in the presence of sendfile on. It allows you to optimize the number of network packets in which a file sent with sendfile is transmitted. keepalive is a feature of modern HTTP: the browser (or any other client) may choose not to close a connection to a server right away but to keep it open in case it needs to talk to the same server again very soon. For many modern web pages, consisting of hundreds of objects, this can help a lot, especially on HTTPS, where the cost of opening a new connection is higher. The keepalive timeout is the period in seconds during which Nginx will keep an idle client connection open. The built-in default is 75 seconds (the bundled file sets 65), and tweaking it will rarely affect performance. You can try if you know something special about your clients, but usually people either leave the default timeout or turn keepalive off altogether by setting the timeout to zero.
There are a (big) number of compression algorithms much better than the DEFLATE of the traditional gzip, but gzip is the most widely available among servers and clients on the web, providing good enough compression for texts at very little cost. gzip on will turn on automatic compression of data on the fly between Nginx and its clients, that is, those that announce support for gzipped server responses, of course. There are still browsers in the wild that do not support gzip properly. See the description of the gzip_disable directive in the Nginx documentation at http://nginx.org/en/docs/http/ngx_http_gzip_module.html#gzip_disable. It might be a source of problems, but only if you have some really odd users, either with weird special-case client software or from the past.
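A minimal sketch of a gzip setup along these lines might look as follows. The list of types and the size threshold are assumptions for illustration, not recommendations from the bundled file; all directives shown belong to the standard gzip module.

```nginx
http {
    gzip on;

    # text/html is always compressed; other text-like types
    # must be listed explicitly.
    gzip_types text/css text/plain application/javascript application/json;

    # Skip very small responses (threshold in bytes, an assumed value).
    gzip_min_length 1024;

    # Do not compress for old MSIE 4-6 clients; "msie6" is a special
    # mask described in the gzip_disable documentation.
    gzip_disable "msie6";
}
```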
...
server {
    listen 80;
    server_name localhost;
...
Now we have another multiline context directive inside http. It is the famous server directive, which configures a single web server object with a hostname and a TCP port to listen on. Those two are the topmost directives inside this server. The first, listen, has a much more complex syntax than just a port number, and we will not describe it here. The second one, server_name, has a simple syntax but some complex matching rules that are also better described in the online documentation. It is sufficient to say that these two provide a way of choosing the right server to process an incoming HTTP request. The most useful form of server_name is the simplest one: it just contains a hostname in the form of a DNS domain name, which is matched against the name that the browser sent in the Host: header, which in turn is just the hostname part of the URL.
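As an illustration of this selection, the sketch below defines two servers on the same port that differ only in server_name. The hostnames and document roots are hypothetical.

```nginx
http {
    # Chosen when the Host header is www.example.com or example.com.
    server {
        listen 80;
        server_name www.example.com example.com;
        root html/www;
    }

    # Chosen when the Host header is calendar.example.com.
    server {
        listen 80;
        server_name calendar.example.com;
        root html/calendar;
    }
}
```

If a request carries a Host header that matches neither name, Nginx falls back to the default server for that port, which is the first one defined unless listen says otherwise.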
...
#charset koi8-r;
...
This is a way to indicate the charset (encoding) of the documents you serve to the browsers. It is set by default to the special value off and not the good old koi8-r from RFC 1489. Nowadays, your best bet is specifying utf-8 here or just leaving it off. If you specify a charset that does not correspond to the actual charset of your documents, you will run into trouble.
...
#access_log logs/host.access.log main;
...
Here is an interesting example of using a directive inside a narrowing context. Remember that we already discussed access_log one level higher, inside the http directive. This one turns on logging of requests to this particular server only. It is a good habit to include the name of the server in the name of its access log, so replace host with something very similar to your server_name.
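The following sketch shows the two levels together. The hostname and log filenames are hypothetical; the format is the one from the bundled file.

```nginx
http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    # Default for servers that do not set their own access_log.
    access_log logs/access.log main;

    server {
        listen 80;
        server_name calendar.example.com;

        # Overrides the http-level access_log for this server only.
        access_log logs/calendar.example.com.access.log main;
    }
}
```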
...
location / {
    root html;
    index index.html index.htm;
}
...
Again, we see a multiline directive, this time introducing a context for a group of URLs on this particular server. location / will match all requests unless there is a more specific location on the same level. The rules for choosing the correct location block to process an incoming request are quite complex, but simple cases can be described in simple words.
The index directive specifies the way to process URLs that correspond to a local folder. In this case, Nginx looks for the first existing file from the list in this directive. Serving either an index.html or index.htm for such URLs is a very old convention; you shouldn't break it unless you know what you are doing.
By the way, index.htm without the last l is an artifact of the old Microsoft filesystems that allowed three or fewer characters in the filename extension. Nginx never worked on Microsoft systems with such limitations, but files ending in htm instead of html still linger around.
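A simple case of "a more specific location" can be sketched like this. The /downloads/ prefix is an assumed name used only for the example.

```nginx
server {
    listen 80;
    server_name localhost;

    # Catch-all: used when no more specific location matches.
    location / {
        root html;
        index index.html index.htm;
    }

    # The longer prefix wins for URLs starting with /downloads/
    # (the directory name is an assumption for this example).
    location /downloads/ {
        root html;
        index index.html;
    }
}
```

With this configuration, a request for /downloads/ is handled by the second block and looks for html/downloads/index.html, while a request for /about/ falls through to location /.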
...
#error_page 404 /404.html;

# redirect server error pages to the static page /50x.html
#
error_page 500 502 503 504 /50x.html;
location = /50x.html {
    root html;
}
...
These directives set up the way errors are reported to the user. You, as the webmaster, will most certainly rely on your logs, but in case something happens, your users should not be left in the dark. The error_page directive installs a handler for an HTTP error based on the famous HTTP status codes. The first example (commented) tells Nginx that when it encounters a 404 (not found) error, it should not answer with its built-in error page but instead perform an internal redirect to the /404.html location, render the result, and present it in the response to the original user request. The response still carries the 404 status code unless you change it explicitly.
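A minimal sketch of such a handler, assuming a 404.html file exists in the html folder, could look like this. The internal directive is an addition that the bundled file does not use.

```nginx
server {
    listen 80;
    server_name localhost;
    root html;

    # On a 404, internally redirect to /404.html; the client still
    # receives status 404, only the response body changes.
    error_page 404 /404.html;

    location = /404.html {
        # "internal" makes this location reachable only through internal
        # redirects such as error_page, not by direct client requests.
        internal;
    }
}
```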
By the way, one of the most embarrassing mistakes you could make with Apache web server is to provide a 404 handler that raises another 404 error; Apache then reports the second error on the error page as well. Nginx will not show this type of detail to users, but they will still see some very ugly error messages.
The location = /50x.html looks suspiciously similar to the one we discussed earlier. The only important difference is the = character, which means "exact match". The whole matching algorithm is a complete topic in itself, but here you should definitely remember that = means "process requests for this and only this URL; do not treat it as a prefix that could match longer URLs".
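The difference can be sketched by putting an exact-match location next to a prefix location for the same path. The second block is hypothetical and exists only to show the contrast.

```nginx
server {
    listen 80;
    server_name localhost;
    root html;

    # Exact match: handles the URL /50x.html and nothing else.
    location = /50x.html {
    }

    # Prefix match: handles longer URLs such as /50x.html.bak
    # or /50x.html/extra. A request for /50x.html itself goes to
    # the exact-match block above.
    location /50x.html {
    }
}
```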
...
# proxy the PHP scripts to Apache listening on 127.0.0.1:80
#
#location ~ \.php$ {
#    proxy_pass http://127.0.0.1;
#}

# pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
#
#location ~ \.php$ {
#    root html;
#    fastcgi_pass 127.0.0.1:9000;
#    fastcgi_index index.php;
#    fastcgi_param SCRIPT_FILENAME /scripts$fastcgi_script_name;
#    include fastcgi_params;
#}
...
This is a big commented chunk of options, all about the same thing: processing PHP scripts using two different strategies. Nginx, as you know, does not try to be everything, and it especially tries never to be an application server. The first location directive sets up proxying to another local web server that runs PHP, probably Apache with mod_php.
Note
Pay attention to the ~ character in location. It turns on the regular expression engine for matching URLs, hence the escaped . and the $ at the end. Nginx regular expressions use the common syntax originating from the first grep and ed programs written in the late 1960s. They are implemented with the PCRE library. See the PCRE documentation for a comprehensive description of the language at http://www.pcre.org/original/doc/html/pcrepattern.html.
The second block talks to a FastCGI server running locally on port 9000 instead of proxying over HTTP. It is a somewhat more modern way of running PHP, but it also requires a lot of parameters (see the included file) compared with the very simple and humble HTTP.
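An uncommented variant of the FastCGI block might look like the sketch below. It assumes that some FastCGI server (PHP-FPM, for example) is listening on 127.0.0.1:9000, and it includes fastcgi.conf rather than fastcgi_params so that SCRIPT_FILENAME does not need to be set by hand.

```nginx
server {
    listen 80;
    server_name localhost;
    root html;

    location ~ \.php$ {
        # Assumed: a FastCGI server (for example PHP-FPM) on 127.0.0.1:9000.
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;

        # fastcgi.conf is fastcgi_params plus a SCRIPT_FILENAME line built
        # from $document_root, so no separate fastcgi_param is needed here.
        include fastcgi.conf;
    }
}
```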
...
# deny access to .htaccess files, if Apache's document root
# concurs with Nginx's one
#
#location ~ /\.ht {
#    deny all;
#}
...
The last part of the server block under discussion introduces Access Control Lists (ACLs) in a location with a regular expression. The note in the comment is a curious one. There is a tradition of "bolting" Nginx onto an existing Apache installation so that Nginx serves all the static files itself while proxying more complex, dynamic URLs to the upstream Apache. This kind of setup is definitely not recommended, but you have probably seen or even inherited one. Nginx itself does not support local .htaccess files but has to protect those left over from Apache because they could contain sensitive information.
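Uncommented and placed into a server, the protection reads as follows; this is a sketch of the bundled example, with root html assumed to be the document root shared with Apache.

```nginx
server {
    listen 80;
    server_name localhost;
    root html;

    # Refuse any URL that contains "/.ht" (.htaccess, .htpasswd, ...).
    location ~ /\.ht {
        deny all;
    }
}
```

Requests that match this location are answered with a 403 (forbidden) error instead of the file contents.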
The final server multiline directive is an example of a secure server serving HTTPS:
...
# HTTPS server
#
#server {
#    listen 443 ssl;
#    server_name localhost;

#    ssl_certificate cert.pem;
#    ssl_certificate_key cert.key;

#    ssl_session_cache shared:SSL:1m;
#    ssl_session_timeout 5m;

#    ssl_ciphers HIGH:!aNULL:!MD5;
#    ssl_prefer_server_ciphers on;

#    location / {
#        root html;
#        index index.html index.htm;
#    }
#}
...
Besides a bunch of simple ssl_ directives in the middle, the important thing to note is listen 443 ssl, which enables HTTPS (basically, HTTPS is HTTP over SSL/TLS on TCP port 443). We talk about HTTPS in Chapter 3, Troubleshooting Functionality of this book.