Published: 06.08.2012 | Access: free | Students: 1340 / 49 | Rating: 5.00 / 5.00 | Duration: 53:41:00
Lecture 25:

Basic network access: servers

Virtual hosts

Running and maintaining a web server is enough work that you might want to use the same server to host several sets of web pages, for example for a number of different organizations. apache calls this feature virtual hosts, and it offers a lot of support for them. Theoretically, all your hosts can be virtual, but the configuration file still contains additional information for a "main" server, also called a "default" server. The default configuration does not have any virtual servers at all, though it does contain configuration information.

There's a good reason to keep the "main" server information: it serves as defaults for all virtual hosts, which can make the job of adding a virtual host a lot easier.

Consider your setup at http://example.org: you may run your own web pages and also a set of pages for http://biguser.com (see page 310). To do this, you add the following section to /usr/local/etc/apache/httpd.conf:

<VirtualHost *>
ServerAdmin grog@example.org
# Where we put the web pages
DocumentRoot /usr/local/www/biguser
# The name that the server will claim to be
ServerName www.biguser.com
# Alternative server name
ServerAlias biguser.com
ErrorLog /var/log/biguser/error_log
TransferLog /var/log/biguser/access_log
Options +FollowSymLinks
Options +SymLinksIfOwnerMatch
</VirtualHost>

If you look at the default configuration file, you'll find most of these parameters, but not in the context of a VirtualHost definition. They are the corresponding parameters for the "main" web server. They have the same meaning, so we'll look at them here.

  • ServerAdmin is the mail ID of the system administrator. For the main server, it's set to you@your.address, which obviously needs to be changed. You don't necessarily need a ServerAdmin for each virtual domain; that depends on how you run the system.
  • DocumentRoot is the name of the directory that will become the root of the web page hierarchy that the server provides. By default, for the main server it's /usr/local/www/data, which is not really a very good place for data that changes frequently. You might prefer to change this to /var/www, as some Linux distributions do. This is one parameter that you must supply for each virtual domain: otherwise the domain would have the same content as the main server. In this example, it's the location of the files served for http://www.biguser.com/.
  • Next you can put information about individual data directories. The default server first supplies defaults for all directories:
    <Directory />
      Options FollowSymLinks
      AllowOverride None
    </Directory>
    

    The / in the first line indicates the local directory to which these settings should apply. For once, this is really the root directory and not DocumentRoot: they're system-wide defaults, and though you don't have to worry about apache playing around in your root file system, that's the only directory of which all other directories are guaranteed to be a subdirectory. The Options directive ensures that the server can follow symbolic links. Without this option, symbolic links would not work. We'll look at the AllowOverride directive in the discussion of the .htaccess file below.

    There's a separate entry for the data hierarchy:

    <Directory "/usr/local/www/data">
      Options Indexes FollowSymLinks MultiViews
      AllowOverride None
      Order allow,deny
      Allow from all
    </Directory>
    

    In this case, we have two additional options:

    • Indexes allows httpd to display the contents of a directory if no index file, with a name defined in DirectoryIndex, is present. Without this option, if there is no index file present, a request for the directory itself is refused, though you can still access individual files in it by name.
    • MultiViews allows content-negotiated multiviews, which we don't discuss here.

    Note that if you change the name of the default data directory, you should also change the name on the Directory invocation.

    We'll look at the remaining entries in more detail when we see them again in the discussion of the .htaccess file.

  • Normally you should set ServerName. For example, www.example.org is a CNAME for freebie.example.org (see page 370), and if you don't set this value, clients will access www.example.org, but the server will return the name freebie.example.org.
  • httpd can maintain two log files, an access log and an error log. We'll look at them in the next section. It's a good idea to keep separate log files for each domain.
  • You should have a default VirtualHost entry. People can get quite confused if they select an invalid name (for example, http://www.big-user.com) and get the (default) web page for http://www.example.org. The default page should not match any other host. Instead, it should indicate that the specified domain name is invalid.
  • For the same reason, it's a good idea to have a ServerAlias entry for the same domain name without initial www. The entry in the example above serves the same pages for http://www.biguser.com and http://biguser.com.
  • The directive Options +SymLinksIfOwnerMatch limits following symbolic links to those links whose target belongs to the same owner as the link. Normally the Options directive specifies all the options: it doesn't merge the default options. The + sign indicates that the option specified should be added to the defaults.
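The ownership rule behind SymLinksIfOwnerMatch can be tried out without apache. The following Python sketch only approximates the idea and is not how apache implements it: the helper name symlink_owner_matches is hypothetical, and it simply compares the owner of a symbolic link with the owner of the file that the link points to.

```python
import os
import tempfile


def symlink_owner_matches(path):
    """Return True if path may be followed under an owner-match rule.

    Hypothetical helper: anything that is not a symbolic link passes;
    a symbolic link passes only if the link and its target have the
    same owner (uid).
    """
    if not os.path.islink(path):
        return True
    try:
        link_owner = os.lstat(path).st_uid    # owner of the link itself
        target_owner = os.stat(path).st_uid   # owner of what it points to
    except OSError:
        return False                          # dangling link: nothing to serve
    return link_owner == target_owner


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        page = os.path.join(directory, "page.html")
        link = os.path.join(directory, "index.html")
        with open(page, "w") as handle:
            handle.write("<html></html>\n")
        os.symlink(page, link)
        # Both were created by the same user, so the link may be followed.
        print(symlink_owner_matches(link))
```

A link that one user creates to point at another user's file would fail this check, which is the case the option is meant to catch.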

After you restart apache, it handles any requests to http://www.biguser.com with these parameters. If you don't define a virtual host, the server will serve the main web pages (defined by the main DocumentRoot entry in /usr/local/etc/apache/access.conf).
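To see how ServerName, ServerAlias and a default entry work together, it can help to model the selection in a few lines of Python. This is a deliberately simplified sketch, not apache's matching code (which also considers addresses, ports and wildcards). The class VirtualHost, the function select_host, the host name invalid.example.org and the directory /usr/local/www/invalid are all assumptions made for the illustration. The first entry in the list plays the role of the default virtual host.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualHost:
    server_name: str
    document_root: str
    aliases: list = field(default_factory=list)


def select_host(hosts, requested_name):
    """Return the first host whose name or alias matches the request.

    If nothing matches, fall back to the first entry, which acts as
    the default virtual host.
    """
    wanted = requested_name.lower().rstrip(".")
    for host in hosts:
        names = [host.server_name] + host.aliases
        if wanted in (name.lower() for name in names):
            return host
    return hosts[0]


HOSTS = [
    # Hypothetical catch-all entry that reports an invalid domain name.
    VirtualHost("invalid.example.org", "/usr/local/www/invalid"),
    # The virtual host from the example above.
    VirtualHost("www.biguser.com", "/usr/local/www/biguser", ["biguser.com"]),
]

if __name__ == "__main__":
    for name in ("www.biguser.com", "biguser.com", "www.big-user.com"):
        print(name, "->", select_host(HOSTS, name).document_root)
```

In this model, both www.biguser.com and biguser.com select the same document root, while the mistyped www.big-user.com ends up at the catch-all entry rather than at another organization's pages.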

Log file format

httpd logs accesses and errors to the files you specify. It's worth understanding what's inside them. The following example shows six log entries. Normally each entry is on a single, very long line.

p50859b17.dip.t-dialin.net - -             name of system, more
[01/Nov/2002:07:06:12 +1030]               date of access
"GET /Images/yaoipower.jpeg HTTP/1.1"      HTTP command
200                                        status (OK)
19365                                      length of data transfer

aceproxy3.acenet.net.au - -
[01/Nov/2002:07:35:34 +1030]
"GET /Images/randomgal.big.jpeg HTTP/1.0"
304 -                                      status (cached)

218.24.24.27 - -                           system without reverse DNS
[01/Nov/2002:07:39:55 +1030]
"GET /scripts/root.exe?/c+dir HTTP/1.0"    looking for an invalid file
404 284                                    status (not found)

218.24.24.27 - -
[01/Nov/2002:07:39:56 +1030]
"GET /MSADC/root.exe?/c+dir HTTP/1.0" 404 282

218.24.24.27 - -
[01/Nov/2002:07:39:56 +1030]
"GET /c/winnt/system32/cmd.exe?/c+dir HTTP/1.0" 404 292

218.24.24.27 - -
[01/Nov/2002:07:40:00 +1030]
"GET /_vti_bin/..%255c../..%255c../..%255c../winnt/system32/cmd.exe?/c+dir HTTP/1.0"
404 323

The fields in the log file are separated by blanks, so empty entries are replaced by a - character. In this example, the second and third fields are always empty. They're reserved for the client's identity (as reported by an ident lookup) and the name of the authenticated user.
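Because the layout is so regular, an entry can be split into its fields with a short script. The following Python sketch assumes each entry is on one line in the layout shown above. The function name parse_entry and the field names (host, ident, user, time, request, status, size) are labels chosen for this illustration, not names used by httpd.

```python
import re
from datetime import datetime

# One access log entry: host, two usually empty fields, date, request,
# status and length of the data transfer.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

SAMPLE = ('p50859b17.dip.t-dialin.net - - [01/Nov/2002:07:06:12 +1030] '
          '"GET /Images/yaoipower.jpeg HTTP/1.1" 200 19365')


def parse_entry(line):
    """Split one log line into a dictionary, or return None if it doesn't match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # A "-" stands for an empty field.
    for key in ("ident", "user"):
        if entry[key] == "-":
            entry[key] = None
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = None if entry["size"] == "-" else int(entry["size"])
    return entry


if __name__ == "__main__":
    entry = parse_entry(SAMPLE)
    print(entry["host"], entry["status"], entry["size"])
```

Converting the status and length to integers makes it easy to sum transfer volumes or to filter on status codes later.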

To get the names of the clients, you need to specify the HostnameLookups on directive. This requires a DNS lookup for every access, which can be relatively slow.

Although we specified hostname lookups, the last four entries don't have any name: the system doesn't have reverse DNS. They come from a Microsoft machine infected with the Nimda virus and show an attempt to break into the web server. There's not much you can do about this virus; it will probably be years before it goes away. Apart from nuisance value, it has never posed any threat to apache servers.
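If you want to know how much of this noise your server receives, you can count such requests per client. The Python sketch below is a simple heuristic and not a security tool. It assumes one entry per line, and it treats the two program names seen in the example, cmd.exe and root.exe, as markers for a probe. The names PROBE_MARKERS and count_probes are made up for this illustration.

```python
import re
from collections import Counter

# Assumption: these two names, taken from the example entries, mark a probe.
PROBE_MARKERS = ("cmd.exe", "root.exe")

# Pick out the client, the quoted request and the status code.
REQUEST_PATTERN = re.compile(
    r'^(?P<host>\S+) .*?"(?P<request>[^"]*)" (?P<status>\d{3}) '
)

SAMPLE_LINES = [
    'p50859b17.dip.t-dialin.net - - [01/Nov/2002:07:06:12 +1030] '
    '"GET /Images/yaoipower.jpeg HTTP/1.1" 200 19365',
    'aceproxy3.acenet.net.au - - [01/Nov/2002:07:35:34 +1030] '
    '"GET /Images/randomgal.big.jpeg HTTP/1.0" 304 -',
    '218.24.24.27 - - [01/Nov/2002:07:39:55 +1030] '
    '"GET /scripts/root.exe?/c+dir HTTP/1.0" 404 284',
    '218.24.24.27 - - [01/Nov/2002:07:39:56 +1030] '
    '"GET /MSADC/root.exe?/c+dir HTTP/1.0" 404 282',
    '218.24.24.27 - - [01/Nov/2002:07:39:56 +1030] '
    '"GET /c/winnt/system32/cmd.exe?/c+dir HTTP/1.0" 404 292',
    '218.24.24.27 - - [01/Nov/2002:07:40:00 +1030] '
    '"GET /_vti_bin/..%255c../..%255c../..%255c../winnt/system32/cmd.exe?/c+dir HTTP/1.0" 404 323',
]


def count_probes(lines):
    """Count, per client, the failed requests that ask for a probe marker."""
    probes = Counter()
    for line in lines:
        match = REQUEST_PATTERN.match(line)
        if match is None:
            continue
        request = match.group("request").lower()
        not_found = match.group("status") == "404"
        if not_found and any(marker in request for marker in PROBE_MARKERS):
            probes[match.group("host")] += 1
    return probes


if __name__ == "__main__":
    for host, count in count_probes(SAMPLE_LINES).items():
        print(host, count)
```

For the six entries above, this reports four probes, all from 218.24.24.27.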