tosapol@hotmail.com

Squid Reverse-Proxy Servers


The Machines:

There are currently two Squid reverse-proxy servers acting as http accelerators in the office. One is a front-end for the USGS office web server, and the second accelerates the Trinet web server.

Squid is an open-source web cache server, funded by the National Science Foundation.

In http accelerator mode, Squid sits in front of the web server and receives the incoming http requests. It attempts to serve each request from its cache, and forwards a request to the back-end server only if the request is for a cgi script, or for a file that is not yet in the cache or whose expiration time has passed.
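
The decision described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Squid's actual code: the names CachedObject, is_cgi and decide are made up for this example, and the rule for spotting a cgi request (a "cgi-bin" path or a "?" in the URL) is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedObject:
    body: bytes
    expires_at: float  # absolute expiry time, seconds since the epoch


def is_cgi(path: str) -> bool:
    # Assumption: dynamic content is recognised by its path alone.
    return "cgi-bin" in path or "?" in path


def decide(path: str, cache: Dict[str, CachedObject], now: float) -> str:
    """Return "HIT" to serve from cache, or "FORWARD" to ask the back-end."""
    if is_cgi(path):
        return "FORWARD"  # script output is always fetched from the back-end
    entry: Optional[CachedObject] = cache.get(path)
    if entry is None:
        return "FORWARD"  # nothing cached for this path yet
    if entry.expires_at <= now:
        return "FORWARD"  # expiration time has passed
    return "HIT"


cache = {"/index.html": CachedObject(b"<html>...</html>", expires_at=100.0)}
print(decide("/index.html", cache, now=50.0))
print(decide("/cgi-bin/search", cache, now=50.0))
```

The first call is answered from the cache; the second goes to the back-end server because it is a cgi request.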

The Setup:

Both machines use AMD CPUs. The USGS Squid is an Athlon 1.2GHz, and the Trinet Squid is an Athlon 1GHz. Each has an Adaptec 29160 fast-wide SCSI controller and one 9GB disk. Each also has a 3COM 100Mb FastEthernet controller, and is connected to a 100Mb CITnet port. The USGS Squid has 768MB of memory, and the Trinet Squid has 256MB.

Both the USGS Squid and the Trinet Squid are running FreeBSD 4.2 and Squid 2.3-Stable4.

And before you start saying, "Hey, you can't use a PC as a heavy-duty server..." consider that Yahoo serves over 500,000,000 hits a day with FreeBSD servers.

Configuration:

The System Kernel

Both machines are configured essentially identically. Each is running a custom FreeBSD kernel which is tuned for high network performance. The kernel configuration for the USGS Squid is here. The config for the Trinet Squid is here. The essential features of this configuration are documented in comments in the file. To build a kernel for either machine:

usgs-squid# su
Password:
usgs-squid# cd /usr/src/sys/i386/conf
usgs-squid# config USGS-SQUID
Don't forget to do a ``make depend''
Kernel build directory is ../../compile/USGS-SQUID
usgs-squid# cd ../../compile/USGS-SQUID
usgs-squid# make depend
[make depend messages]
usgs-squid# make
[compilation messages deleted to save space]
loading kernel
   text    data     bss     dec     hex filename
1785635  118620  122728 2026983  1eede7 kernel
usgs-squid# make install

To make the new kernel live, it is necessary to reboot the machine:

usgs-squid# su
Password: ******
usgs-squid# shutdown -r now

The Network Interface

In order to get maximum network performance, the Ethernet interface is set for full-duplex operation. This raises the practical network saturation level to around 90Mb/s. This is done by adding "mediaopt full-duplex" to the ifconfig definition in /etc/rc.conf:

ifconfig_xl0="inet 131.215.66.193 netmask 255.255.255.0 media 100baseTX mediaopt full-duplex"

Another tweak for network performance is to decrease the value of TCP MSL (maximum segment lifetime) from 30 seconds to 3. This was recommended by Duane Wessels from the Squid development group. A closed connection stays in the TIME_WAIT state for twice the MSL, so a smaller value allows ports to be reused faster. On FreeBSD 4.x machines, this value can be adjusted on the fly:

sysctl -w net.inet.tcp.msl=3000

The units are milliseconds. To make this happen at boot time, create the file /etc/sysctl.conf with the line:

net.inet.tcp.msl=3000
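
A rough way to see what the MSL change buys is to work out how long a port stays tied up after a connection closes (TIME_WAIT lasts twice the MSL) and how many new connections per second that allows. The Python sketch below does the arithmetic. The port range of 1024 to 5000 is an assumption for illustration; check net.inet.ip.portrange.first and net.inet.ip.portrange.last on the actual machine. It also assumes the worst case, where Squid closes every connection to a single back-end address and port and so holds the TIME_WAIT state itself.

```python
def time_wait_seconds(msl_ms: int) -> float:
    """TIME_WAIT lasts twice the MSL; the sysctl value is in milliseconds."""
    return 2 * msl_ms / 1000.0


def max_new_connections_per_second(msl_ms: int, port_count: int) -> float:
    """Upper bound on sustained new outbound connections to one back-end
    address and port when every closed connection waits out TIME_WAIT."""
    return port_count / time_wait_seconds(msl_ms)


# Assumption: ephemeral ports 1024 through 5000 inclusive.
PORTS = 5000 - 1024 + 1

for msl in (30000, 3000):
    rate = max_new_connections_per_second(msl, PORTS)
    print(msl, time_wait_seconds(msl), round(rate, 1))
```

Under these assumptions the limit rises from roughly 66 new connections per second with a 30 second MSL to roughly 660 per second with a 3 second MSL.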

Squid

Squid is part of the FreeBSD Ports Collection. To build it:

usgs-squid# su
Password:
usgs-squid# cd /usr/ports/www/squid23
usgs-squid# make
usgs-squid# make install

Easy, no?

The only bit of customization done to Squid is to enable the SNMP agent. This is done by uncommenting the following line in the port's Makefile before building:

CONFIGURE_ARGS+= --enable-snmp

The Squid configuration file is located in /usr/local/etc/squid/squid.conf. A copy of it is here.

Softupdates and other performance tweaks

FreeBSD 4.x has a facility called softupdates in the kernel. This is a method for speeding up filesystem access by avoiding synchronous writes wherever possible. This is enabled on both Squid servers. To enable this feature [and this only needs to be done once] boot the machine in single-user mode:

    boot -s

    [kernel boot messages]

    # tunefs -n enable /home
    # tunefs -n enable [other filesystems]

The Trinet Squid server has softupdates enabled for /cache, /home, and /var. The documentation recommends against enabling this for the root filesystem.

Another optimization is to mount the /cache filesystem with the 'noatime' option. This stops the filesystem from updating a file's access time every time the file is read, which saves a disk write per read. The access times for files in the Squid cache are not important. This is enabled in /etc/fstab by adding 'noatime' to the 'options' field:

/dev/da0s1h             /cache          ufs     rw,noatime              2        2
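
For anyone unsure which column is the 'options' field, the short Python sketch below splits the line above into its six whitespace-separated fields and checks that noatime is among the options. The names FstabEntry and parse_fstab_line are made up for this example.

```python
from typing import List, NamedTuple


class FstabEntry(NamedTuple):
    device: str
    mountpoint: str
    fstype: str
    options: List[str]
    dump: int
    passno: int


def parse_fstab_line(line: str) -> FstabEntry:
    """Split one non-comment fstab line into its six fields."""
    device, mountpoint, fstype, options, dump, passno = line.split()
    return FstabEntry(device, mountpoint, fstype,
                      options.split(","), int(dump), int(passno))


entry = parse_fstab_line(
    "/dev/da0s1h             /cache          ufs     rw,noatime              2        2"
)
print(entry.mountpoint, entry.options, "noatime" in entry.options)
```

The options are comma-separated with no spaces, so 'noatime' is simply appended to whatever is already there (here, 'rw').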

Finally, the following three lines were added to /etc/sysctl.conf:

vfs.vmiodirenable=1
kern.ipc.somaxconn=4096
kern.maxfiles=65536

These adjustments were suggested in an article in Daemon News. They are described in the tuning(7) man page.
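
With settings coming from several places, it is easy to lose track of what actually ended up in /etc/sysctl.conf. The Python sketch below is one possible way to check a copy of the file against the values used on this page. The function names are made up for this example, and the sample text is not the real file from either server.

```python
from typing import Dict

# The four settings this page adds to /etc/sysctl.conf.
WANTED = {
    "net.inet.tcp.msl": "3000",
    "vfs.vmiodirenable": "1",
    "kern.ipc.somaxconn": "4096",
    "kern.maxfiles": "65536",
}


def parse_sysctl_conf(text: str) -> Dict[str, str]:
    """Parse name=value lines, ignoring blank lines and # comments."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def missing_or_different(text: str, wanted: Dict[str, str] = WANTED) -> Dict[str, str]:
    """Return the wanted settings that the file lacks or sets differently."""
    found = parse_sysctl_conf(text)
    return {name: value for name, value in wanted.items() if found.get(name) != value}


# Hypothetical file contents, for illustration only.
sample = """
# network tuning
net.inet.tcp.msl=3000
kern.ipc.somaxconn=1024
"""
print(missing_or_different(sample))
```

For the sample text, the result lists the three settings that still need attention: the two that are absent and kern.ipc.somaxconn, which is set to a different value.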

Other Software

The machines have the following other ports installed:

ucd-snmp - An snmp agent to report on network traffic.
analog - Logfile analysis.  Used for daily reports.
webalizer - Logfile analysis.  Used for longer-term reports.
rsync - Copies statistics reports to the back-end server for display.

The statistics are at http://pasadena.wr.usgs.gov/stats and http://www.trinet.org/stats.

Real-Time Statistics

MRTG is used to query the SNMP agents on these machines to get network and cache activity information.
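
SNMP interface counters such as ifInOctets are running totals, so a traffic graph has to be derived from the difference between two polls. The Python sketch below shows that calculation, including the wrap-around of a 32-bit counter. It illustrates the arithmetic only and is not MRTG's own code; the function names and the 300 second polling interval are assumptions.

```python
COUNTER32_MODULUS = 2 ** 32  # a Counter32 wraps back to zero at this value


def octets_per_second(prev: int, curr: int, interval_s: float) -> float:
    """Average rate between two polls, allowing for a single counter wrap."""
    delta = (curr - prev) % COUNTER32_MODULUS
    return delta / interval_s


def megabits_per_second(prev: int, curr: int, interval_s: float) -> float:
    """Same rate expressed in Mb/s (8 bits per octet, 10**6 bits per Mb)."""
    return octets_per_second(prev, curr, interval_s) * 8 / 1_000_000


# Counter advanced by 3,375,000,000 octets over a 300 second poll interval.
print(megabits_per_second(0, 3_375_000_000, 300))

# Counter wrapped between polls: 296 octets before the wrap, 704 after.
print(octets_per_second(4_294_967_000, 704, 10))
```

The first example works out to 90 Mb/s, about the saturation level mentioned earlier. At that rate a 32-bit octet counter wraps roughly every 380 seconds, so with 300 second polls it can wrap at most once between samples, which is the case the modulo handles.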
