There are currently two Squid reverse-proxy servers acting as HTTP accelerators in the office. One is a front end for the USGS office web server, and the second accelerates the Trinet web server.
Squid is an open-source web cache server, funded by the National Science Foundation.
The HTTP accelerator mode of Squid works by intercepting incoming HTTP requests. It attempts to serve each request from its cache, and forwards a request to the back-end server only if it is for a CGI script or for a file whose expiration time has passed.
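The forwarding rule described above can be sketched as a small shell function. This is only an illustration, not Squid's actual logic: the name route_request, the CGI naming patterns, and the use of epoch seconds for the expiration time are all assumptions. It also treats an object that is not cached yet as needing a trip to the back end.

```shell
#!/bin/sh
# Hypothetical helper, not Squid code.
#   $1 = request path
#   $2 = expiration time of the cached copy in epoch seconds
#        (empty string if the object is not in the cache)
#   $3 = current time in epoch seconds
# Prints "forward" or "cache".
route_request() {
    path=$1
    expires=$2
    now=$3
    case $path in
        # Assumed naming convention for CGI scripts.
        */cgi-bin/*|*.cgi|*.cgi\?*)
            echo forward
            return
            ;;
    esac
    if [ -z "$expires" ] || [ "$now" -ge "$expires" ]; then
        echo forward    # not cached yet, or the cached copy has expired
    else
        echo cache      # a fresh copy is available
    fi
}

# Usage:
#   route_request /index.html 200 100
```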
Both machines have AMD CPUs. The USGS Squid is an Athlon 1.2GHz, and the Trinet Squid is an Athlon 1GHz. Each has an Adaptec 29160 fast-wide SCSI controller and one 9GB disk. Each also has a 3COM 100Mb FastEthernet controller, and is connected to a 100Mb CITnet port. The USGS Squid has 768MB of memory, and the Trinet Squid has 256MB.
Both servers are running FreeBSD 4.2 and Squid 2.3-Stable4.
And before you start saying, "Hey, you can't use a PC as a heavy-duty server..." consider that Yahoo serves over 500,000,000 hits a day with FreeBSD servers.
Both machines are configured essentially identically. Each is running a custom FreeBSD kernel which is tuned for high network performance. The kernel configuration for the USGS Squid is here. The config for the Trinet Squid is here. The essential features of this configuration are documented in comments in the file. To build a kernel for either machine:
usgs-squid# su
Password:
usgs-squid# cd /usr/src/sys/i386/conf
usgs-squid# config USGS-SQUID
Don't forget to do a ``make depend''
Kernel build directory is ../../compile/USGS-SQUID
usgs-squid# cd ../../compile/USGS-SQUID
usgs-squid# make depend
[make depend messages]
usgs-squid# make
[compilation messages deleted to save space]
loading kernel
text data bss dec hex filename
1785635 118620 122728 2026983 1eede7 kernel
usgs-squid# make install

To make the new kernel live, it is necessary to reboot the machine.

usgs-squid# su
Password: ******
usgs-squid# shutdown -r now
To get maximum network performance, the Ethernet interface is set for full-duplex operation. This raises the practical network saturation level to around 90Mb/s. This is done by adding "mediaopt full-duplex" to the ifconfig definition in /etc/rc.conf:
ifconfig_xl0="inet 131.215.66.193 netmask 255.255.255.0 media 100baseTX mediaopt full-duplex"
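After a reboot, one way to confirm the setting took effect is to look at the media line that ifconfig prints for the interface. The following is a minimal sketch: the function name is_full_duplex is made up for illustration, and it assumes the media line has the usual FreeBSD form "media: Ethernet 100baseTX <full-duplex>".

```shell
#!/bin/sh
# Hypothetical helper: reads `ifconfig <interface>` output on stdin and
# succeeds (exit status 0) if the media line reports full-duplex.
is_full_duplex() {
    grep -q 'media:.*full-duplex'
}

# Usage on the server itself (xl0 is the interface named in rc.conf):
#   ifconfig xl0 | is_full_duplex && echo "xl0 is running full-duplex"
```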
Another tweak for network performance is to decrease the TCP MSL (maximum segment lifetime) from 30 seconds to 3, as recommended by Duane Wessels of the Squid development group. A closed connection lingers in the TIME_WAIT state for twice the MSL, so a smaller value allows ports to be reused faster. On FreeBSD 4.x machines, this value can be adjusted on the fly:

sysctl -w net.inet.tcp.msl=3000

The units are milliseconds. To make this happen at boot time, create the file /etc/sysctl.conf with the line:
net.inet.tcp.msl=3000
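If the file is maintained by a script rather than by hand, it helps to make the change repeatable so that running it twice does not leave duplicate lines. A minimal sketch follows; set_conf is a hypothetical helper name, not a FreeBSD tool, and it assumes one key=value setting per line.

```shell
#!/bin/sh
# Hypothetical helper: set_conf FILE KEY VALUE
# Ensures FILE ends up with exactly one "KEY=VALUE" line for KEY,
# leaving all other lines (including comments) untouched.
set_conf() {
    file=$1
    key=$2
    value=$3
    touch "$file"
    # Copy every line whose key differs, then append the new setting.
    awk -F= -v k="$key" '$1 != k' "$file" > "$file.tmp"
    echo "${key}=${value}" >> "$file.tmp"
    mv "$file.tmp" "$file"
}

# Usage (as root); the value is 3 seconds expressed in milliseconds:
#   set_conf /etc/sysctl.conf net.inet.tcp.msl $((3 * 1000))
```

The arithmetic expansion in the usage line only makes the seconds-to-milliseconds conversion explicit; writing 3000 directly is equivalent.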
Squid is part of the FreeBSD Ports Collection. To build it:
usgs-squid# su
Password:
usgs-squid# cd /usr/ports/www/squid23
usgs-squid# make
usgs-squid# make install

Easy, no?
The only bit of customization done to Squid is to enable the SNMP agent. This is done by uncommenting the following line in the makefile:
CONFIGURE_ARGS+= --enable-snmp
The Squid configuration file is located in /usr/local/etc/squid/squid.conf. A copy of it is here.
FreeBSD 4.x has a facility called softupdates in the kernel. This is a method for speeding up filesystem access by avoiding synchronous writes wherever possible. This is enabled on both Squid servers. To enable this feature [and this only needs to be done once] boot the machine in single-user mode:
boot -s
[kernel boot messages]
# tunefs -n enable /home
# tunefs -n enable [other filesystems]

The Trinet Squid server has softupdates enabled for /cache, /home, and /var. The documentation recommends against enabling this for the root filesystem.
Another optimization is to mount the /cache filesystem with the 'noatime' option. This turns off marking file access times for file reads. The access times for files in the Squid cache are not important. This is enabled in /etc/fstab by adding 'noatime' to the 'options' field:
/dev/da0s1h /cache ufs rw,noatime 2 2
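A quick way to check that an fstab entry really carries the option is to parse the fourth (options) field. The sketch below assumes the standard whitespace-separated fstab layout; has_mount_option is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: has_mount_option MOUNTPOINT OPTION
# Reads fstab-format text on stdin and succeeds (exit status 0) if the
# entry for MOUNTPOINT lists OPTION in its comma-separated options field.
has_mount_option() {
    awk -v mp="$1" -v opt="$2" '
        $1 ~ /^#/ { next }              # skip comment lines
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == opt) found = 1
        }
        END { exit !found }
    '
}

# Usage:
#   has_mount_option /cache noatime < /etc/fstab && echo "noatime is set"
```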
Finally, the following three lines were added to /etc/sysctl.conf:
vfs.vmiodirenable=1
kern.ipc.somaxconn=4096
kern.maxfiles=65536

These adjustments were suggested in an article in Daemon News. They are described in the tuning man page.
The machines have the following other ports installed:
ucd-snmp - An snmp agent to report on network traffic.
analog - Logfile analysis. Used for daily reports.
webalizer - Logfile analysis. Used for longer-term reports.
rsync - Copies statistics reports to the back-end server for display.

The statistics are at http://pasadena.wr.usgs.gov/stats and http://www.trinet.org/stats.
MRTG is used to query the SNMP agents on these machines to
get network and cache activity information.