100 Mbps Advanced Packet Vault

Operations Guide

Center for Information Technology Integration
February 2002

Introduction

This document assists in configuring, compiling, installing, and running the 100 Mbps Advanced Packet Vault (APV). The reader is assumed to be familiar with Linux and its components as well as administering a conventional Linux system.

Project Overview

During the second six months of our three-year project, we have developed a 100 Mbps APV. The vault writes all captured IP packets to long-term magnetic tape storage for later analysis and for evidentiary purposes. Performance and reliability have been achieved by using high-capacity commodity hardware, an open-source operating system, and mass storage. The goal is to permit the vault to archive all traffic found on a fully loaded 100 Mbps network segment, and to run for extended periods of time without constant supervision. See Antonelli et al. [ACF01] for more details about the 10 Mbps APV, which shares the same architecture.

Although we have done some testing of the APV on a saturated 100 Mbps Ethernet network, we cannot be sure that it will not fail after extended periods of extremely heavy traffic. This is still a preliminary version of the 100 Mbps APV, and unforeseen problems are possible.

Hardware Environment

The APV implementation discussed herein has been developed and tested on a high-performance commodity Intel host platform running Linux. The platform is composed of the following components:

Intel STL2 Server Board	Dual 866 MHz Pentium III processors, dual PCI (33 MHz/32 bit, 66 MHz/64 bit)
Adaptec AIC-7899 U160 (on-board STL2)	Dual-port SCSI controller; provides separate Ultra160 and Ultra Wide SCSI channels
Intel EtherExpress Pro100+ (on-board STL2)	10/100 Mbps PCI Ethernet controller
512 MB RAM (on-board STL2)	Two Kingston KVR133X72RC3/256-IS PC-133 256 MB memory modules
Adaptec 29160 SCSI Controller	Single-channel Ultra160 SCSI controller
Two 35 GB SCSI Disks	Seagate ST336704LWV Cheetah 35 GB SCSI3 disks
15 GB IDE Disk	Western Digital WD153AA Caviar 15 GB 5400 RPM Fast ATA/Enhanced IDE compatible
Qualstar Tape Library	Qualstar TLS-8211 tape library subsystem with one HP Model 230 LTO tape drive, eleven media slots, a media changer, and an I/O port

The SCSI disks are connected to the Ultra160 channel of the on-board SCSI controller, and the tape library changer and LTO tape drive are connected to the Adaptec 29160 SCSI controller. The SCSI disks are used exclusively for buffering packets before they are written to tape, storing packets retrieved from tape, and for the APV database; the IDE disk is used for all other permanent storage. Not shown in the above table are standard components including a monitor, keyboard, mouse, and floppy and CD-ROM drives.

Software Environment

We recommend that you read the entire procedure before beginning any installation to get an overall understanding of the process.

Installing Linux

The 100 Mbps vault runs on RedHat Linux 7.2. We have attempted to maintain compatibility with both OpenBSD and FreeBSD, but the directions below apply only to Linux. There are a number of performance problems that prevent the APV from obtaining reliable 100 Mbps performance under the other two operating systems. Nevertheless, if you wish to attempt to run the vault under OpenBSD or FreeBSD, you may use the (out-of-date) instructions in the file apv/doc/ops10.html as a starting point.

Therefore, begin by installing RedHat 7.2. We recommend a workstation installation with KDE and software development package groups.

The SCSI disk partitions will be created later, but the other partitions are created during the install process. The vault software itself, and the packages it depends on, are not particularly large, so any partitioning scheme that leaves enough space for an ordinary workstation installation should be adequate for the vault. However, the vault writes a lot of data to log files in /var/log, so we advise setting aside a separate /var partition of at least 1 GB.

After verifying correct system operation, you may proceed by making APV kernel and system program modifications; obtaining, building, and installing prerequisite packages; and compiling the APV user programs.

Note that throughout these instructions, it is assumed that you have /sbin in your path, which is not the default under this Linux distribution; add /sbin to your path, or be prepared to prepend it to command names as necessary.

Unpacking the APV Source Distribution

Choose a directory into which the APV source distribution is to be unpacked. For example, to unpack the distribution into directory /usr/apv, type the following with root authority:

cd /usr
tar zxf apv100.tar.gz

In the remainder of this document, file and directory names beginning with apv are assumed to refer to the root of the APV source distribution (/usr/apv in this example).

The top level of the APV software distribution is organized as follows:

bin Various kernel modifications required for the APV.

bpf Utility routines for reading BPF-format data.

crypto Utility routines for encrypting and decrypting data.

crypto/rijndael Optimized ANSI C implementation of AES. (Not used by the vault, by default.)

crypto/gladman Brian Gladman's AES implementation in assembler for processors compatible with the Pentium family. (This is the default implementation used by the vault.)

decrypt User programs to decrypt data from the vault.

des DES implementation in i386 assembler. (Not used by the vault, by default.)

doc APV documentation and sample system configuration files.

dump User program to encrypt data.

listen User program to read data from network.

pilot Scripts for running the APV.

sys APV Open/FreeBSD kernel modifications.

util Utility functions and programs.

Building a Modified Linux Kernel

We assume that you know how to patch, build and install a new kernel, configure and install modules, and so forth.

The vault has been tested on Linux kernel versions 2.4.16 and 2.4.17; other kernel versions may work, but kernels earlier than 2.4 almost certainly will not.

First, you must download and apply an appropriate version of the SCSI media changer patch from http://bytesex.org/changer.html. The patches for kernel versions 2.2.15 and 2.4.6 are included in the tar file for version 0.16 of the SCSI media changer package. You'll need the user-level programs in this package later. There are patches for later kernel levels at http://bytesex.org/patches. Obtain and apply the appropriate patch for your kernel; we used version 0.18 of the kernel driver patch found there. This step must be completed before proceeding to the kernel configuration step below.

After applying the SCSI changer patch, apply the patch found in apv/sys/linux_scsi_ch_fix.diff; this corrects a problem that occasionally prevents the APV from obtaining information about the bar codes on tapes in the tape library.

Finally, configure, build, and install a kernel. In addition to making sure the new kernel is properly configured for your hardware, make sure the following options are selected:

Multi-device support/RAID support, Multi-device support/RAID-0 (striping) mode: on our hardware, we currently require a software RAID partition to get adequate disk performance. To simplify detection of RAID partitions at boot time, do not build this support as a module.
Networking options/Packet socket, Networking options/mmapped IO: both options must be checked for the vault to run; this allows the "listen" program to use shared memory to read packets from the network interface.
SCSI support/SCSI media changer support: if this option is not available, then you need to apply the SCSI changer patch mentioned above.
File systems/Virtual memory file system support: the vault uses this file system to buffer packet data that is waiting to be encrypted.

Building and Installing Modified System Utilities

If you did not obtain the user-level portion of the SCSI media changer package when getting the kernel patches above, obtain it now from http://bytesex.org/changer.html. We performed our testing with version 0.16 of the user-level package. Follow the instructions in the README and INSTALL files contained in the tar file to compile the mover program. The APV includes a Perl script, apv/pilot/chio-mover, that emulates enough of the chio command found on BSD Unix systems for APV operation. Unless you install the mover command somewhere within your path, you'll need to modify the chio-mover script to reference the mover program that you built above.

In addition, you'll need the mt command, which is provided via an RPM. Obtain and install the correct RPM for your system. We used the version found here.

Obtaining and Building Prerequisites

Before using the APV code, it is necessary to obtain and install three prerequisite packages, all freely available (nasm under the GPL, libnet under a permissive BSD-like license, and Time::HiRes under the standard Perl license).

nasm: The nasm assembler is needed to assemble Brian Gladman's AES implementation. Download the latest source tarball from http://www.octium.net/nasm/?page=download. (The Redhat distribution includes a version of nasm, but the version in that rpm has a bug that prevents assembly of the AES code; so first uninstall Redhat's version if necessary.) As of this writing, the source tarball contained some minor errors which prevented compiling nasm properly, but the following procedure should work:

tar -xzvf nasm-0.98.08.tar.gz
cd nasm-0.98.08
rm config.cache
sh configure && make && make install

By default, this installs nasm in /usr/local/bin.

libnet: This is needed to compile the testing program spit. Download the latest version from http://www.packetfactory.net/Projects/libnet and install using the standard configure, make, make install procedure.
Time::HiRes: This is a Perl module needed to get accurate timings to measure the performance of the vault. Download it from, e.g., http://www.cpan.org/authors/id/D/DE/DEWEG/Time-HiRes-01.20.tar.gz, and install it using

tar -xzvf Time-HiRes-01.20.tar.gz
cd Time-HiRes-01.20
perl Makefile.PL && make && make test && make install

Compiling the APV User Programs

A Makefile is provided at the top level of the APV source distribution in the directory apv. Compile the APV user program components by issuing the following commands:

cd apv
make

Preparing to Run the APV

Several steps are necessary to prepare an APV machine to run for the first time. All commands should be executed with root authority.

Set Up Directories and Filesystems

The vault needs a very fast disk partition to buffer packet data while it waits to be written to tape. For this purpose, we use a RAID partition. To set up this virtual partition, first make two physical partitions of approximately equal size, one on each SCSI disk, and assign each the type 0xfd (253 decimal). The combined size of the two partitions should be at least 5 GB, to allow enough space for incoming data to accumulate while tapes are rewound and reloaded. More space would be preferable (we use partitions totalling approximately 20 GB).

Next, create an /etc/raidtab file. A sample is provided in apv/doc; at a minimum, the device names in that file must be replaced with the partitions you have set aside. Issue the command

mkraid /dev/md0

to use the settings in /etc/raidtab to create a new block device called /dev/md0. Create a new filesystem on this device using

mke2fs -b4096 -Rstride=8 /dev/md0

Finally, create a mount point named /scratch0 and mount the RAID filesystem with the command

mount /dev/md0 /scratch0

Now you should be able to access the RAID filesystem.

Since the physical partitions were created with type fd, the Linux kernel will automatically recognize them as RAID partitions and create /dev/md0 at boot; with an appropriate entry in /etc/fstab, /dev/md0 will thus be mounted at /scratch0 automatically. However, the Redhat system initialization scripts contain some unnecessary code that may interfere with this process; to fix this, edit the file /etc/rc.d/rc.sysinit to delete (or comment out) the section referring to raid devices, beginning with the comment on line 465 and ending with the "fi" on line 532.

The vault also needs some directories and files set up in /scratch0:

cd /scratch0
mkdir volumes retrieved fileDB
touch fileDB/firstgen

Finally, the vault needs a directory /mfs where it will mount an in-memory file system:

mkdir /mfs

Remove Daemons and Periodic Tasks

Turn off any unnecessary system services to eliminate the chance that they might consume resources needed by the APV or serve as targets for attacks against the APV:

chkconfig gpm off
chkconfig kudzu off
chkconfig lpd off
chkconfig netfs off
chkconfig portmap off
chkconfig xinetd off
chkconfig autofs off
chkconfig isdn off
chkconfig sendmail off
chkconfig wine off

Similarly, turn off any unnecessary cron jobs; this can be done by removing the corresponding files from /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly; we recommend removing the following:

/etc/cron.daily/makewhatis.cron
/etc/cron.daily/slocate.cron
/etc/cron.daily/rpm
/etc/cron.weekly/makewhatis.cron

Set Up Rotation of APV Log Files

To ensure that the log files created by the APV are rotated regularly, copy the file apv/doc/apv_logrotate to the directory /etc/logrotate.d/.

Ethernet Device Name

The Ethernet device name is defined as eth0 in the APV software distribution. For host platforms whose components differ from ours (see the "Hardware Environment" section above) the name of the Ethernet device may be different, and will require a change to a script.

To determine the name of the network interface, use the command

netstat -in

to list the available network interfaces. Once this has been determined, change the apv/pilot/pilot.pl script to use the new name:

ed apv/pilot/pilot.pl
/eth0/
s/eth0/newname/
w
q

Filesystem Tuning

Under Linux, the default parameters which control when cached data is written to disk tend to make the system write out very large chunks of data at once. These large writes can prevent the APV from streaming data to the tape at reliable rates. To change these parameters, add the following line to the end of the file /etc/rc.d/rc.local:

echo "10 0 0 0 50 100 20 0 0">/proc/sys/vm/bdflush

and the changes will take effect at the next reboot.

Reboot System

Reboot your system to start the new kernel and remove the unnecessary daemons. Enter the following command with root authority:

reboot

Inspect the boot log by typing

dmesg

and examining the output for signs that the SCSI subsystem has recognized the tape library changer, the tape drive, and the RAID partition.

If all is well, run the command 'apv/pilot/chio-mover status'. You should see output of the form:

picker 0: voltag: <:0>
slot 0: <ACCESS,FULL> voltag: <A0000002:0>
slot 1: <ACCESS,FULL> voltag: <A0000004:0>
slot 2: <ACCESS> voltag: <:0>
slot 3: <ACCESS> voltag: <:0>
slot 4: <EXCEPT,ACCESS,FULL> voltag: <:0>
slot 5: <ACCESS> voltag: <:0>
slot 6: <ACCESS> voltag: <:0>
slot 7: <ACCESS> voltag: <:0>
slot 8: <ACCESS> voltag: <:0>
slot 9: <ACCESS> voltag: <:0>
slot 10: <ACCESS> voltag: <:0>
portal 0: <INENAB,EXENAB,ACCESS> voltag: <:0>
drive 0: <FULL> voltag: <A0000001:0>
drive 1: <FULL> voltag: <A0000003:0>

The string within the angle brackets following voltag: is the bar-code label of the corresponding tape. The string contains the primary volume tag and the alternate volume tag, separated by a colon. For the APV, the alternate volume tag is always zero. The EXCEPT flag for the tape in slot 4 indicates that the tape has an unreadable bar code, or no bar code at all.
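The status format described above is regular enough to parse mechanically. The following Python sketch is not part of the APV distribution; it assumes the output looks exactly like the sample shown, and the function names are hypothetical. It splits each line into element kind, index, flags, and the two volume tags, then reports elements flagged EXCEPT.

```python
import re

# One status line, e.g. "slot 4: <EXCEPT,ACCESS,FULL> voltag: <:0>".
# The flag list is optional (the picker line in the sample has none).
STATUS_LINE = re.compile(
    r"^(?P<kind>\w+) (?P<index>\d+): "
    r"(?:<(?P<flags>[^>]*)> )?"
    r"voltag: <(?P<primary>[^:>]*):(?P<alternate>[^>]*)>$"
)


def parse_status(text):
    """Return one dict per recognized element line of the status output."""
    elements = []
    for line in text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognized lines
        flags = match.group("flags")
        elements.append({
            "kind": match.group("kind"),
            "index": int(match.group("index")),
            "flags": set(flags.split(",")) if flags else set(),
            "primary": match.group("primary"),      # bar-code label
            "alternate": match.group("alternate"),  # always 0 for the APV
        })
    return elements


def unreadable_labels(elements):
    """Return (kind, index) for every element flagged EXCEPT."""
    return [(e["kind"], e["index"]) for e in elements if "EXCEPT" in e["flags"]]


SAMPLE = """\
picker 0: voltag: <:0>
slot 0: <ACCESS,FULL> voltag: <A0000002:0>
slot 4: <EXCEPT,ACCESS,FULL> voltag: <:0>
drive 0: <FULL> voltag: <A0000001:0>
"""

if __name__ == "__main__":
    for kind, index in unreadable_labels(parse_status(SAMPLE)):
        print(f"{kind} {index}: bar code missing or unreadable")
```

A check of this kind may be useful after loading tapes, since a tape without a readable bar code cannot be identified by its label.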

Load Tapes

Fill the tape library with fresh tapes to which unique bar-codes have been affixed, and prepare a supply of bar-coded tapes to be used to replenish the library while running.

Create Master Key

You will need to create a vault master key (actually, a public/private keypair), which is used in the cryptographic organization of the vault. Data may be retrieved en masse from the vault archives with the master key. Consequently, the master key is itself protected by a passphrase that you choose at the time of generation. You should keep both the master key and the passphrase protected in a safe place; loss of either will prevent retrieval of any data from the vault archives.

The master key is identified by the User-ID you select for it (see below). It is important that you select a name that is distinct from that of all other vaults. For this reason, specify an email address using the fully-qualified domain name of the host platform.

Become root, and ensure that any previous keypairs are deleted:

/bin/rm -rf /root/.gnupg

Create the master key:

gpg
gpg --gen-key

The first invocation initializes things; the second creates the keypair. When asked, select the default kind of key (DSA and ElGamal); the default key size (1024 bits); a key lifetime of 0 (does not expire); and for the key User-ID specify real name "apv10", email address "apv10@apv.your.domain.name", and comment "kv1". Next, choose a passphrase to protect the key. The passphrase will be required for retrieving vault data.

After the keys are generated, you can display some information about them using:

gpg --list-public-keys
gpg --list-secret-keys

Starting the APV

The APV code is divided into two subsystems. The listener subsystem reads packets, encrypts them, and stores them to disk in units of 1 GB volumes. The archiver subsystem reads volumes from the disk and writes them to magnetic tape. The subsystems are largely independent, and are started and stopped independently. To start the APV, first start the listener by issuing the following commands with root authority in a console or X window:

cd apv/pilot
./pilot.csh

Start the archiver by issuing the following commands with root authority in another window:

cd apv/pilot
./archiver.csh

Stopping the APV

The usual way of stopping the APV involves shutting down the listener subsystem while the archiver subsystem is left running. This is accomplished by typing Ctrl-C in the listener window. This shuts down the flow of packets into the APV while the archiver continues to write volumes to tape. The APV may be restarted at any time by restarting the listener subsystem:

cd apv/pilot
./pilot.csh

The archiver will suspend when there are no more volumes to write to the tape, and will resume when more volumes appear.

The archiver may be stopped by typing Ctrl-C in its window. The recommended way of stopping the APV is to stop the listener first, wait until all data have been written to tape, and then stop the archiver. If the archiver is writing a volume to tape when it is requested to stop, it will not stop immediately but will finish writing the volume to the tape first. This process can take several minutes; the archiver will display progress messages in its window. At exit, the archiver unloads the tape from the drive and stores it in the library. Any volumes waiting to be written to tape will be written to a new tape when the archiver is next started.

To assist in determining whether a volume is being written or if any volumes are waiting to be written, execute the following commands with root authority:

cd apv/util
./apvsync status

A volume with status "FILLED" is waiting to be written to tape, but the write operation has not begun; status "DRAINING" indicates the volume is being written to tape.

Monitoring the APV

The pilot.csh and archiver.csh commands start four main processes:

listen is responsible for sniffing packets from the network and writing them out to files (called segments) in the memory file system mounted at /mfs.
pkt_dump reads the segments written by listen, encrypts them, and writes the results to disk.
pilot.pl oversees the operation of listen and pkt_dump, and gathers the segments written by pkt_dump into volume directories (in /scratch0/volumes/).
archiver.pl writes complete volumes out to tape.

These processes write output to files in /var/log named, respectively, listen.log, pkt_dump.log, pilot.log, and archiver.log. Running tail -f on each of these files in four separate windows will give a good picture of the operation of the vault. All log file entries are self-explanatory, with the exception of listen.log, whose entries are written one per segment and may be interpreted as follows:

dt	Time over which this segment was collected (secs).
pr:t(+s)	Packets read (t = total for this listener so far, s = packets in this segment only).
br:t(+s)	Bytes read (t = total for this listener so far, s = bytes in this segment only).
pd:t(+s)	Packets dropped (t = total for this listener so far, s = packets dropped in this segment only).
Bps	Bytes/sec processed for this segment.
Mbps	Mbits/sec processed for this segment.
mf	Megabytes of free space in /mfs.
uf	Megabytes of free space in /scratch0.

The latter two numbers are particularly useful for monitoring the state of the vault. In normal operation, the space used in each filesystem should be relatively stable. However, if the vault encounters traffic beyond its ability to keep up, /mfs will fill up as pkt_dump lags behind, and /scratch0 will fill up as archiver.pl lags behind.

To make sure the logs are regularly rotated, a file should be added to the /etc/logrotate.d directory; an example is provided in apv/doc/apv_logrotate. See the logrotate man page for more information.

Database

The APV maintains a flat-file database /scratch0/fileDB/firstgen describing the volumes written to tape. Each volume is described by a record written to the end of the file after it has been committed to tape. The record contains the following fields:

volid	The volume ID of the volume.
tapeid	The ID (bar-code number) of the tape on which the volume was written.
sequence#	Sequence number of the volume on the tape (the first volume of the tape has sequence number one).
starttime	Epoch at which the first packet of the first segment in the volume was returned by BPF.
endtime	Epoch at which the last packet of the last segment in the volume was returned by BPF.
segments	The number of segments stored in the volume.
lis_pkts	Number of packets in the volume read from the interface.
lis_pkts_drop	Number of packets in the volume dropped before being read from the interface.
lis_bytes	Number of bytes in the volume read from the interface.
lis_bytes_snap	Currently unused, and always zero.
lis_bytes_drop	Currently unused, and always zero.
lis_bytes_per_sec	Bytes/sec processed by the listener for the volume, averaged over all segments.
dmp_pkts	Number of packets read by the dumper for the volume.
dmp_pkts_written	Number of packets written by the dumper to the volume.
dmp_bytes	Number of bytes read by the dumper for the volume.
dmp_bytes_written	Number of bytes written by the dumper to the volume.
dmp_bytes_per_sec	Bytes/sec processed by the dumper for the volume, averaged over all segments.

Recovering Data from the APV

To recover data from the APV, first determine from which tape(s) data are to be recovered. Manual inspection of the database /scratch0/fileDB/firstgen can, for example, determine the tapeids of tapes that contain volumes gathered between epochs of interest. In the future an automated tool will be provided to assist in this task.
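As an illustration of the manual inspection described above, the following Python sketch selects the tapes whose volumes overlap a time window. It is not the automated tool mentioned above. It rests on assumptions the guide does not state: that each record occupies one line, that fields are separated by whitespace and appear in the order listed in the "Database" section, and that starttime and endtime are numeric epoch seconds. The function names are hypothetical.

```python
# Field order as listed in the "Database" section of this guide.
FIELDS = [
    "volid", "tapeid", "sequence", "starttime", "endtime", "segments",
    "lis_pkts", "lis_pkts_drop", "lis_bytes", "lis_bytes_snap",
    "lis_bytes_drop", "lis_bytes_per_sec", "dmp_pkts", "dmp_pkts_written",
    "dmp_bytes", "dmp_bytes_written", "dmp_bytes_per_sec",
]


def parse_record(line):
    """Split one database line into a field-name -> string mapping.

    Assumption: whitespace-separated fields, one record per line.
    """
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def tapes_covering(lines, start_epoch, end_epoch):
    """Return sorted tape IDs holding volumes that overlap the window."""
    tapes = set()
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        record = parse_record(line)
        starts_before_end = float(record["starttime"]) <= end_epoch
        ends_after_start = float(record["endtime"]) >= start_epoch
        if starts_before_end and ends_after_start:
            tapes.add(record["tapeid"])
    return sorted(tapes)


if __name__ == "__main__":
    import sys
    # Usage (hypothetical): script.py START_EPOCH END_EPOCH < firstgen
    start, end = float(sys.argv[1]), float(sys.argv[2])
    for tapeid in tapes_covering(sys.stdin, start, end):
        print(tapeid)
```

If the actual record layout differs, only FIELDS and parse_record need to change; the overlap test itself does not depend on the layout.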

Data may be recovered from a tape by issuing the following commands with root authority:

cd apv/pilot
./retrieve.pl [--tapeid=id --firstvol=first --lastvol=last]

where tapeid is the ID from the label of the desired tape, firstvol is the sequence number of the first volume to be recovered (there will be at most 100 volumes per tape; the first volume has sequence number one), and lastvol is the sequence number of the last volume. If the last two parameters are omitted, all volumes on the tape are restored.

Alternatively, if the retrieve.pl command is given without the tapeid parameter, you will be prompted to enter the tape id and, optionally, the starting and ending volume numbers. When the specified tape has been processed, you will be prompted for another.

retrieve.pl will demand the passphrase used to encrypt the vault master key; this must be entered twice to prevent accidental mistyping. Terminal echo is turned off.

The specified tape will be loaded, and the specified volume(s) will be copied from tape, decrypted, and stored in directory /scratch0/retrieved. Each volume directory will consist of a number of decrypted segment packet files (named :n.d, where n denotes the epoch at which the last packet was returned by BPF for the segment, and d disambiguates files written at the same epoch) and segment descriptor files (named x:n.d). Each segment packet file is in tcpdump format and may be viewed with the command

tcpdump -r :n.d

The associated descriptor file is a text file giving more details about the data in the segment:

starttime Epoch at which the first packet was read from the interface.

endtime Epoch at which the last packet was read from the interface.

lis_pkts Number of packets read from the interface.

lis_pkts_drop Number of packets dropped before being read from the interface.

lis_bytes Number of bytes read from the interface.

lis_bytes_snap Currently unused, and always zero.

lis_bytes_drop Currently unused, and always zero.

lis_bytes_per_sec Bytes/sec processed by the listener.

dmp_pkts Number of packets read by the dumper.

dmp_pkts_written Number of packets written by the dumper.

dmp_bytes Number of bytes read by the dumper.

dmp_bytes_written Number of bytes written by the dumper.

dmp_bytes_per_sec Bytes/sec processed by the dumper.
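The naming scheme for retrieved files (:n.d for packet files, x:n.d for descriptor files) makes it straightforward to pair each packet file with its descriptor. The following Python sketch illustrates one way to do so; it assumes that n and d are decimal integers, which the guide does not state explicitly, and the function names are hypothetical.

```python
import re

# ":n.d" is a segment packet file, "x:n.d" its descriptor file.
# Assumption: n (epoch) and d (disambiguator) are decimal integers.
SEGMENT_NAME = re.compile(r"^(?P<prefix>x?):(?P<epoch>\d+)\.(?P<serial>\d+)$")


def parse_segment_name(name):
    """Return a dict describing a segment file name, or None if no match."""
    match = SEGMENT_NAME.match(name)
    if match is None:
        return None
    return {
        "is_descriptor": match.group("prefix") == "x",
        "epoch": int(match.group("epoch")),
        "serial": int(match.group("serial")),
    }


def pair_segments(names):
    """Map (epoch, serial) to its packet and descriptor file names."""
    pairs = {}
    for name in names:
        info = parse_segment_name(name)
        if info is None:
            continue  # not a segment file
        key = (info["epoch"], info["serial"])
        slot = "descriptor" if info["is_descriptor"] else "packets"
        pairs.setdefault(key, {})[slot] = name
    return pairs


if __name__ == "__main__":
    import os
    import sys
    # Usage (hypothetical): script.py /scratch0/retrieved/<volume directory>
    pairs = pair_segments(os.listdir(sys.argv[1]))
    for key in sorted(pairs):  # chronological order by epoch, then serial
        print(key, pairs[key])
```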

It will not be possible to use the APV for writing volumes to tape while it is being used to recover data, although the APV listener subsystem can be running concurrently. However, resource contention between the listener and the retrieval process could force the listener to stop.

Understanding and Recovering from APV Errors

Exceptional APV conditions and suggested recovery procedures are summarized below.

Packet Overrun

If the packet input rate is very high, the APV may drop packets. This is indicated in the listener log file /var/log/listen.log by nonzero "+" values in the segment pd: (packets dropped) entries, and by the volume summary entries in the database. Such nonzero entries serve as evidence that the vault did not capture all packets on the network while collecting data for the specified segment, but vault operations are not interrupted by such overruns. We have not seen packet drops during our testing of the vault, using 1500-byte packets at 93 Mbps; we were unable to drive the network at greater speeds.
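To judge how severe an overrun was, the dropped-packet count can be expressed as a fraction of all packets seen. The Python sketch below is illustrative only. It assumes that lis_pkts and lis_pkts_drop are disjoint counts (packets read versus packets dropped before being read), as their descriptions in the "Database" section suggest.

```python
def drop_fraction(pkts_read, pkts_dropped):
    """Fraction of packets on the wire that were not captured.

    Assumption: the two counts are disjoint, so their sum is the total
    number of packets that reached the interface.
    """
    total = pkts_read + pkts_dropped
    if total == 0:
        return 0.0  # nothing seen, nothing lost
    return pkts_dropped / total


if __name__ == "__main__":
    # Hypothetical counts for one volume.
    lis_pkts, lis_pkts_drop = 990, 10
    print(f"{drop_fraction(lis_pkts, lis_pkts_drop):.2%} of packets dropped")
```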

Filesystem Exhaustion

The listener subsystem will stop if less than 32 MB is found to be available in /mfs after a segment has been written there. This event is noted by a message in the listener log file /var/log/listen.log. It is usually caused by sustained heavy network traffic with which the packet dumper cannot keep up. The archiver subsystem continues to run, but the APV will no longer collect packets from the network.

In a similar vein, the listener subsystem will stop if less than 1 GB is found to be available in /scratch0. Again, this is noted in the log and the APV will stop collecting packets from the network. This may be caused by the archiver waiting for scratch tapes, or by sustained heavy traffic with which the archiver cannot keep up.

If either of these conditions is encountered, closer monitoring of the system resources may indicate where the problem lies. In either case, the listener subsystem may be restarted after the cause has been determined and the backlog has been alleviated.
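Free space in the two filesystems can also be watched from outside the vault. The following Python sketch is not part of the APV; it compares free space against the stop thresholds quoted above. It assumes the thresholds are binary megabytes and gigabytes, and the early-warning factor is a hypothetical margin chosen for illustration.

```python
import shutil

MB = 1024 * 1024  # assumption: the guide's MB and GB are binary units

# Stop thresholds quoted in the text: 32 MB in /mfs, 1 GB in /scratch0.
STOP_THRESHOLDS = {"/mfs": 32 * MB, "/scratch0": 1024 * MB}


def classify(free_bytes, stop_bytes, warn_factor=2.0):
    """Return "stop", "warn", or "ok" for a free-space reading.

    warn_factor is a hypothetical early-warning margin, not an APV setting.
    """
    if free_bytes < stop_bytes:
        return "stop"
    if free_bytes < warn_factor * stop_bytes:
        return "warn"
    return "ok"


def report(thresholds=STOP_THRESHOLDS):
    """Print one status line per filesystem found on this host."""
    for path, stop_bytes in thresholds.items():
        try:
            free = shutil.disk_usage(path).free
        except OSError:
            print(f"{path}: not available")
            continue
        print(f"{path}: {free // MB} MB free, {classify(free, stop_bytes)}")


if __name__ == "__main__":
    report()
```

Because the script only reads filesystem statistics, it should add negligible load; the mf and uf columns of listen.log remain the authoritative record of what the listener itself observed.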

Tape Exhaustion

If the archiver needs a scratch (blank) tape from the tape library in order to archive APV data, the archiver will request that new tapes be loaded via a message in /var/log/archiver.log and in the window where the archiver was started. Once fresh scratch tapes have been loaded into the library, hit enter in the archiver window to begin using them. While scratch tapes are not available, APV data will be buffered in /scratch0. If space in /scratch0 is exhausted before fresh tapes are added, the APV will stop collecting packets from the network as explained above.

Unexpected Failures

Unexpected failures in the listener, packet dumper, or archiver programs, as well as the scripts that drive them, cause an explanatory message to appear in the appropriate log file. The affected subsystem will stop. If the listener subsystem fails, the APV will stop collecting packets from the network. If the archiver subsystem fails, packets will be buffered in /scratch0. If space in /scratch0 is exhausted before the cause of the archiver subsystem failure has been identified and repaired, the APV will stop collecting packets from the network as explained above.

Host Platform Crashes

After the system has rebooted, log in and execute the following commands with root authority before restarting any APV components:

cd apv/pilot
./reconcile.pl

These commands will reconcile the state of the buffered volumes in /scratch0 with the archiver. When the archiver is next restarted, these buffered volumes will be written to tape.

Advanced Configuration

A number of options may be passed to the vault at the time it is run to change its behavior. However, the APV may fail to perform reliably under some circumstances if it is used in other than the default configuration, so this is recommended only for testing purposes.

Examination of the pilot.csh script will show that the main job of pilot.csh is to run the Perl script pilot.pl and redirect its output to a log file. Both pilot.csh and pilot.pl accept a number of command-line options; run pilot.pl --help to get a list.

In particular, command-line options can be used to choose any of three different file formats for the encrypted files that the vault produces: conversation format (the default), endpoint format, and the format used by the prototype vault; see [ACF01] for a description of the differences between these formats.

Note that it is also possible for the vault to use multiple tape drives simultaneously; to do this, start two instances of the archiver, specifying a different tape device for each one. For example, to write to two drives, execute

./archiver.csh --drivenum=0
./archiver.csh --drivenum=1

in two separate shells, both in the apv/pilot directory. Do not try to start two archivers with the same drive number; if this is attempted, the second invocation will fail with an appropriate diagnostic while the original archiver continues to execute without ill effects.

It is not possible to use two different changers; both drives are assumed to be contained in the same tape library.

Resource Requirements

Memory space

Our system uses 512 MB of memory, with 256 MB devoted to /mfs. The listener subsystem will stop (and packets will be dropped) if less than 32 MB is available in /mfs; 80 MB is probably the minimum safe size.

Disk space

/scratch0/volumes

At a minimum, the system should have enough space in /scratch0 to buffer volumes during the rewinding of a finished tape and reloading of a new scratch tape. As little as 5 GB of disk space is probably sufficient for this purpose; we use about 20 GB.
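As a rough check on these figures, the time a buffer can absorb incoming traffic is its size divided by the input rate. The Python sketch below assumes decimal units (1 GB = 10^9 bytes, 100 Mbps = 10^8 bits/sec) and a worst case in which traffic arrives at full line rate while nothing is drained to tape; it ignores any size change introduced by encryption.

```python
GB = 10**9               # assumption: decimal gigabytes
LINE_RATE = 100 * 10**6  # 100 Mbps, in bits per second


def buffer_seconds(buffer_bytes, rate_bits_per_sec):
    """Seconds until a buffer of the given size fills at the given rate."""
    if rate_bits_per_sec <= 0:
        raise ValueError("rate must be positive")
    return buffer_bytes / (rate_bits_per_sec / 8)


if __name__ == "__main__":
    for size_gb in (5, 20):
        seconds = buffer_seconds(size_gb * GB, LINE_RATE)
        print(f"{size_gb} GB absorbs about {seconds / 60:.1f} minutes at line rate")
```

Under these assumptions 5 GB covers 400 seconds, or somewhat under seven minutes, of saturated traffic, and 20 GB covers about 27 minutes.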

Database disk space

A database record is added for each volume. In the worst case, the vault should not add more than 250 KB to the database in a day. No mechanism is currently provided to manage the growth of the database, so if the vault is run for an extended period of time, the size must be monitored manually.

Log disk space

The four logs produced by the vault should in the worst case grow by under 100 MB per day. The sample apv_logrotate file rotates the logs daily, and deletes old logs after a week; hence the total disk space used by these logs should never exceed 700 MB.

Tape consumption

In the worst case, it may be possible for the vault to fill a tape in as little as two hours. The library has eleven tape slots, so even under constant 100 Mbps traffic a daily manual procedure will suffice to remove written tapes and install fresh tapes in the library. When the supply of fresh tapes runs out, the archiver will display a message in its console window and in its log. Volumes will be buffered in UFS until fresh tapes are added. When UFS space drops below 1 GB, the listener will stop, and packets will be dropped.
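
As a rough planning aid, the number of scratch tapes still in the library can be converted into hours of worst-case capture. The sketch below uses the two-hour worst-case fill time quoted above; the function name is hypothetical.

```python
def scratch_hours_left(scratch_tapes, hours_per_tape=2.0):
    """Worst-case hours of capture covered by the remaining scratch tapes."""
    return scratch_tapes * hours_per_tape

print(scratch_hours_left(11))  # 22.0: a full library under sustained worst-case load
print(scratch_hours_left(3))   # 6.0
```

A fully stocked library thus covers roughly 22 hours at the worst-case rate, so a daily tape change leaves little slack if traffic stays saturated.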

To assist with tape management, use the following commands.

Tape Status

Use the following command to show the status of the tapes in the library:

cd apv/pilot
./tapestatus.pl

In the output of this command, "WRITTEN" means the tape has been written to and can be removed; "SCRATCH" denotes a fresh tape available for writing; "N/A" means an empty or unavailable slot.

Removing Filled Tapes

Use the following command to remove a tape from the Qualstar 8211 tape library:

cd apv/pilot
./chio-mover move slot n portal 0

The tapestatus.pl output will identify the slot numbers, n, of tapes that have been written. Use apv/pilot/chio-mover to move each tape in turn to the I/O port; press and hold the "*" key on the library front panel and then press "MENU" to open the I/O port and eject the tape. Press the "*" and "MENU" key sequence again to close the port.

Ingesting Fresh Tapes

Use the following command to put a fresh tape into the Qualstar 8211 tape library:

cd apv/pilot
./chio-mover move portal 0 slot n

where n is the number of an empty slot identified as "N/A" by the output of tapestatus.pl. For each fresh tape in turn, press and hold the "*" key on the library front panel and then press "MENU" to open the I/O port. Insert the tape into the port, after which the port will close automatically; then use the chio-mover command to move the tape from the I/O port to an available slot.
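
When several tapes must be moved, it can help to generate the command lines in advance from the slot numbers reported by tapestatus.pl. The following sketch only formats the two chio-mover invocations shown in this guide; it does not run them, and the function names are illustrative. Each move must still be interleaved with the front-panel key sequence described above.

```python
def eject_command(slot):
    """Command that moves the tape in the given slot to the I/O port."""
    return f"./chio-mover move slot {int(slot)} portal 0"

def ingest_command(slot):
    """Command that moves a tape from the I/O port to the given empty slot."""
    return f"./chio-mover move portal 0 slot {int(slot)}"

# Example slot numbers, assumed to be reported as WRITTEN by tapestatus.pl.
for slot in [2, 5]:
    print(eject_command(slot))
```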

Bulk Moves

Rather than using the I/O port, you may open the front door of the Qualstar 8211 tape library and remove and insert the desired tapes en masse. When the door is closed the library changer will scan all slots. Concurrent tape drive operations should be unaffected. We recommend using the I/O port. NOTE: an alarm buzzer will sound when you reach into the library.

References

[ACF01] Charles J. Antonelli, Kevin Coffman, and J. Bruce Fields, "A 10 Mbps Advanced Packet Vault." Technical Report 01-10, Center for Information Technology Integration, University of Michigan, 2001. (Also available in the apv/doc/paper-2001 directory of this distribution.)

[Qua01] Qualstar Corporation, "TLS-6000 SCSI-2 Interface Manual," 501205 Revision A. http://www.qualstar.com, under "Technical Services."

bin	Various kernel modifications required for the APV.
bpf	Utility routines for reading BPF-format data.
crypto	Utility routines for encrypting and decrypting data.
crypto/rijndael	Optimized ANSI C implementation of AES. (Not used by the vault, by default.)
crypto/gladman	Brian Gladman's AES implementation in assembler for processors compatible with the Pentium family. (This is the default implementation used by the vault.)
decrypt	User programs to decrypt data from the vault.
des	DES implementation in i386 assembler. (Not used by the vault, by default.)
doc	APV documentation and sample system configuration files.
dump	User program to encrypt data.
listen	User program to read data from network.
pilot	Scripts for running the APV.
sys	APV Open/FreeBSD kernel modifications.
util	Utility functions and programs.

starttime	Epoch at which the first packet was read from the interface.
endtime	Epoch at which the last packet was read from the interface.
lis_pkts	Number of packets read from the interface.
lis_pkts_drop	Number of packets dropped before being read from the interface.
lis_bytes	Number of bytes read from the interface.
lis_bytes_snap	Currently unused, and always zero.
lis_bytes_drop	Currently unused, and always zero.
lis_bytes_per_sec	Bytes/sec processed by the listener.
dmp_pkts	Number of packets read by the dumper.
dmp_pkts_written	Number of packets written by the dumper.
dmp_bytes	Number of bytes read by the dumper.
dmp_bytes_written	Number of bytes written by the dumper.
dmp_bytes_per_sec	Bytes/sec processed by the dumper.
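
The fields above can be combined into a few derived figures when reviewing a record. The sketch below assumes that a record has already been loaded into a Python dictionary keyed by the field names listed here and that starttime and endtime are epochs in seconds; how the record is retrieved is outside its scope, and the average rate it computes is an illustration that need not match the stored lis_bytes_per_sec value.

```python
def summarize(record):
    """Derive capture duration, drop fraction, and average rate from one record."""
    duration = record["endtime"] - record["starttime"]
    # Packets offered to the listener: those read plus those dropped beforehand.
    offered = record["lis_pkts"] + record["lis_pkts_drop"]
    return {
        "duration_sec": duration,
        "drop_fraction": record["lis_pkts_drop"] / offered if offered else 0.0,
        "avg_bytes_per_sec": record["lis_bytes"] / duration if duration > 0 else 0.0,
    }

# Made-up sample values, for illustration only.
sample = {"starttime": 1000, "endtime": 1100,
          "lis_pkts": 900, "lis_pkts_drop": 100, "lis_bytes": 1_000_000}
print(summarize(sample))
```

A nonzero drop fraction indicates that the listener could not keep up with the interface during that interval.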