| Crop Rotation in the FARM
Although IRIS' mission is to protect the entire archive of continuous data,
event-related data holdings are the most valuable and frequently
accessed. To support this high event-related activity, the DMC has
routinely extracted the relevant portions of continuous holdings,
assembled them into event-focused volumes in SEED
format, and placed them onto DMC computing systems where they are
easy for the general community to access. We call this event-related
data collection the FARM, and it is one of the most actively used parts
of the data holdings at the IRIS DMC. Like many things, however, certain
aspects of the FARM have grown stale and difficult to manage. It is time
for a crop rotation of the FARM products.
As the following figure shows, of the 70,000 data shipments that we
anticipate making this year, only 20,000 will actually come out
of the large mass storage systems at the IRIS DMC. Roughly 3,800
of them will come from SPYDER® SEED volumes built in near-real
time, and nearly 46,000 will come straight from the FARM (via physical
transfers on tape, by FTP, over the Internet via WILBER, or through the
WEED program). Figures like these show us the value of the FARM products.
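
As a quick check on the proportions, the shipment counts quoted above can be turned into shares of the anticipated total. The short Python sketch below uses only the numbers from the text; the variable and function names are illustrative.

```python
# Anticipated shipment counts for the year, as quoted in the text.
TOTAL = 70_000
sources = {
    "mass storage": 20_000,
    "SPYDER SEED volumes": 3_800,
    "FARM": 46_000,
}


def shares(counts, total):
    """Return each source's share of all shipments, in percent."""
    return {name: 100.0 * n / total for name, n in counts.items()}


pct = shares(sources, TOTAL)
for name, value in pct.items():
    print(f"{name}: {value:.1f}%")

# Shipments not covered by the three quoted counts.
unaccounted = TOTAL - sum(sources.values())
print(f"unaccounted: {unaccounted}")
```

By these figures the FARM supplies about 66% of all shipments, mass storage about 29%, and SPYDER® volumes about 5%. The three counts fall 200 short of the total, presumably because the quoted numbers are rounded.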

The figure below shows both the total size of the FARM and the number
of event volumes. The current FARM consists of only the GDSN and
GSN network data from the years 1977 through May 2000. There are
a total of 5,204 full SEED volumes with a total size of 104 gigabytes.
Since these are full SEED volumes, it is difficult to ensure that
the header information in the SEED volume is current, and in many
instances we know the headers are out of date.

(data for 2000 only up to May)

Some of the Main Features of the New FARM
We are currently revamping the way in which the FARM is created. Some
of the main changes are as follows:
- The new FARM will have data from all available networks. The
current FARM includes data only from the GSN Network, and yet
the DMC has data from nearly 100 networks (when one includes PASSCAL
deployments). Mini-SEED volumes for each network will be stored
separately in a Pool of Network Data (POND).
- The waveform data will be stored as mini-SEED (data only). The
products will not have headers attached to them until the products
are requested. In this manner, products will always have the most
current metadata available and will never grow stale.
- There will be three types of FARM products: SPYDER®-FARM volumes
(or simply SPYDER®), FARM volumes, and UV-FARM volumes.
- The SPYDER®-FARM will be built from real-time, near-real-time,
and quality-controlled data sources as they arrive at the DMC.
The hypocenters for these data will come from the same NEIC
source that currently triggers SPYDER®. Data will continue
to be added to a SPYDER®-FARM product until a FARM product
for the event exists.
- The FARM products will be built from data that pass through
the existing quality control system of the DMS. These are the
primary products in the FARM. The NEIC Weekly PDE will be the
catalog used to build the FARM (using the Harvard Moment Magnitudes).
- The UV-FARM will contain all the ultra-long-period and
very-long-period data channels for a given network. Each UV-FARM product
will contain data for a two-week period.
- All FARM products will be dynamically updated as new data arrive.
(Newly arrived data will flow into the appropriate FARM POND.)
Therefore, FARM products will always contain all available data.
- The data in the SPYDER®-FARM and the FARM products will be coordinated.
As quality-controlled FARM products are built, the corresponding
data will be removed from the SPYDER®-FARM volumes. If everything
works as planned, the SPYDER®-FARM volume should eventually
disappear. However, if some data never reach us via the quality-controlled
path, they will remain in the SPYDER®-FARM volume indefinitely.
- WILBER and WEED will be modified to take full advantage of the new organization
of the FARM. These access tools will have to be updated to
accommodate the addition of multiple network volumes (or PONDs).
- The basic algorithm for the FARM will change only slightly. We
will attempt to go down to smaller events (Mw >= 5.0), and we
will lengthen the pre-event window for the long-period channels.
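
The interplay between the POND, the late attachment of headers, and the retirement of SPYDER®-FARM data can be sketched roughly as follows. This is only an illustration in Python, not the DMC's implementation: the `FarmSketch` class, the `Segment` record, the dictionary layouts, and the network and station codes are all hypothetical, and matching a quality-controlled segment to its SPYDER® copy by network, station, channel, and event is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Segment:
    """One data-only waveform segment (hypothetical identifying keys)."""
    network: str
    station: str
    channel: str
    event_id: str


@dataclass
class FarmSketch:
    # POND: per-network store of quality-controlled, data-only segments.
    pond: dict = field(default_factory=dict)
    # SPYDER-FARM: per-event sets of near-real-time segments.
    spyder: dict = field(default_factory=dict)
    # Current header information, keyed by (network, station).
    headers: dict = field(default_factory=dict)

    def add_spyder(self, seg: Segment) -> None:
        """Add a near-real-time segment to the event's SPYDER-FARM volume."""
        self.spyder.setdefault(seg.event_id, set()).add(seg)

    def add_quality_controlled(self, seg: Segment) -> None:
        """Store QC'd data in its network's POND and retire the SPYDER copy."""
        self.pond.setdefault(seg.network, set()).add(seg)
        volume = self.spyder.get(seg.event_id)
        if volume is not None:
            volume.discard(seg)
            if not volume:  # nothing left, so the SPYDER-FARM volume disappears
                del self.spyder[seg.event_id]

    def request(self, event_id: str) -> list:
        """Assemble a product, attaching headers only at request time."""
        product = []
        for network in sorted(self.pond):
            ordered = sorted(self.pond[network],
                             key=lambda s: (s.station, s.channel))
            for seg in ordered:
                if seg.event_id == event_id:
                    header = self.headers.get((seg.network, seg.station))
                    product.append((seg, header))
        return product


farm = FarmSketch()
a = Segment("NET1", "STA1", "BHZ", "ev1")
b = Segment("NET2", "STA2", "BHZ", "ev1")
farm.add_spyder(a)
farm.add_spyder(b)
farm.add_quality_controlled(a)  # only `a` arrives via the QC path

farm.headers[("NET1", "STA1")] = "response v1"
first = farm.request("ev1")
farm.headers[("NET1", "STA1")] = "response v2"  # metadata corrected later
second = farm.request("ev1")
```

In this sketch the stored segment never changes, yet `first` carries the "response v1" header and `second` carries "response v2", because headers are looked up when the product is requested. Segment `b`, which never arrives via the quality-controlled path, stays in the SPYDER®-FARM volume for `ev1`.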
Please be patient: several million files must be produced
for this project, so it will take a few months to get the new
FARM system in place. In the intervening period we will continue
to support the present FARM building process.
Submitted by Tim Ahern
For more information or comments contact
|