Quack Tutorial


Overview

Quack is a software package under development at the IRIS DMC for analyzing the quality of realtime seismic data flowing into the IRIS DMC's BUD (Buffer of Uniform Data). The Quack program has a simple, consistent modular design that is tailored to allow developers to easily create custom modules for monitoring seismic data quality. Although Quack is initially being tailored to run at the IRIS DMC, it should be possible to port it to other data centers.

Quack in an Eggshell

The following diagram shows the basic components of Quack's architecture. The Quack program uses a simple API (Application Program Interface) to instruct modules to measure the quality of seismic waveform data files. The modules then report their calculations back to Quack through the same API, after which Quack stores the results in database tables.

Overview of Quack's Architecture

A chief design goal of Quack is the separation of tasks between itself and the modules that it invokes.

Quack is responsible for:

  • Selecting waveform data files to measure
  • Selecting sample time intervals
  • Retrieving seismic meta-data
  • Invoking custom modules at regularly scheduled time intervals
  • Storing information generated by the modules where it can later be retrieved for display and analysis

Modules are responsible for:

  • Reading the waveform data files
  • Parsing the waveform data files (assumed to be in the miniSEED format)
  • Calculating quality parameters
  • Returning the calculated quality parameters to Quack, where they are stored

Quack is configured via a standard web browser. Configuration involves associating seismic channels with modules. For example, the administrator might choose to have a module measure all of the channels coming from the IU network once a day.

More Details and Jargon

The Quack program and the modules that it invokes are written in Java.

The modules are, in the Quack vernacular, called Quality Control Applications or QCAs.

A QCA is a Java jar file containing at least one class that extends a special base class defined in the API. Classes that extend this base class are called controlets (after Java applets and servlets). Quack invokes controlets when it needs to have waveform files analyzed. A QCA jar file may contain more than one controlet. A QCA jar file must also contain a special meta-data XML file named qcapp.xml. The meta-data file lets Quack know what controlets are available within the jar file. Complete documentation of the API and sample code can be found at http://www.iris.washington.edu/QUACK/apidocs.

Although the modules must be written in Java, module writers are free to invoke programs written in other languages such as C/C++, Perl and Fortran. Modules can also call code written in C/C++ directly through the Java Native Interface (JNI).
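As a rough sketch of the first option, a module could launch an external program with the standard java.lang.ProcessBuilder class and capture its output. The ExternalTool class and its method names are hypothetical and are not part of the Quack API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the Quack API.
class ExternalTool {

    // Build the command line: the program name followed by its arguments.
    static List<String> command(String program, String... args) {
        List<String> cmd = new ArrayList<String>();
        cmd.add(program);
        for (String a : args) {
            cmd.add(a);
        }
        return cmd;
    }

    // Run the command and return everything it wrote to stdout and stderr.
    static String run(List<String> cmd) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true); // merge stderr into stdout
        Process p = pb.start();
        StringBuilder out = new StringBuilder();
        BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
        try {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        } finally {
            r.close();
        }
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("external program exited with status " + exit);
        }
        return out.toString();
    }
}
```

A controlet could call ExternalTool.run(ExternalTool.command(program, path)) from inside measureQuality() and parse the returned text; how the external program reports its results is left to the module writer.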


Quack in Operation

Controlet Discovery

Before Quack can use the controlets in a QCA jar file, it must first "discover" them and register them in a database table. It does this by reading the qcapp.xml file contained in the jar file. The steps to do this are:

  1. Via a web browser interface, the administrator instructs Quack to load a QCA jar file.
  2. Quack reads the qcapp.xml in the jar file and determines what controlets are available within it.
  3. Quack registers the path to the jar file along with the names of the controlets contained within it in a database table.
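
Step 2 can be pictured with the standard java.util.jar classes. The sketch below only locates qcapp.xml and returns its raw bytes. It assumes the file sits at the root of the jar, which this tutorial does not state, and it does not attempt to parse the XML because the schema is not described here. QcappReader is a hypothetical name, not Quack's own code.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not Quack's actual discovery code.
class QcappReader {

    // Returns the raw bytes of qcapp.xml, or null when the jar has no such entry.
    // Assumption: the descriptor is stored at the root of the jar.
    static byte[] readDescriptor(File jarPath) throws IOException {
        JarFile jar = new JarFile(jarPath);
        try {
            JarEntry entry = jar.getJarEntry("qcapp.xml");
            if (entry == null) {
                return null;
            }
            InputStream in = jar.getInputStream(entry);
            try {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    out.write(chunk, 0, n);
                }
                return out.toByteArray();
            } finally {
                in.close();
            }
        } finally {
            jar.close();
        }
    }
}
```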

Controlet Configuration

To configure a controlet to monitor waveform data the administrator uses a web browser interface to Quack to perform the following steps:

  1. Selects a controlet from the list of known controlets.
  2. Selects a sample time interval for sampling waveform data,
       e.g. once an hour at 10 minutes past the hour.
  3. Selects a sample delay and sample window length.
       The delay specifies how far into the past data will be sampled.
       The window specifies how many seconds of data will be
       sampled at each sampling interval.
  4. Selects a set of seismic channels to sample.
       A channel is uniquely specified by a network/station/location/channel
       quadruplet.

The configuration information is stored in a set of database tables. After the configuration information is stored, Quack sets up a timer mechanism that will fire at the configured interval and activate the selected controlet.
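
To make the timing concrete, the sketch below computes when a schedule such as "once an hour at 10 minutes past the hour" would next fire. It is only an illustration of the arithmetic. SampleSchedule is a hypothetical class, times are milliseconds since the Unix epoch, and nothing here describes how Quack's own timer is implemented.

```java
// Hypothetical helper for illustration; not part of Quack.
class SampleSchedule {

    // Returns the first firing time strictly after nowMillis.
    // periodMillis: length of the sample interval (e.g. one hour).
    // offsetMillis: position of the firing time within each period
    //               (e.g. ten minutes past the hour).
    static long nextFireTime(long nowMillis, long periodMillis, long offsetMillis) {
        // Time elapsed since the most recent firing time at or before now.
        long sinceLastFire = Math.floorMod(nowMillis - offsetMillis, periodMillis);
        return nowMillis - sinceLastFire + periodMillis;
    }
}
```

With a period of 3600000 ms and an offset of 600000 ms, a call at time 0 yields 600000 (ten minutes past the first hour). The result could be handed to java.util.Timer.scheduleAtFixedRate(task, new Date(first), period) to start a repeating task.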

Controlet Activation

Steps to controlet activation:

  1. At the configured sample interval, Quack determines which waveform files need to be sampled and packages this information in a request object. Also packaged in the request object are the starting and ending times of the period to be analyzed. The starting time is the current time minus the delay specified in the configuration. The ending time is the starting time plus the window specified in the configuration.

    Quack also creates a response object where the controlet can store its quality measurements.
      
  2. Quack calls the controlet's measureQuality() method, passing it the request and response objects.
      
  3. The controlet inspects the request object, reads the waveform files that it points to, and calculates waveform quality parameters. If the controlet needs meta-information such as station location or instrument response, it can call methods made available by the request object to retrieve this information.
      
  4. The controlet saves its calculated quality parameters by calling the response object's setValues() method one or more times. There are two versions of the setValues() method. One version allows measurements to be associated with individual seismic channels. The other version allows measurements to be associated with multiple seismic channels.
      
  5. Calling setValues() causes Quack to record the calculated quality parameters in database tables.
      
    The stored information can, at a later date, be retrieved from the database and displayed. The mechanisms for reporting the saved data have not yet been designed (as of 9/02).
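
The window arithmetic from step 1 can be sketched as follows. SampleWindow is a hypothetical class, and the assumption that the delay is given in seconds (like the window) is ours; this tutorial only gives the unit of the window.

```java
import java.util.Date;

// Hypothetical helper for illustration; not part of the Quack API.
class SampleWindow {
    final Date start;
    final Date end;

    // now:           the time at which the timer fired
    // delaySeconds:  how far into the past the window begins (assumed unit: seconds)
    // windowSeconds: how many seconds of data are sampled
    SampleWindow(Date now, long delaySeconds, long windowSeconds) {
        long startMillis = now.getTime() - delaySeconds * 1000L;
        this.start = new Date(startMillis);
        this.end = new Date(startMillis + windowSeconds * 1000L);
    }
}
```

For example, a one-hour delay with a ten-minute window means that a controlet fired at 12:10 is asked to analyze data from 11:10 to 11:20.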
 

Sample Controlet

Download

A complete version of the code sample presented here can be downloaded from:

http://www.iris.washington.edu/QUACK/quack_example/

The source code can also be viewed in the online javadocs which are available at:

http://www.iris.washington.edu/QUACK/apidocs

RMS Controlet

Here we present a simple controlet that measures Average and RMS signal values and records them back into Quack. To see the complete listing, download the tar file and examine edu/whatsamatau/geophys/RmsControlet.java. We will not go over all of the code, just the basic structure of the sample controlet.

RmsControlet.java

package edu.whatsamatau.geophys;

To make use of the Quack API we import the edu.iris.dmc.qc.controlet package. The class files for this package can be found in qc_api.jar, which is included in the quack_example.tar file.

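import edu.iris.dmc.qc.controlet.*;

// standard library classes used in the excerpts below
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.util.Date;
import java.util.Hashtable;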
 

To make the class a controlet, we extend the QcControlet base class.

public class RmsControlet extends QcControlet {

The measureQuality() method must be implemented.
This is the main entry point from Quack.

    public void measureQuality(QcRequest request, QcResponse response) throws QcException {
 
        QcWaveform waveform[] = request.getWaveforms();

Each QcWaveform object in the array encapsulates information about:

  • The location of waveform files for one channel for the request object's time period.
  • Channel meta information for the request object's time period.
  • Channel response information for the request object's time period.

The time period for the request object is found by calling the getStartTime() and getEndTime() methods.

In this example we loop over each element in the array and ask our private method to estimate Average and RMS for the given time period. The results are returned in an array of floats. Notice that our private method takes as two of its arguments the starting and ending times of the request's time period.

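        for( int i = 0; i < waveform.length; i++ ) {
 
            // computeAvgRMS() can throw IOException; this try block is
            // closed by the catch( IOException ) clause further down
            try {

            float[] est = computeAvgRMS(waveform[i], request.getStartTime(), request.getEndTime());

            // computeAvgRMS() returns null when there is no data to process
            if( est == null ) {
                continue;
            }
 
            //
            // Results are stored in this format:
            //
            // est[0] = number of samples
            // est[1] = average
            // est[2] = rms
            //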

Now that we have our results, we need to pass them back to Quack so that they can be stored in database tables. We do this by calling the setValues() method of the response object. There are two versions of this method. The first version associates measurements with individual channels while the second associates measurements with multiple channels. Here we use the former. Each measurement may contain multiple values, which may be made up of Booleans, Floats, Integers and Strings.

The setValues() method takes four Hashtables, one for each of the four storage types. In this example we will only return Floats.

The hashtable keys must be integers. In Quack, the keys are used as fields in database tables which can subsequently be queried to retrieve stored measurement values.

Float Hashtable

  Key   Meaning
  0     Number of samples
  1     Signal Average
  2     Signal RMS

Controlet writers are free to use whatever key/meaning associations that they wish. However, they must document the associations so that the data can be retrieved from Quack's database in a meaningful way.
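
One way to keep these associations documented in the code itself is to give each key a named constant. The sketch below is only a suggestion; RmsKeys and its members are hypothetical names and are not part of the Quack API or of the downloadable sample.

```java
import java.util.Hashtable;

// Hypothetical helper for illustration; key numbers match the table above.
class RmsKeys {
    static final int NUM_SAMPLES = 0; // number of samples
    static final int AVERAGE     = 1; // signal average
    static final int RMS         = 2; // signal RMS

    // Pack an est[] array (samples, average, rms) into a float Hashtable.
    static Hashtable<Integer, Float> pack(float[] est) {
        Hashtable<Integer, Float> values = new Hashtable<Integer, Float>();
        values.put(Integer.valueOf(NUM_SAMPLES), Float.valueOf(est[0]));
        values.put(Integer.valueOf(AVERAGE),     Float.valueOf(est[1]));
        values.put(Integer.valueOf(RMS),         Float.valueOf(est[2]));
        return values;
    }
}
```

Code that later reads the measurements back can then refer to RmsKeys.RMS instead of a bare 2.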

            // Only floatValues will be populated....
            Hashtable intValues     = null;
            Hashtable floatValues   = new Hashtable();
            Hashtable stringValues  = null;
            Hashtable booleanValues = null;
 
            floatValues.put( new Integer(0), new Float(est[0]) ); // the number of samples
            floatValues.put( new Integer(1), new Float(est[1]) ); // the avg
            floatValues.put( new Integer(2), new Float(est[2]) ); // the rms

The last matter to be taken care of is the start and stop times of the measurements.

In this example we just pull these values straight out of the request object. The controlet writer is free to use other values. For example, a controlet could be written that looks for gaps in data. The start/stop times could then be set to indicate the starting and stopping times of data gaps. If the controlet writer uses values other than those from the request object, it is important that they document their meaning.

            // pull the start and stop times out of the request
            Date startTime = request.getStartTime();
            Date stopTime  = request.getEndTime();
 
            // 
            // Have the response object store the results. This is where the
            // results are sent to the data base
            //
            try {
                response.setValues( waveform[i], 
                                    intValues,     // null
                                    floatValues,   // not empty
                                    stringValues,  // null
                                    booleanValues, // null
                                    startTime, 
                                    stopTime );
                } catch( IllegalArgumentException e ) {
                    log("unexpected exception:" +  e );
                    throw new QcException( "Unexpected IllegalArgumentException: " + e );
                }
            } catch( IOException e ) {
                log("unexpected exception:" +  e );
                throw new QcException( "Unexpected IOException: " + e );
            }
        }
    }
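
The gap-monitoring idea mentioned earlier could start from something like the sketch below, which is not part of the sample download. It assumes the controlet has already collected the start and end time of each miniSEED record in time order. GapFinder, its parameters and the tolerance value are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the Quack API.
class GapFinder {

    // starts[i] and ends[i] are the start and end of record i in milliseconds,
    // sorted by start time. A gap is reported when the next record begins more
    // than toleranceMs after the previous one ended.
    // Each result is a two-element array: { gapStart, gapEnd }.
    static List<long[]> findGaps(long[] starts, long[] ends, long toleranceMs) {
        List<long[]> gaps = new ArrayList<long[]>();
        for (int i = 1; i < starts.length; i++) {
            long previousEnd = ends[i - 1];
            if (starts[i] - previousEnd > toleranceMs) {
                gaps.add(new long[] { previousEnd, starts[i] });
            }
        }
        return gaps;
    }
}
```

Each { gapStart, gapEnd } pair could then be passed to setValues() as the start and stop times of one measurement.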

The following private method calculates statistical parameters from the miniSEED data.

    //------------------------------------------------------------
    // compute the Avg and RMS (via the variance) for the time range
    // Note: for simplicity all times are rounded to milliseconds.
    //       MiniSEED time resolution is 0.1 millisecond.
    //------------------------------------------------------------
    private float[] computeAvgRMS( 
                                    QcWaveform waveform, 
                                    Date startTime, 
                                    Date stopTime 
                                 ) 
        throws IOException, QcException {
 
        //-----------------------------------------------------
        //
        //            sum{ (xj - avg)^2 }
        // Variance = -------------------
        //                  (n - 1)
        //
        //            sum{ xj^2 } - 2*avg*sum{ xj } + n*avg^2
        //          = ----------------------------------------
        //                            n - 1
        //
        //   Let sumA = sum{ xj^2 }
        //   Let sumB = sum{ xj } --> avg = sumB/n
        //
        //            sumA - 2*sumB^2/n + n*sumB^2/(n*n)
        //          = ----------------------------------
        //                            n - 1
        //
        //            sumA - sumB^2/n
        //          = ---------------
        //                  n - 1
        //
        //   rms = sqrt( variance )
        //
        //------------------------------------------------------
        double rms = 0.0;
        double avg = 0.0;
        int n = 0;
        double sumA = 0.0;
        double sumB = 0.0;

This is where we find out the location of the actual miniSEED data files.

The url array will contain, in time sequence, the URLs of all of the miniSEED data that need to be analyzed for one channel for the time interval given by the request object's getStartTime() and getEndTime() methods.

Note: In some circumstances, the URLs may point to nonexistent files. A well-written controlet should be able to handle this situation.

        URL url[] = waveform.getMiniSeedURLs();
 
        //
        // if the url array is empty, return null
        //
        if( url.length == 0 ) {
            log("no data to process, getMiniSeedURLs() returned empty array");
            return null;
        }
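
A minimal sketch of one way to tolerate nonexistent files is shown below. It is not what the sample download does. It wraps the open call so that a missing file yields null instead of an exception, letting the loop over URLs skip it. SafeOpen is a hypothetical helper name.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

// Hypothetical helper for illustration; not part of the Quack API.
class SafeOpen {

    // Returns an open stream for the URL, or null when it cannot be opened
    // (for example because the file it points to does not exist).
    static InputStream openOrNull(URL url) {
        try {
            return url.openStream();
        } catch (IOException e) {
            return null;
        }
    }
}
```

Inside the loop a controlet could then write: if the result is null, log the URL and continue with the next one.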

DataBuffer is a custom class that comes with this controlet and is used for parsing the raw miniSEED data.

DataBuffer is not part of Quack's controlet API.

        DataBuffer db = new DataBuffer(); // miniseed class
 
        //
        // Get the sample window in milliseconds since GMT 0:0:0 jan 1, 1970
        //
        long s1 = startTime.getTime(); // start of the sample interval
        long s2  = stopTime.getTime(); // end   of the sample interval
 
        //
        // Loop on all of the urls.
        //
        for( int i = 0; i < url.length; i++ ) {
 
            URLConnection uc;
            InputStream   in;
            byte          buffer[];
 
            //---------------------------------------------
            // get the record length and close the stream
            //---------------------------------------------
            uc = url[i].openConnection();
            uc.connect();
            in = uc.getInputStream();
            buffer = new byte[512];
 
            int readLength = in.read(buffer);
            if( readLength != buffer.length ) {
                log( "unable to read input stream, readLength=" + readLength);
                in.close();
                continue;
            }
            db.loadBuffer(buffer);
            int buffsize = db.getDataRecLen();
            in.close();
 
            //--------------------------------------------
            // Open the stream again and process
            //--------------------------------------------
            buffer = new byte[buffsize];
            uc = url[i].openConnection();
            uc.connect();
            in = uc.getInputStream();
 
            //--------------------------------------------------
            // MiniSEED data is made up of records. Each record
            // contains a variable number of sequential samples
            // loop over records.
            //--------------------------------------------------
            while( in.read(buffer) == buffer.length ) {
 
                db.loadBuffer(buffer);
 
                if( db.getDataRecLen() != buffer.length ) {
                    log("bad record length!");
                    throw new QcException("bad record length != " + buffer.length );
                }
 
                //
                // determine if the interval is within startTime and stopTime
                //
 
                //----------------------------------------------
                //  r1 - starttime of the record interval
                //       milliseconds since gmt 0:0:0 01/01/70
                //----------------------------------------------
                long r1 = db.getJavaDate().getTime(); 
 
                //---------------------------------------------
                // r2 - stoptime of the record interval
                //      milliseconds since gmt 0:0:0 01/01/70
                //
                //      Note the divide by 10. 
                //      This is to convert 0.1 msecs to msecs
                //---------------------------------------------
                double dRecLength = db.getRecInterval()/10.0;      
                long r2 = r1 + (long)dRecLength;      
 
                //-------------------------------------------------------
                // Determine if the record interval (r1,r2) is touched by
                // the sample interval (s1,s2)
                //-------------------------------------------------------
                boolean recordInSampleInterval = 
 
                    // start of the sample interval is in the record interval
                     ( r1 <= s1 && s1 <= r2 ) ||
 
                    // end of the sample interval is in the record interval
                     ( r1 <= s2 && s2 <= r2 ) ||
 
                    // sample interval surrounds the record interval
                     ( s1 <= r1 && r2 <= s2 ) ;
 
                if( recordInSampleInterval ) {
 
                    //
                    // get the DataBuffer code to decode the miniseed record
                    //
                    int x[] = new int[db.getNumSamples()];
                    db.setDecoder(db.getBlk_1000().getDecodeString());
                    db.decodeBuffer();
                    db.packTrace(x);
 
                    float sampRate = db.getSampleRate();
 
                    //
                    // loop over the array of data. Only analyze data that is
                    // within the sample interval (s1,s2)
                    //
                    for( int j = 0; j < x.length; j++ ) {
 
                        //
                        // rj - sample time
                        //
                        long rj = r1 + (long)(1000.0f * ((float)j)/sampRate);
 
                        if( s1 <= rj && rj <= s2 ) {
                            // widen to double before squaring: int*int can overflow
                            sumA += (double)x[j]*(double)x[j];
                            sumB += x[j];
                            n++;
                        }
 
                    } // end loop sample array
 
                } // end if in interval
 
            } // end while read
 
            in.close();
 
        } //end for url's
 
        avg = sumB/(double)n;
 
        //-----------------------------
        //            sumA - sumB^2/n
        // Variance = ---------------
        //                n - 1
        //-----------------------------
        double variance = (sumA - sumB*sumB/(double)n)/((double)n - 1.0);
 
        rms = Math.sqrt( variance );
        float ret[] = new float[3];
        ret[0] = (float)n;
        ret[1] = (float)avg;
        ret[2] = (float)rms;
        return ret;
    }
}
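
The running-sum formula used in computeAvgRMS() can also be isolated in a small class, which makes the edge cases easier to see. The sketch below is not part of the sample download, and RunningStats is a hypothetical name. It widens each sample to double before squaring, since the product of two large int samples can exceed the int range, and it returns NaN when there are too few samples for the n - 1 divisor. Note that what the listing calls rms is the square root of the sample variance, that is, the standard deviation about the mean.

```java
// Hypothetical helper for illustration; not part of the Quack API.
class RunningStats {
    private long n;       // number of samples seen
    private double sumA;  // sum of x^2
    private double sumB;  // sum of x

    void add(int x) {
        double d = x;     // widen first so that d*d cannot overflow
        sumA += d * d;
        sumB += d;
        n++;
    }

    long count() {
        return n;
    }

    // Mean of the samples, or NaN when no samples were added.
    double average() {
        return n > 0 ? sumB / n : Double.NaN;
    }

    // (sumA - sumB^2/n) / (n - 1), or NaN with fewer than two samples.
    double sampleVariance() {
        if (n < 2) {
            return Double.NaN;
        }
        double v = (sumA - sumB * sumB / n) / (n - 1);
        return Math.max(v, 0.0); // rounding can leave a tiny negative value
    }

    // Square root of the sample variance (the value the listing stores as rms).
    double stdDev() {
        return Math.sqrt(sampleVariance());
    }
}
```

For the samples 1, 2, 3, 4 the sums are sumA = 30 and sumB = 10, giving an average of 2.5 and a variance of (30 - 25)/3.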