2dFGRS Final Release Database

The 2dFGRS database consists of three main components:

  1. A set of FITS files, one per object in the source catalogue (the input photometric catalogue), that contains every piece of information about each object from the source catalogue, spectroscopic observations and subsequent analysis.
  2. An Mini SQL (mSQL) database, that contains all of the parameters for each object and allows complex searching and subsetting of the survey objects, and the retrieval of selected subsets of spectra and images.
  3. A WWW interface, that provides a number of different modes for querying the mSQL database and a variety of ways for returning the results of such queries.
This overview gives a basic introduction to each of these three database components. Further information can be found in Colless et al. (2001), which describes the 100k data release. Note, however, that the final release database includes changes and additions to the 100k database, as described below.

1. Object FITS Files

There are 382323 target objects in the final FITS database, each with its own FITS file. Each object in the survey source catalogue has been given a serial number (SEQNUM), and the name of the FITS file for that object is this serial number. The serial numbers for objects in the SGP strip are 1-193550, for objects in the NGP strip 193551-332694, and for objects in the random fields 332695-389713.

Note that: (i) The 'supplementary' objects with serial numbers >389713 that were included in the 100k release have now all been deleted; these were additional objects fainter than the nominal survey limit, and were never observed. (ii) The objects in random field 859 were improperly named, so, as this field was never observed, the 1166 sources in the field have been deleted. (iii) Visual inspection of the source images during the initial construction of the source catalogue identified 4604 sources as artifacts; these have now been deleted. (iv) A final visual check of problematic objects selected by comparing the APM and SuperCosmos source lists (see below) identified a further 1620 sources as artifacts; these have also been deleted. These deletions mean that the source catalogue in the final release is significantly cleaner than that in the 100k release.

Each FITS file has a primary part which contains all the source catalogue data about the object (as FITS keywords) and a Digital Sky Survey sky chart (postage stamp image) of the objects. The WWW version of the database also contains SuperCosmos bJ and rF postage stamp images (these images are omitted from the DVD version to conserve space, although the SuperCosmos parameters are present in both versions). Subsequent extensions of the FITS file contain the spectral observations and parameters obtained for the object, appended in chronological order. Each spectrum extension contains the spectrum of the object, the variance (error) array for the object spectrum, the spectrum of the mean sky that was subtracted from the object spectrum, and FITS keyword data giving information about the spectroscopic observation and derived parameters such as the redshift and spectral quality. Many targets contain multiple spectrum extensions corresponding to multiple observations.

Table 1 lists all the FITS keywords present in the primary part (extension 0) of the FITS files; Table 2 lists all the keywords present in spectrum extensions 1..spectra. The tables give the names of the keywords, example values, and the keyword descriptions.

The first set of photometric parameters are from APM scans of the Southern Sky Survey bJ plates, and are the primary photometric data on which the 2dFGRS was based. Photometric parameters with FITS keywords prefixed SB and SR are from SuperCosmos scans of, respectively, the bJ and rF Southern Sky Survey plates.

In order to remove artifacts from the source catalogue and to flag other problematic cases, 10812 sources (about 3% of the total) were selected for visual examination of their images based on various criteria: 1656 objects with no SuperCosmos source matching the APM position, 4377 objects with a poor match between the APM and SuperCosmos positions and/or magnitudes, another 1321 objects with anomalous colours, and finally all 3458 bright objects with SuperCosmos magnitude bJ<16. These objects' images were individually examined and the VISCHECK classification parameter set as follows: 0 = not examined (373131 sources), 1 = galaxy (5283 sources), 2 = star (114 sources), 3 = merger (3102 sources), 4 = in bright star halo (377 sources), 5 = artifact (1620 sources), 6 = other (316 sources). In addition, the VISCOMM parameter gives comments in some cases (and is mandatory for 'other' classifications). The 1620 objects classified as artifacts were deleted from the database; all 9192 other sources are retained.


Table 1: FITS keywords for the primary image (extension 0)
    
Keyword  = Value                / Description
	 
SIMPLE   x                    T / file does conform to FITS standard             
BITPIX   x                   16 / number of bits per data pixel                  
NAXIS    x                    2 / number of data axes                            
NAXIS1   x                   49 / length of data axis 1                          
NAXIS2   x                   49 / length of data axis 2                          
EXTEND   x                    T / FITS dataset may contain extensions            
BSCALE   x               1.0000 / REAL = (FITS * BSCALE) + BZERO                 
BZERO    x               0.0000 / Bias                                           
SEQNUM   =               100100 / Sequence number : Database Primary Key         
NAME     = 'TGS469Z164'         / 2dFgg assigned name                            
IMAGE    x 'SKYCHART'           / Existance of postage stamp image               
RA       =         0.7943429758 / RA  in radians :   3  2  3.00                  
DEC      =        -0.5475286941 / Dec in radians : -31 22 15.9                   
APMEQNX  =              1950.00 / Equinox of RA and Dec                          
BJSEL    =               18.858 / final bj mag used in the object selection      
PROB     =               2335.4 / psi classification parameter                   
PARK     =                0.910 / k   classification parameter                   
PARMU    =                0.187 / mu  classification parameter                   
IGAL     =                    1 / final class flag (equals 1 for a galaxy)       
JON      =                   -1 / eyeball class flag                             
ORIENT   =                 91.0 / orientation in degrees clockwise from E-W      
ECCENT   =                0.270 / eccentricity                                   
AREA     =                308.0 / isophotal area in pixels                       
X_BJ     x               2918.7 / plate x_bj in 8 micron pixels                  
Y_BJ     x               9123.1 / plate y_bj in 8 micron pixels                  
DX       x                 43.0 / corrected difference (x_bj - x_R)*100          
DY       x                 49.0 / corrected difference (y_bj - y_R)*100          
BJG      =               18.920 / bj without extinction correction               
RMAG     x                10.35 / unmatched apm "total" mag                      
PMAG     x                10.53 / unmatched raw apm profile integrated mag       
FMAG     x                 8.72 / unmatched raw apm 2" profile integrated mag    
SMAG     x                10.74 / unmatched raw stellar mag (from APMCAL)        
REDMAG   x                      / unmatched raw red stellar mag (from APMCAL)    
IFIELD   =                  417 / ukst field                                     
IGFIELD  x                 2007 / galaxy number inthis field                     
REGION   = 'S417    '           / GSSS Region name                               
OBJEQNX  =              2000.00 / Equinox of the plate reference frame           
OBJRA    =         0.8034094522 / RA  in Radians :  03 04 07.673                 
OBJDEC   =        -0.5441390223 / Dec in Radians : -31 10 36.73                  
PLTSCALE x              67.2000 / Plate Scale in arcsec per mm                   
XPIXELSZ x           25.2844500 / X pixel size in microns                        
YPIXELSZ x           25.2844500 / Y pixel size in microns                        
OBJPLTX  x              7970.86 / Object X on plate (pixels)                     
OBJPLTY  x              4148.11 / Object Y on plate (pixels)                     
DATAMAX  x                14431 / Maximum data value                             
DATAMIN  x                 4011 / Minimum data value                             
BJSELOLD =                18.96 / original bj mag used in the object selection   
BJG_OLD  =                19.01 / original bj without extinction correction      
BJSEL100 =                18.93 / bj mag used in the object selection (100k)     
BJG_100  =                18.99 / bj without extinction correction (100k)        
GALEXT   =                0.062 / galactic extinction                            
VISCHECK =                    0 / visual check flag                              
VISCOMM  = '        '           / visual check comments
SBFIELD  x '417     '           / Sky survey field number                        
SBRA     =      46.032062700000 / Object RA (J2000.0)                            
SBDEC    =     -31.176815800000 / Object declination (J2000.0)                   
SBBJMAG  =               18.836 / UK-J (Bj) magnitude                            
SBAREA   =                  254 / Total area                                     
SBAI     =                 7927 / Weighted semi-major axis                       
SBBI     =                 6103 / Weighted semi-minor axis                       
SBPA     =                    8 / Celestial position angle                       
SBCLASS  =                    1 / Classification flag                            
SBSIGMA  =               22.668 / profile classification statistic               
SBBLEND  =                    0 / Deblending flag                                
SBQUAL   =                    0 / Quality flag                                   
SBAPM    =                    1 / APM-Supercosmos Bj-band match classification   
SRFIELD  x '417     '           / Sky survey field number                        
SRRA     =      46.031963000000 / Object RA (J2000.0)                            
SRDEC    =     -31.176771700000 / Object declination (J2000.0)                   
SRRMAG   =               17.188 / UK-J (R) magnitude                             
SRAREA   =                  249 / Total area                                     
SRAI     =                 7467 / Weighted semi-major axis                       
SRBI     =                 6317 / Weighted semi-minor axis                       
SRPA     =                   13 / Celestial position angle                       
SRCLASS  =                    1 / Classification flag                            
SRSIGMA  =               16.574 / profile classification statistic               
SRBLEND  =                    0 / Deblending flag                                
SRQUAL   =                    0 / Quality flag                                   
SRSB     =                    1 / Supercosmos R vrs Bj match classification      

Table 2: FITS keywords for the spectra (extensions 1..spectra)

Keyword  = Value                / Description
	 
XTENSION x 'IMAGE   '           / IMAGE extension                    
BITPIX   x                  -32 / number of bits per data pixel      
NAXIS    x                    2 / number of data axes                
NAXIS1   x                 1024 / length of data axis 1              
NAXIS2   x                    3 / length of data axis 2              
PCOUNT   x                    0 / required keyword; must = 0         
GCOUNT   x                    1 / required keyword; must = 1         
CRVAL1   x         5802.8979492 / Co-ordinate value of axis 1        
CDELT1   x         4.3103027344 / Co-ordinate increment along axis 1 
CRPIX1   x       512.0000000000 / Reference pixel along axis 1       
CUNIT1   x 'Angstroms'          / Units for axis 1                   
EXTNAME  x 'SPECTRUM'           / 2dFGRS spectrum                    
OBSNAME  = 'TGS469Z164'         / Observed object name               
OBSRA    =         0.7943429758 / Observed RA                        
OBSDEC   =        -0.5475286940 / Observed Dec                       
MATCH_DR =               0.0000 / Position match dr in arcsec        
Z        =             0.178876 / raw measured redshift              
Z_HELIO  =             0.178860 / heliocentric corrected redshift    
QUALITY  =                    5 / redshift measurement quality       
ABEMMA   =                    1 / redshift source abs1 emm2 man3     
NMBEST   =                    0 / number emission lines for best z_em
NGOOD    =                    0 / number of good emmision lines      
Z_EMI    =              -9.9990 / emission redshift                  
Q_Z_EMI  =                    0 / emission redshift quality          
KBESTR   =                    2 / x-correlation template             
R_CRCOR  =              15.5600 / x-correlation peak                 
Z_ABS    =               0.1789 / x-correlation redshift             
Q_Z_ABS  =                    3 / x-correlation quality              
Q_FINAL  =                    3 / Suggested quality for redshift     
IALTER   =                    0 / No idea                            
Z_COMM   = '        '           / Observers Comment                  
THPUT    =              0.96613 / Fibre Throughput                   
SPFILE   = 'sgp469_991104_1z.fits' / 2dF reduced data file           
PLATE    =                    1 / 2dF plate number                   
PIVOT    =                  302 / 2dF pivot number                   
FIBRE    =                   58 / 2dF fibre number                   
OBSRUN   = '99OCT   '           / Observation run                    
GRS_DATE = '991104  '           / 2dF YYMMDD observed date           
UTDATE   = '1999:11:04'         / UT date of observation             
SPECTID  = 'A       '           / 2dF spectrograph ID                
GRATID   = '300B    '           / 2dF grating ID                     
GRATLPMM =                  300 / 2dF grating line per mm            
GRATBLAZ = 'COLLIMATOR'         / 2dF grating blaze direction        
GRATANGL =             25.30000 / 2dF grating angle                  
LAMBDAC  =             5782.700 / Central wavelength                 
CCD      x 'TEKTRONIX_5'        / CCD ID                             
CCDGAIN  x                2.790 / CCD inverse gain e/ADU             
CCDNOISE x                5.200 / CCD read noise (electrons)         
OBJX     =               196833 / 2dF object x position              
OBJY     =                10401 / 2dF object y position              
OBJXERR  =                    6 / 2dF object x position error        
OBJYERR  =                   14 / 2dF object y position error        
OBJMAG   =                18.96 / 2dF object magnitude               
THETA    =                4.526 / 2dF fibre angle                    
PTRTYPE  = 'P       '           / 2dF ptrtype                        
PID      =                    0 / 2dF pid                            
OBSFLD   = 'sgp469  '           / 2dF observed field number          
NCOMB    =                    3 / Number of frames combined          
REFRUN   =                   31 / AAT Run number of reference frame  
UTSTART  x '16:37:59.48'        / UT start of exposure of reference run
UTEND    x '16:57:59'           / UT end of exposure of reference run
REFEXP   =               1200.0 / Exposure time (secs) of reference run
REFHASTA x '36.07263999999999982' / HA at start of exposure of reference run
REFHAEND x '41.08118999999999943' / HA at end of exposure of reference run
SNR      =        2.0299999E+01 / Median signal-to-noise ratio per pixel
ETA_TYPE =       -2.5934000E+00 / Eta spectral type (2dF defined)    
SEEING   =        3.0000000E+00 / Seeing (2dF calculated)            

2. mSQL Parameter Database

The mSQL database (see Jepson & Hughes 1998) can be thought of as a table. The rows of the table are labelled by the unique object serial number (serial, identical to the parameter SEQNUM in the primary extension of the FITS file) and the extension number (extnum). There are multiple rows for each target object corresponding to each of the extensions in the object's FITS file: the first row corresponds to the primary FITS extension (extension 0), while subsequent rows correspond to the spectrum extensions 1..spectra. The columns of the table correspond to the object parameters, and are labelled by the name of the corresponding keyword. N.B. case is significant in these keywords.

The object serial numbers (serial) provide the primary database key, but the objects are also indexed by their unique survey name (name, identical to the parameter NAME in the primary extension of the FITS file), which has the format TGhfffZnnn, where h is the hemisphere (N for the NGP strip and S for the SGP strip and random fields), fff is the number of the primary field to which the object is assigned and nnn is the number of the galaxy within that field.

Note that the observed name of the object (parameter OBSNAME in each spectrum extension) is the same as name (or NAME in the primary extension of the FITS file) except that: (i) if the field in which the object is observed (given by OBSFLD) is an overlapping field rather than its primary field (given by fff), then the first character of the name is changed from T to X; and (ii) if the object has been flagged as a possible merger, then the second character of the name is changed from G to M.

The first row for each object (extnum=0) contains the source catalogue data and the basic spectroscopic information for the best spectrum of that object. The keywords for each row are a subset of the FITS parameters for the primary image (the parameters in Table 1 with an '=' between Keyword and Value; those with an 'x' instead of an '=' are not in the mSQL database) plus all the additional keywords listed in Table 3. The best spectrum is the one with the highest redshift quality parameter; if there is more than one spectrum of the same quality, then the latest of these spectra is taken to be the best.

Subsequent rows for the same object (extnum=1..spectra, where spectra is the number of spectra obtained for that object) contain a subset of the FITS parameters pertaining to each spectroscopic observation (the parameters in Table 2 with an '=' between Keyword and Value; those with an 'x' instead of an '=' are not in the mSQL database) plus the additional keywords in section (i) of Table 3. If there is no spectrum for the object then spectra=0 and only the row corresponding to extnum=0 will exist. Note that some information is duplicated between rows and that not all parameters are defined for all rows; undefined parameters return a NULL value.

The best spectrum is indicated by the parameter best, which is present in all extensions: in the primary extension (extnum=0) its value is the number of the extension with the best spectrum; in the spectral extensions (extnum>0) its value is 1 if that extension contains the best spectrum and 0 otherwise.


Table 3: Additional mSQL database keywords

(i) Keywords in all extensions

Keyword  = Value           / Description
        
serial   =          100100 / 2dFGRS serial number
name     =      TGS469Z164 / 2dFGRS name
UKST     =             417 / UKST sky survey field number
spectra  =               1 / number of spectra for this object
extnum   =               1 / extension number
best     =               1 / in extension 0, extension number with best spectrum
                           / in extension>0, 1 if best spectrum and 0 otherwise
obsrun   =           99OCT / observing run year and month
TDFgg    =            -469 / 2dFGRS field number (+NGP,-SGP)
pivot    =             302 / 2dF pivot
plate    =               1 / 2dF plate
fiber    =              58 / 2dF fiber
z        =        0.178876 / observed redshift
z_helio  =        0.178860 / heliocentric redshift
abemma   =               1 / redshift type (abs=1,emi=2,man=3)
quality  =               5 / redshift quality parameter

(ii) Keywords in extension 0 only

Keyword  = Value           / Description
	 
alpha    =    0.7943429758 / RA  (B1950) in radians
delta    =   -0.5475286941 / DEC (B1950) in radians
ra       =      3  2  3.00 / RA  (B1950) in HH MM SS.SS
dec      =     -31 22 15.9 / DEC (B1950) in DD MM SS.S 
ra2000   =     03 04 07.68 / RA  (J2000) in HH MM SS.SS
dec2000  =     -31 10 36.8 / DEC (J2000) in DD MM SS.S 
l2       =  228.9258834424 / Galactic longitude
b2       =  -60.8572447739 / Galactic latitude

Searches of the database use the mSQL query format (Jepson & Hughes, 1998), which has the general format
  SELECT list_of_parameters FROM database_name WHERE list_of_conditions
(here list_of_conditions is a series of equalities and inequalities linked by Boolean relations).

An example is
  SELECT name, extnum, ra, dec, BJSEL, Z, QUALITY, z, quality FROM TDFgg
  WHERE name='TGS469Z164'

which selects the listed parameters for the object with 2dFGRS name TGS469Z164 (note the single quotes around the character string) from both the summary row (extnum=0) and for each spectrum (extnum=1..spectra). Note that the parameters with the same name in lower case and upper case are distinct: the former are generally from extnum=0, the latter from extnum>0 (parameters are returned as NULL in rows where they are not defined).

Another example with a more complex list of conditions is
  SELECT name FROM TDFgg
  WHERE extnum=0 AND ((BJSEL<15.5 AND quality>=3) OR quality>4)

which lists just the names of the objects which are either brighter than bJ=15.5 with redshift quality at least 3, or have quality greater than 4, or both; the search is restricted just to the summary row by requiring extnum=0

One common and useful type of search is where parameters from only the best spectrum (or from the best spectrum and the primary extension) are desired. This can easily be achieved using the best parameter. For example,
  SELECT serial,name,extnum,ra2000,dec2000,z_helio,ETA_TYPE FROM TDFgg
  WHERE best>0 AND quality>=3 ORDER BY serial,extnum

extracts a list of positions, redshifts and spectral types for all objects with reliable (Q>=3) redshifts, using only the best spectrum for each object. Both the primary extension and the best spectrum extension are needed because some of the desired parameters are only in one or other. As a result the list has two rows for each object, and so to make subsequent processing easier the ORDER BY construction is used to sort on serial number and extension (so that the list is ordered by serial number and then extension, with the primary first and the best spectrum second). Note, however, that ORDER BY is very slow for large datasets, and it is much better to sort off-line. Variants on this type of search using parameters only in the primary extension or the best spectrum extension can be obtained by suitable conditions on extnum and best.

Simple searches on the two indexed parameters (serial and name), are quick - e.g. WHERE serial=69656 or WHERE name='TGS203Z081'; more complex searches take about 5 minutes. Use of the ORDER BY construction is very slow for large datasets, and is not generally recommended.

Further information about the mSQL database software and its structured query language is given in Yarger et al. (1999) and on the WWW at http://www.hughes.com.au.

3. WWW Interface

The 2dFGRS mSQL database can be searched via the WWW interface in a number of ways:

  1. perform a standard mSQL query as described above - this is the most general method;
  2. perform a standard mSQL query restricted to a list of named objects - this allows searching on any defined subsample;
  3. perform a standard mSQL query restricted to objects within a specified radius of a given sky position;
  4. match objects to a supplied catalogue of positions.

The results of a query can be returned either directly as an HTML table (suitable for relatively small datasets) or via an email giving the URL of the results file (suitable for large datasets). The results file may be either a compressed text (gzipped ASCII) file containing the chosen parameters for the objects selected by the mSQL query, or a compressed archive (gzipped tarfile) of the FITS files for the objects selected by the query, depending upon the option selected. If results are returned as an HTML table, then objects can be selected interactively and their DSS images (and for the on-line database, their SuperCosmos images) and their spectra can be displayed. If the spectra have measured redshifts, then the plot of each spectrum shows the positions of prominent spectral features at the redshift associated with that spectrum.



Matthew Colless, colless@mso.anu.edu.au,