FITS Files
Most of the numerical SDSS data is stored in the form of FITS files. These files can contain both images and binary data tables in a well-defined format. FITS files can be read and written with many programming languages, but the most common ones used by SDSS are IDL and Python.
IDL
The Goddard utilities contain tools for reading and writing FITS files. The most commonly used functions are MRDFITS and MWRFITS. The Goddard utilities are included in the idlutils package, which also contains additional programs for manipulating FITS files.
Python
The astropy.io.fits package handles the reading and writing of FITS files in Python. Because of the general usefulness of the astropy package, this is the recommended Python reader for most FITS files.
Another package is fitsio, developed by Erin Sheldon, which is a Python wrapper around the CFITSIO library. It allows direct access to the columns of a FITS binary table, which can be useful for reading large FITS files, as detailed below. However, fitsio requires that the input files adhere rather strictly to the FITS standard. This package is available for download here.
Large FITS Files
FITS files larger than about 2 GB can be more challenging to read. One such file is the spAll file. The simplest method for reading large FITS files is to use the fitsio Python module described above. The module can read only selected columns from the FITS file:
    import fitsio
    columns = ['PLATE', 'MJD', 'FIBERID', 'Z', 'ZWARNING', 'Z_ERR']
    d = fitsio.read('spAll-v5_10_0.fits', columns=columns)
The astropy.io.fits module has more stringent hardware requirements, because it must read the whole file before the data can be used. On a 64-bit machine with more than 4 GB of memory, it is possible to use the memmap option:
    from astropy.io import fits
    fx = fits.open('spAll-v5_10_0.fits', memmap=True)
    d = fx[1].data
In IDL, the routine HOGG_MRDFITS() is available as part of the idlutils package. This routine is similar to fitsio, in that one can specify a subset of columns to read. It avoids memory overload by reading only a subset of the rows of a FITS file, extracting the columns, then moving on to the next subset of rows.