Datacubes vs RSS and LIN vs LOG

Back to MaNGA tutorials

There are four main output files per PLATE-IFUDESIGN identification of a MaNGA observation:
row-stacked spectra (RSS) and datacubes (CUBE) files that are sampled either linearly (LIN) or logarithmically (LOG) in wavelength. The specific file that you should use for your project (LINRSS, LOGRSS, LINCUBE, or LOGCUBE) depends on your science and analysis software that you want to use.

MaNGA observing strategy -> four output files

To help you decide, let’s quickly review elements from the MaNGA observing strategy and the data reduction. Instead of only observing each galaxy once, MaNGA observes multiple times, iterating through a set of three dither positions. This pattern optimizes the observing depth across the IFU field-of-view by sampling the sky in regions with gaps between the 2″ fibers present in any given pointing. An individual spectrum is produced for each fiber of the IFU in each observation. To reach the required survey depth typically requires 3-4 sets of the 3-point dither sequence (totaling 9-12 exposures). The RSS file contains all spectra acquired in each exposure in a single 2D array; for example, an RSS file for a 19-fiber bundle may contain 19(fibers)*4(dither sets)*3(exposures per dither)=228 spectra, each with the same number of spectral pixels. The CUBE file uses the RSS spectra to construct a uniformly sampled datacube, a 3D array with WCS coordinates (RA, DEC, λ); the spatial sampling is set to 0.5″ spaxels, but the spatial point-spread function (PSF) in the datacube is roughly 2.5″ at FWHM. When re-sampling the wavelength pixels onto a uniform wavelength grid for each fiber, the MaNGA drp does so both linearly and logarithmically. For a detailed explanation of the data formats, see the data model page.

A data cube
A representation of a data cube. Image credit: Stephen Todd (ROE) and Douglas Pierce-Price (JAC)
An RSS file
An example of an RSS file. Each row represents a spectrum from one observation with a given fiber.

RSS vs datacubes

The difficulty with the RSS data are two-fold: (1) Because they are the spectra from each 15-minute exposure, the S/N of each spectrum can be rather low. Roughly speaking, you should expect the S/N of an individual fiber spectrum from the RSS file to be a factor of 3 lower than the spectrum at the same spatial position in the CUBE file. (2) The individual fiber spectra suffer from differential atmospheric refraction (DAR) such that the on-sky position observed is a function of wavelength, which can be difficult to deal with during analysis. The advantage of the RSS data is that each spectrum can be treated as largely independent of the others.

The difficulty with the CUBE data is due to the significant covariance between adjacent spaxels. Each spaxel is approximately 20% of the FWHM of the spatial PSF meaning that a single fiber contributes to many spaxels. Analysis of the datacube should then account for this spatial covariance; i.e., one should not treat the datacube as containing independent spectra. The advantage of the CUBE data is that the S/N of each spaxel is significantly higher than the RSS spectra, and the regridding of the data naturally accounts for the DAR in the instrument based on the astrometric solution provided by the DRP.

LIN vs LOG

The purpose of binning the spectra with linear steps in log(λ) — log-linear binning — is that each spectral pixels can be treated as a linear step in redshift, easing analysis that, e.g., determines the stellar kinematics via a convolution of the spectrum with a line-of-sight velocity distribution kernel.

So, which file do I want to use?

If you’re interested in making 2D maps, the datacubes likely provides the most convenient path to producing these maps based on the regular on-sky grid of the spaxels. However, one should account for covariance between spaxels in any subsequent analysis of these 2D maps. The RSS files are particularly useful when you want to be able to treat the spectra as independent observations (e.g., when constructing a forward model of the data) and/or when you are only interested in coarse spatial information, such as when constructing stacks of spectra binned as a function of radial distance from the galaxy center.