Optical Spectra Caveats

There are several small caveats to watch out for in SDSS spectroscopic data. Some affect only a few spectra or a few data columns, while others have wider impacts. Caveats are in many cases specific to a release. For example, some caveats present in a given data release were fixed in a later release; those are listed on this page to allow for easier comparison between releases. This page lists the caveats known as of SDSS Data Release 16 (DR16) for optical spectroscopic data.

The most important caveat for eBOSS is that the redshifts derived from the SDSS pipeline are not the same as those used in the eBOSS large-scale structure catalogs. Instead, the outputs of the SDSS pipeline are fed into separate redshift extraction algorithms (which are specific to each target class) that produce the final redshifts used for cosmological analysis. These redshifts will be made public when the eBOSS cosmological analysis is finalized. The SDSS pipeline redshifts are generally accurate in the cases where the ZWARNING flag is 0. The additional processing supplies eBOSS with many more accurate redshifts for the cases where the SDSS pipeline ZWARNING flag is not 0.

Two additional caveats apply only to the eBOSS DR16 "spec" files, which contain individual spectra and individual exposures:

  • Files for 3064 out of the 3803 plates contain columns, such as 'flux', 'ivar', etc, written with upper-case names, while the rest of the plates have these columns written in lower case. We suggest trying both cases when reading samples of these files.
  • The spec files containing all individual exposures, stored in the "full" directory, suffer from a bug that removed the entire header content, including the associated metadata, of each HDU containing an individual exposure. This information can be retrieved from the headers of the corresponding spCFrame files. The exposure IDs of each exposure can be found in the header of HDU0 of the spPlate file.

These two issues will be fixed in the next data release, DR17.
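The advice to try both cases when reading columns can be wrapped in a small helper. The following is a minimal sketch in plain Python: the function name `find_column` and the example column lists are illustrative assumptions, and the list of names would in practice come from whatever FITS reader is used to open the spec file.

```python
def find_column(column_names, wanted):
    """Return the column name as stored in the file, matching `wanted` regardless of case."""
    lookup = {name.lower(): name for name in column_names}
    try:
        return lookup[wanted.lower()]
    except KeyError:
        raise KeyError(f"no column matching {wanted!r}") from None


# Hypothetical column lists for a plate written in upper case and one in lower case.
upper_case_plate = ["FLUX", "LOGLAM", "IVAR"]
lower_case_plate = ["flux", "loglam", "ivar"]

print(find_column(upper_case_plate, "flux"))
print(find_column(lower_case_plate, "IVAR"))
```

Resolving the name once per file keeps the rest of an analysis script independent of which plates happen to use which case.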

Caveats that affect all spectra

Redshift status

The quality flags for the redshift fitting procedure are stored in the ZWARNING bit mask. Most redshift warnings indicate a likely substantial problem with the data, or that the best-fit classification or redshift is not reliable (due, e.g., to low S/N, or the unusual nature of the spectrum). An exception is MANY_OUTLIERS, which flags cases where many pixels are poorly explained in a statistical sense by the best-fit redshift model. This bit is typically set for very high signal-to-noise ratio stars (where errors are small, so χ² is high), or galaxies with broad lines (the redshift fitting model includes only narrow lines); in such cases, the redshift is usually fine. About 2% of non-sky spectra have some warning set other than MANY_OUTLIERS. The redshifts of the remainder are virtually always correct. Many of the spectra flagged with problems also have correct redshifts and classifications, but we recommend care before using them. Note that the ZWARNING flag bits in BOSS are similar, but not identical, to those used in SDSS-I/II.

In DR16, the eBOSS team uses the output of the standard SDSS redshift algorithm as input to more advanced algorithms, leading to higher completeness. These redshifts are released in the DR16 clustering catalogs.
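The ZWARNING guidance above (treat a spectrum as reliable when no bit other than MANY_OUTLIERS is set) can be sketched as a bit-mask test. The bit position used for MANY_OUTLIERS (bit 4, value 16) is an assumption not stated on this page and should be checked against the bitmask documentation for the release in use; the function name is illustrative.

```python
# Assumed bit position of MANY_OUTLIERS in ZWARNING; verify against the
# bitmask documentation for your data release before relying on it.
MANY_OUTLIERS = 1 << 4


def redshift_is_reliable(zwarning):
    """True when no ZWARNING bit other than MANY_OUTLIERS is set."""
    return (zwarning & ~MANY_OUTLIERS) == 0


for value in (0, 16, 4, 20):
    print(value, redshift_is_reliable(value))
```

Masking out the one tolerated bit, rather than comparing against a list of accepted values, keeps the test correct if further tolerated bits are added later.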

Galactic extinction correction

The spectra released in DR10 have not been corrected for Galactic extinction, because the SDSS includes a substantial number of spectra of Milky Way stars whose extinction would differ from that given in the Galactic dust maps, as they do not lie beyond the full dust column. This policy has been the standard since DR2; in the EDR and DR1, the spectroscopic data were corrected for Galactic extinction. The extinction is a relatively small effect over most of the survey area, since the median E(B-V) over the survey is around 0.04; however, for some SEGUE pointings the reddening can be substantially larger.

Night sky emission lines

The night sky emission lines at 5577 Å, at 6300 Å and 6363 Å (when there is auroral activity), and in the OH forest in the red can be very strong, and leave significant residuals in the spectra whose amplitude is occasionally underestimated by the noise model. Be cautious about interpreting the reality of weak features close to these lines.
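One way to act on this caution is to flag pixels that fall close to the three discrete lines named above. This is a minimal sketch: the 5 Å half-width is an arbitrary illustrative choice, not a recommended value, the OH forest is not covered, and the function name is hypothetical.

```python
# Observed-frame wavelengths (in Angstroms) of the discrete sky lines named in the text.
SKY_LINES_ANGSTROM = (5577.0, 6300.0, 6363.0)


def near_sky_line(wavelength, half_width=5.0):
    """True if `wavelength` lies within `half_width` Angstroms of a listed sky line."""
    return any(abs(wavelength - line) <= half_width for line in SKY_LINES_ANGSTROM)


for wavelength in (5578.0, 5600.0, 6360.0):
    print(wavelength, near_sky_line(wavelength))
```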

Sky Subtraction Bias

The sky spectrum estimates in BOSS (and in fact in SDSS) that are subtracted from each object are biased slightly low. This is due to the well-known bias associated with fitting an error-weighted model to data when the errors are estimated from the data itself (e.g. in the case of Poisson estimates of errors). These residuals can be detected by taking the average of the sky-subtracted sky fibers, which yields a slightly positive spectrum ranging from 7×10⁻²⁰ erg/cm²/s/Å at around 8000 Å up to 10⁻¹⁸ erg/cm²/s/Å at the bluest and reddest ends of the spectra.

Since DR13, the extraction algorithm has been modified as described in Bautista et al. (2017). The new algorithm uses a constant weighting scheme, which considerably reduces the sky subtraction bias. As a consequence, the extraction is slightly less optimal, yielding 5 to 10% larger errors in flux estimates per pixel.

Coadd errors are not perfect

The default BOSS/eBOSS spectra distributed in DR16 are coadded from several individual exposures. Each individual exposure has a slightly different relationship of pixel number to wavelength. Thus, errors in the coadded spectra have covariance between neighboring spectral bins; however, we do not calculate or track this covariance. As a result, there is a 10-20% "error-on-the-error" in the coadd noise model. If discrepancies at this level matter for your analysis, you should use the individual exposures, which have a much better accuracy in their noise model (1-2%).

Galaxy velocity dispersion measurements

We do not recommend using SDSS velocity dispersion measurements for:

  • spectra with median per-pixel (S/N)² < 10
  • velocity dispersion estimates smaller than about 70 km s⁻¹, given the typical S/N and the instrumental resolution of the SDSS spectra

Also note that the velocity dispersion measurements are not corrected to a standard aperture size. See the velocity dispersion algorithm for details.
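The two recommendations above translate into a simple selection test. This is a sketch only: the function and argument names are hypothetical, and the thresholds are the approximate values quoted in the text rather than hard limits.

```python
def velocity_dispersion_usable(median_sn2, vdisp_kms, min_sn2=10.0, min_vdisp_kms=70.0):
    """True when a measurement passes both recommended cuts.

    median_sn2 : median per-pixel (S/N)^2 of the spectrum
    vdisp_kms  : velocity dispersion estimate in km/s
    """
    return median_sn2 >= min_sn2 and vdisp_kms >= min_vdisp_kms


print(velocity_dispersion_usable(median_sn2=25.0, vdisp_kms=150.0))
print(velocity_dispersion_usable(median_sn2=6.0, vdisp_kms=150.0))
print(velocity_dispersion_usable(median_sn2=25.0, vdisp_kms=40.0))
```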

Clipped Spectral Lines

Some emission lines are erroneously clipped because they were identified as cosmic rays. If an emission line is so bright that it is saturated in the individual 15-minute exposures of the spectrograph, it can suffer this effect. Unfortunately, such saturated pixels are not flagged as such, although usually that region of the spectrum has an inverse variance equal to zero. Luckily, objects with such strong emission lines are very rare, but the user should be aware of the possibility of objects with extremely strong emission lines and unphysical or unusual line ratios.

Spectrophotometric calibration induces artificial Balmer lines

This problem has been fixed since DR12.

Prior to DR12, the spectrophotometric calibration procedure induced redshift-zero Balmer lines that were caused by mismatches between the calibration stars and the template library. This is noticeable in particular on some fibers in plate 274.

Known missing or corrupted spectra files on SAS

There are some spectra-related files on SAS which are known to be missing. These are documented in the "knownMissing.txt" files in each subdirectory. Most of these are logs and diagnostic plots, but a few spZline (redshift fits to individual lines) and spCFrame (calibrated individual exposures) files are missing. There are no known cases of missing coadded spectra. In addition, the individual spectrum exposure SPECTRO_REDUX/26/2639/spCFrame-b2-00042347.fits is missing HDU 6 (sky), but the other HDUs are fine.

In SDSS spectra released in DR8 - DR13, the Balmer Series line Hζ (H-zeta, 3889.049 Å) was incorrectly labeled as Hε (H-epsilon, 3970.072 Å), and the real Hε was not included in the analysis of spectral lines. This affects line measurements tabulated in spZline files. These files are only available on the SAS. These measurements are not loaded into any version of the CAS.

Caveats that affect BOSS spectra

Target selection issues in early BOSS data (BOSS)

The details of the targeting algorithm and photometric pipeline have changed throughout the first year of BOSS observations. Particular care should be taken with the following:

  • Chunks "boss1" and "boss2" (around 5% of the BOSS data in DR9): these used a different definition of the BOSS_TARGET1 flags. In particular, the GAL_CMASS and GAL_CMASS_SPARSE bits were used for internal tests and should not be used to select objects from these chunks. In order to select a CMASS or CMASS_SPARSE sample of objects, one should select objects based on the GAL_CMASS_COMM bit and sub-select objects that pass the final CMASS cuts (taking into account possible changes in photometry).
  • Chunks "boss1" to "boss14" (around 70% of the BOSS data in DR9): the targeting photometry of a given object in these chunks may not correspond to its final photometry. This affects a tiny percentage of targets, and may mean that the final matched photometry of a target falls outside the color and flux limits. In these cases, such objects should still be considered as valid targets: the scatter across the boundaries simply reflects the stochastic element of targeting a sample from noisy data. To find the original targeting photometry for any galaxy use the targetObjID in specObj (either within CAS or within the flat files).
  • Chunks "boss1" to "boss6" (around 40% of the LOWZ data): due to a bug in the target selection, LOWZ galaxies were incorrectly targeted during the initial stages of the survey. These chunks should be excluded from any LOWZ analysis. The simplest way to do so is to require tileID ≥ 10324.

Classification and redshift efficiency (BOSS/SEQUELS)

Classification and redshift efficiency depend mildly on fiber number and thus on position in the focal plane. The spectrograph PSF gets worse at the CCD edges, which results in lower-quality spectra with lower efficiency for determining classification and redshift. See Ross et al. (2012), Figure 3.

DR13: The SEQUELS galaxy redshift efficiency drops significantly, to 68%, compared to BOSS CMASS galaxies (98%), mostly due to lower signal-to-noise. See the eBOSS overview paper (Dawson et al. 2016) for details.

DR16: Thanks to changes in the spectroscopic pipeline and new redshift algorithms, the classification efficiency is around 96% for the LRG sample and 90% for ELGs. The dependence on fiber number is, however, still present, since it is related to the instrument design.

NOQSO

A dominant source of bad classification/redshift fits to galaxy spectra is QSO templates with unphysical parameters, e.g. negative terms so that QSO emission lines "fit" galaxy absorption lines. To correct for this, galaxy spectra also have a ZWARNING_NOQSO mask, Z_NOQSO redshift, etc. which excludes QSO templates when performing classification/redshift fits. For studies with galaxy spectra, these *_NOQSO values should be used instead of the original ZWARNING mask, Z redshift, etc.

Bad CCD column results in bogus high-z quasars

Some unmasked intermittently bad CCD columns result in spurious identifications of spectra as z>5 quasars. These affect fibers 40, 59, 60, 833, 839, and 840. Treat any redshifts this large with caution, and especially these fibers.
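A simple screening step for this caveat is sketched below. The fiber list and the z>5 threshold come from the text; the function name and the three return labels are illustrative choices.

```python
# Fibers named in the text as affected by intermittently bad CCD columns.
SUSPECT_FIBERS = frozenset({40, 59, 60, 833, 839, 840})


def high_z_caution(fiber_id, z, z_threshold=5.0):
    """Classify how much caution a quasar redshift deserves.

    Returns "ok" for z at or below the threshold, "check" for any larger
    redshift, and "suspect fiber" when the fiber is also one of those listed.
    """
    if z <= z_threshold:
        return "ok"
    return "suspect fiber" if fiber_id in SUSPECT_FIBERS else "check"


print(high_z_caution(fiber_id=59, z=5.4))
print(high_z_caution(fiber_id=412, z=5.4))
print(high_z_caution(fiber_id=59, z=2.3))
```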

QSO pipeline redshifts

The accuracy of the quasar redshift estimated by the BOSS pipeline depends on the quasar redshift. At low redshifts (z<1.6), the pipeline estimate is very accurate. At higher redshifts, when the CIV line enters the BOSS wavelength range, the redshift estimate tends to be biased by the CIV emission line position. In the redshift range 2 to 2.5, where the MgII emission line still lies in the spectrum, about half of the redshifts are underestimated by about 0.005 in z.

(DR12 and prior) QSO Flux Calibration is Wrong (BOSS)

This problem was significantly improved in DR14. The spectroscopic pipeline computes a model for the impact of atmospheric differential refraction (ADR) on quasar targets at the individual exposure level, significantly improving flux calibration of quasars. See details in section 2.1 of Jensen et al. (2016).

In DR13, corrections for targets with focal plane offsets were computed to be applied to the final coadded spectra. See Margala et al. (2016) for details. They are included in the DR13 individual spectrum files.

The problem: BOSS QSO target fiber positions are purposefully offset in X and Y (position in the focal plane) and Z (vertical offset from the focal plane) to optimize the S/N at 4000 Å for Lyman-alpha forest studies. The standard stars used for flux calibration are positioned for λ = 5400 Å (like the galaxies), so chromatic differential refraction (the primary effect) affects the standards differently than the quasars, and the derived flux solutions are not appropriate for quasars. This results in overall bluer quasar spectra, though the mis-calibration varies from exposure to exposure and with position in the focal plane. This does not affect original SDSS quasar spectra, which did not have the xyz hole offsets, or most of the ancillary spectra. The quantity LAMBDA_EFF stored for each spectrum reports the wavelength for which the fiber position was optimized. See Dawson et al. (2012).

QSO Visual Scans (BOSS)

All QSO targets are visually inspected to verify their classification and redshift. In DR9, these results were included in the spAll and specObj catalogs as columns Z_PERSON, CLASS_PERSON, and Z_CONF_PERSON. For DR10 and beyond, these have been factored out into the DR10Q catalog, to be published separately as an update to the DR9Q SDSS QSO catalog.

QSO Z_ERR column left blank (BOSS)

In the SDSS quasar catalog, the Z_ERR column was left blank. The error in the redshift estimate from the SDSS quasar pipeline is in the column Z_PIPE_ERR; you should use that column instead for all studies.

For Z_PIPE_ERR, the majority of objects have error values listed. If Z_PIPE_ERR is negative, it instead acts as a flag indicating an invalid fit. Explanations of the flag values are:

Z_PIPE_ERR = -1: the best-fit is at the lowest or highest redshift tested for a template

Z_PIPE_ERR = -2: there are fewer than 3 good points in the spectrum

Z_PIPE_ERR = -4: the best-fit appears to be a local maximum rather than minimum of the chi^2 surface

Z_PIPE_ERR = -5: the chi^2 surface does not have a dynamic range of at least 1, meaning there are no fits that are formally 1-sigma worse

Z_PIPE_ERR = -6: the best-fit redshift is outside the fitting range

Z_PIPE_ERR = -7: all good data points in the spectrum have the same flux value

Z_PIPE_ERR = -8: MPFITPEAK fitting to a Gaussian of the chi^2 surface fails with a non-zero status
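The flag values listed above can be decoded with a lookup table. This is a minimal sketch with hypothetical names; the descriptions are paraphrased from the list, and values not listed there (for example -3) are reported as unrecognized rather than guessed.

```python
# Negative Z_PIPE_ERR values act as flags; descriptions follow the list above.
Z_PIPE_ERR_FLAGS = {
    -1: "best fit is at the lowest or highest redshift tested for a template",
    -2: "fewer than 3 good points in the spectrum",
    -4: "best fit appears to be a local maximum of the chi^2 surface",
    -5: "chi^2 surface has a dynamic range below 1",
    -6: "best-fit redshift is outside the fitting range",
    -7: "all good data points in the spectrum have the same flux value",
    -8: "Gaussian fit (MPFITPEAK) to the chi^2 surface failed",
}


def describe_z_pipe_err(z_pipe_err):
    """Return None for a valid (non-negative) error, else a description of the flag."""
    if z_pipe_err >= 0:
        return None
    return Z_PIPE_ERR_FLAGS.get(z_pipe_err, "unrecognized negative flag value")


for value in (0.0004, -2, -6, -3):
    print(value, describe_z_pipe_err(value))
```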

BOSS Flux Calibration

The flux calibration of individual exposures has an observing hour-angle and fiber dependence, especially below 4200 Angstroms. Analyses which rely upon accurate flux calibration of individual exposures should perform additional systematics cross checks for the consistency between different exposures of the same object, and avoid data observed at large hour-angles. This issue may also affect SDSS spectra but that has not been confirmed.

BOSS Stellar Classifications

BOSS object classifications are primarily focused on the identification of galaxy vs. quasar vs. star. Although sub-classifications are provided, they are not optimized for accuracy. In particular, the CV star templates have more degrees of freedom than other stellar templates, which can result in unphysical solutions where negative PCA components of the CV templates can fit absorption features of White Dwarfs. Fixing this has not been a high priority since the primary classification of "star" vs. "galaxy" or "qso" is still correct.

BPT Classification Issues in emissionLinesPort Table

The BPT classification reported in the emissionLinesPort table in CAS and related files (e.g. Portsmouth galaxy properties) is truncated to 7 characters for spectra associated with the DR12 BOSS spectroscopic reduction v5_7_2 (SEQUELS and other ancillary programs).

Normally, the classification is something like "Star Forming" or "Composite". However, the truncated strings only retain 7 characters: for example "Star Fo" or "Composi". For the most part, the classifications are unambiguous, even when truncated to 7 characters. However, "Seyfert" and "Seyfert/LINER" are indistinguishable, since "Seyfert" is 7 characters long.

It is possible to recover the full, untruncated classification from the individual plate files. For example, to retrieve the classifications for plate 7565, MJD 56809, access the file portsmouth_emlinekin-7565-56809.fits.
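The ambiguity introduced by the 7-character truncation can be checked mechanically. The sketch below uses only the four labels named in this section; the full set of BPT labels in the table may be larger, so `FULL_LABELS` is an illustrative assumption, as is the function name.

```python
# Only the labels named in the text; the real table may contain others.
FULL_LABELS = ("Star Forming", "Composite", "Seyfert", "Seyfert/LINER")
TRUNCATED_LENGTH = 7


def candidate_labels(truncated):
    """Return every full label whose first 7 characters equal `truncated`."""
    return [label for label in FULL_LABELS if label[:TRUNCATED_LENGTH] == truncated]


for truncated in ("Star Fo", "Composi", "Seyfert"):
    print(repr(truncated), candidate_labels(truncated))
```

A truncated value that maps to more than one candidate is one that has to be recovered from the individual plate file.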

Artificial dichroic transitions at 6000 Å due to cross-talk from bright stars

A small number of spectra are affected by cross-talk from bright stars (generally spectrophotometric standards) in neighboring fibers. This is often manifested in a strong break feature at the dichroic transition around 6000 Å, resulting from different levels of cross-talk between the red and blue arms of the spectrograph. These effects appear to occur less frequently at later survey dates, which would be consistent with the improvements in the focus of the BOSS spectrograph cameras that have been achieved with routine operation. We intend to mitigate these effects in future BOSS data releases through improvements in the extraction codes, and to flag any spectra that remain compromised. No masking of this effect is implemented for BOSS DR10 data, however, except to the extent that it triggers a ZWARNING bit in certain instances.

Correcting for wavelength dependence of focal plane when observing quasars

In addition to the wavelength-dependent ADR offset, we also account for the wavelength dependence of the focal plane when observing the quasar targets. The focal plane for 4000 Å light differs from the focal plane for 5400 Å light by 0-300 microns, depending on the distance from the center of the plate. To account for this difference, small, sticky washers are inserted at the location of certain quasar targets. The washer causes the fiber tip to sit slightly behind the 5400 Å focus. No washers are used for holes within 1.02 degrees of the plate center. Between 1.02° and 1.34°, 175 μm washers are used; between 1.34° and 1.49°, 300 μm washers are used.

Washers only became available after MJD 55441 (September 2, 2010), and were not consistently used until MJD 55474 (October 5, 2010). In the DR10 data model, the value of ZOFFSET is given in microns (μm). It can be found in the Science Archive Server in the plateDesign, spAll, and specObjAll files, and in SkyServer in the table and its associated views. Note that ZOFFSET is the intended washer usage, which may not match the actual washer usage. The exact washer usage for each observation during this transition period (including plates with repeat observations spanning this time period) is documented in the file found at http://www.sdss3.org/svn/repo/idlspec2d/trunk/opfiles/washers.par. Observations prior to MJD 55441 did not have washers; observations after MJD 55474 have washers unless they are listed otherwise in washers.par. The discrepancy will be resolved with Data Release 10 in the summer of 2013.

By optimizing the focal plane position, and thus the signal-to-noise ratio (SNR), for 4000 Å light, we are also perturbing the spectrophotometry relative to the standard stars, as discussed in Dawson et al. (2012). Only the CORE, BONUS, and QSO_VAR_SDSS quasar targets are optimized in this way for the 4000 Å focal plane and ADR offsets. Otherwise, the plate design remains the same as it was in the SDSS-I and SDSS-II surveys (Stoughton et al. 2002).
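The intended washer thickness as a function of distance from the plate center can be written as a small step function. This sketch encodes only the radial ranges quoted above; how the exact boundary values (1.02° and 1.34°) are assigned is an assumption, the function name is hypothetical, and the result is the intended usage, not the actual usage recorded in washers.par.

```python
def intended_washer_microns(radius_deg):
    """Intended washer thickness (microns) for a quasar hole at `radius_deg` from plate center."""
    if not 0.0 <= radius_deg <= 1.49:
        raise ValueError("radius outside the 0 to 1.49 degree range quoted in the text")
    if radius_deg < 1.02:
        return 0      # no washer near the plate center
    if radius_deg < 1.34:
        return 175    # boundary handling at exactly 1.02 and 1.34 is assumed
    return 300


for radius in (0.5, 1.2, 1.4):
    print(radius, intended_washer_microns(radius))
```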

Position errors in some early plates in the Low-Mass Binary Stars ancillary target program

There was an error in correcting the positions of the targets for their proper motions in the first year of the ancillary target program, affecting targets on plates numbered less than 3879 or between 3965 and 3987.
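The affected plate ranges can be tested directly. In this sketch the function name is hypothetical and the end points of the 3965 to 3987 range are assumed to be inclusive, which the text does not state.

```python
def plate_has_position_error(plate):
    """True for plates in the ranges affected by the proper-motion correction error."""
    return plate < 3879 or 3965 <= plate <= 3987  # inclusive end points assumed


for plate in (3850, 3900, 3970, 4000):
    print(plate, plate_has_position_error(plate))
```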

Caveats that affect SEGUE spectra

Duplicate Spectra

Some objects have multiple spectroscopic observations, either from being an intentional repeat, as a QA target, or as part of a different program or survey, or, finally, from being on a plate with multiple observations. Thus, while each object in the CAS has only one bestobjid, associated with the photometry, it may have multiple specobjid, one for each spectroscopic observation. SpecObjAll has a number of parameters that signify whether or not an observation is the best (defined as the highest S/N) available:

segue1primary
Best observation of a target in SEGUE-1
segue2primary
Best observation of a target in SEGUE-2
segueprimary
Best observation of a target in all of SEGUE, also in the sppParams table

For example, imagine a star with three observations. Two of these observations are on SEGUE-1 plates. Of these two, the one with the highest S/N will have segue1primary set to 1. Imagine that the third observation is on a SEGUE-2 plate, and that this has the highest S/N overall. This third observation will have segue2primary set to 1, as it is the best observation of the object in SEGUE-2. It will also have segueprimary=1. Thus, even though one of the two SEGUE-1 observations is better than the other, and has segue1primary set to 1, it will not have segueprimary=1. To make sure that any query returns one and only one spectroscopic observation of any object, and that it is the best observation of that object, use the segueprimary parameter in either sppParams or SpecObjAll. To extract the best observation from exclusively SEGUE-1 plates, use segue1primary. Finally, to pull out the best observation from SEGUE-2, use segue2primary. The criteria for an observation to be sciencePrimary, and more general information, are available at the SDSS Spectroscopic Catalogs page. Adding the following clause to a query will ensure that it returns a unique set of SEGUE objects:

SELECT ... FROM SpecObjAll as sp WHERE sp.seguePrimary = 1 AND ....

The same criterion will work for the sppParams table. If, for any particular reason, you do not want to use the seguePrimary parameter to eliminate duplicates, you can examine the number of times a particular target appears in CASJobs output by using the count function. This can be a useful way to verify that your queries are working appropriately. For example, to examine the number of times each bestobjid value appears in a particular sample, one would use the following query:

SELECT sp.bestobjid, count(sp.bestobjid) as count FROM SpecObjAll as sp group by sp.bestobjid

The query above lists every bestobjid in SpecObjAll and the number of times it appears. If you want to avoid any target that is observed multiple times, which will severely limit any sample, you can use the following query:

SELECT sp.bestobjid, count(sp.bestobjid) as count FROM SpecObjAll as sp group by sp.bestobjid having count(sp.bestobjid) = 1

Similarly, altering the query above to read having count(sp.bestobjid) > 1 would list every target that is observed multiple times. It is critical to identify and account for duplicates in your sample, both from the perspective of avoiding repeats and to ensure a complete sample. They can also be very useful for testing different aspects of the SSPP and other estimates of stellar properties. Duplicate SEGUE Observations lists the stars with multiple observations.
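The three-observation example given earlier in this section can be reproduced with a toy model of the primary flags. This is an illustration of the logic as described in the text, not the pipeline's actual implementation; the function name, the dictionary keys "survey" and "snr", and the S/N values are hypothetical.

```python
def assign_primary_flags(observations):
    """Set segue1primary, segue2primary and segueprimary for one object's observations.

    Each observation is a dict with "survey" ("segue1" or "segue2") and "snr".
    The best observation is taken to be the one with the highest S/N.
    """
    flagged = [dict(obs, segue1primary=0, segue2primary=0, segueprimary=0)
               for obs in observations]
    for survey in ("segue1", "segue2"):
        group = [obs for obs in flagged if obs["survey"] == survey]
        if group:
            best_in_survey = max(group, key=lambda obs: obs["snr"])
            best_in_survey[survey + "primary"] = 1
    if flagged:
        max(flagged, key=lambda obs: obs["snr"])["segueprimary"] = 1
    return flagged


star = [
    {"survey": "segue1", "snr": 20.0},
    {"survey": "segue1", "snr": 35.0},
    {"survey": "segue2", "snr": 50.0},
]
for obs in assign_primary_flags(star):
    print(obs)
```

In the output, the second SEGUE-1 observation carries segue1primary but not segueprimary, which is why selecting on segueprimary alone returns exactly one row per object.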

Quality Cuts

There are a number of CAS parameters that allow you to avoid poor quality SEGUE data. These are detailed in the SEGUE SQL Cookbook.

Observational Biases in SEGUE

The survey design and target selection algorithm of SEGUE give rise to a number of different observational biases. It is imperative to constrain and correct for these biases when extracting a SEGUE sample representative of the underlying Milky Way properties. Schlesinger et al. 2012 determined and corrected for the effect of SEGUE target selection on cool dwarf stars using a series of scaling weights. These weights, and a brief description of how they were determined, are available in the Target Selection Weights value-added catalog. Although DR10 contains corrections for only the G- and K-dwarf SEGUE-1 samples, many of the techniques are applicable, with modification, to other SEGUE stellar categories.

SSPP: Caveats that affect fitted parameters from SSPP

Correlation Coefficient

The correlation coefficient quantifies how well an observed SEGUE spectrum matches a synthetic spectrum generated with its adopted Teff, log g, and [Fe/H]. These measurements are listed in the SSPP as CCCAHK, which compares the spectra from 3850-4250 Å, and CCMGH, from 4500-5500 Å. The correlation coefficient ranges from 0 to 1, with 1 indicating an excellent match between the two. However, due to an error in the treatment of the inverse variance flux error array in the NGS1, NGS2, and CaIIK1 methods, there are some stars with very incorrect parameters. In total, 8280 stars are affected by this bug in the SSPP. This is less than 2% of the set of SDSS, SEGUE-1, and SEGUE-2 stellar spectra with valid g-r and S/N limits for which the SSPP is able to estimate parameters. These stars can be removed by requiring CCMGH and CCCAHK in the SSPP parameter table to be greater than zero, that is, CCMGH > 0 and CCCAHK > 0. Similarly, if more than 5% of a wavelength region used by a particular parameter estimation method is missing pixels for an individual star (e.g., has the inverse variance of the spectrum flux array set to 0), the SSPP does not report the estimated value from this technique. This improves the reliability of the parameter estimates, especially at very low metallicity.
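The recommended cut (CCMGH > 0 and CCCAHK > 0) is simple to apply to rows already retrieved from the SSPP parameter table. The sketch below assumes each row is a dictionary keyed by lower-case column names; the rows and the function name are made up for illustration.

```python
def passes_correlation_cut(row):
    """True when both correlation coefficients are greater than zero."""
    return row["cccahk"] > 0 and row["ccmgh"] > 0


# Hypothetical rows standing in for query output.
rows = [
    {"specobjid": 1, "cccahk": 0.92, "ccmgh": 0.88},
    {"specobjid": 2, "cccahk": 0.00, "ccmgh": 0.75},
    {"specobjid": 3, "cccahk": 0.81, "ccmgh": -0.10},
]
kept = [row["specobjid"] for row in rows if passes_correlation_cut(row)]
print(kept)
```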

Signal-to-Noise Constraints

The SSPP only provides stellar parameter estimates for stars where the measured S/N per 1Å pixel is 10 or greater (sppParams.snr). Below this limit, the spectra are too noisy for reliable estimates. There are around 373,300 stars in SEGUE-1 and SEGUE-2. Around 67,200 of these spectroscopic observations have S/N<10, approximately 18%. Thus, S/N constraints affect a significant portion of the sample.

Effective Temperature Scale

The SSPP versions for DR9 and beyond adopt a much improved (g-i)-temperature relation, the InfraRed Flux Method (IRFM) (Casagrande et al. 2010). Each SSPP temperature estimate is re-scaled to match the IRFM estimate. In particular, this improves the estimates of Teff for cool stars (<5000 K).

Surface Gravity Determinations

The SSPP for DR7 and DR8 used 10 different methods to estimate surface gravity. However, the gravity estimates from MgH, CaI2, and k24 have been removed starting with DR9. Comparison with high-resolution observations of SEGUE targets found that these techniques deviated significantly from the expected log g. Although removing these techniques from the pipeline has improved the surface gravity estimates, there are still known problems. Specifically, the DR9+ SSPP gravity estimate tends to overestimate log g by up to 1.0 dex for very cool giant stars. This issue was also present in the DR8 SSPP. Although the SSPP continues to improve its log g techniques, the SSPP surface gravity estimates have large uncertainties. They are meant to help distinguish between the evolutionary states of different stars but are not meant to be used as a precise log g value.

Stellar Radial Velocities

The standard redshift z from idlspec2d is available unaltered in the specObj and sppParams tables. These redshifts, primarily for galaxy work, have no offsets or corrections applied. For stars, a better redshift to use is the ELODIE-matched template redshift, stored as elodie_z in the specObj file and the specObj table in CAS. The CAS also records this as the quantity elodierv in the sppParams table, but with a correction term:

elodierv = c*elodie_z+7.3 km/s

The 7.3 km/s is an empirically derived offset putting the elodierv of all stars on a system consistent with that of other literature measures of known radial velocity standards.
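The relation above is straightforward to evaluate. In this sketch the function and constant names are illustrative; the speed of light is the standard value in km/s, and the 7.3 km/s offset is the one quoted in the text.

```python
C_KM_S = 299792.458        # speed of light in km/s
ELODIE_OFFSET_KM_S = 7.3   # empirical offset quoted in the text


def elodie_rv(elodie_z):
    """Radial velocity in km/s from an ELODIE template redshift: c * z + 7.3."""
    return C_KM_S * elodie_z + ELODIE_OFFSET_KM_S


print(elodie_rv(1.0e-4))
```

For an elodie_z of 1.0e-4, the first term is about 29.98 km/s, so the result is about 37.28 km/s.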

SSPP Flags

The DR10 SSPP has a flag 'B', which indicates that the measured Hα strength is different than that predicted from the Hδ line. The relation used to predict Hα strength breaks down for stars below 5800 K, as their Hδ lines are too weak. Therefore, the 'B' flag should not be used for stars below this temperature.

Incomplete spptarget information in DR8 CAS

This issue only affects 6 plates (2053/2073, 2054/2074, 2055/2075, 2056/2076, 2057/2077), and is caused by a missing camcol1 (in some cases camcol2 is also missing). As a direct result, the objects in the spptargets table for those plate pairs cover a smaller area than they should. Spectroscopic objects on those plates are therefore sometimes matched to objects on adjacent plates instead, or are not matched at all.

Photometric-Spectroscopic Matching Caveats

Caveats that affect matching between photometric and spectroscopic data.

Missing matches in DR10 photoObjAll table (fixed)

At the time of the DR10 release, the photoObjAll table values of specObjID were incompletely updated at first. This meant that a JOIN ON photoObjAll.specObjID = specObjAll.specObjID would yield incorrect results (although JOIN ON photoObjAll.objID = specObjAll.bestObjID would yield correct results). This problem was corrected as of August 10, 2013.

Mismatches between spectra and photometric data

There are occasional "mismatches" between the spectra and the photometry, both due to problems on the spectroscopic side in identifying the location associated with every fiber, and due to problems on the photometric side in finding an associated photometric object given a location. With some frequency, the fiber mapping, which identifies which fiber has been plugged into which hole, failed. There are around 7200 such cases in DR10, which are marked as UNPLUGGED in the bitmask. The vast majority of these cases occur because the fiber was actually not plugged or was broken (in such cases, essentially no signal is detected in the fiber, and snMedian is reported as zero). In around 200 cases, there is measurable signal down the fiber. In cases where there is more than one such fiber on a plate, there is a possibility that the fiber location associated with the spectrum is incorrect (and thus that the photometric and spectroscopic information is mismatched). This problem occurs for around 70 objects in the survey. Other mismatches can occur due to problems in the photometry. Errors in the deblending algorithm in the target reductions caused spectroscopy to be carried out occasionally on non-existent objects (e.g., diffraction spikes of bright stars or satellite trails). Many of these objects no longer exist in the current imaging reductions, thanks to improvements to the deblender over the years. We have tried to mitigate this problem in this data release, as described in the spectroscopic-photometric matching documentation.

Missing SEGUE Photometry

The latest photometry is available for nearly all SEGUE objects; however, for a small fraction of fields (about 0.5%), the photo pipeline timed out before it finished cataloging and deblending objects. This is usually because there is a bright star in the field with scattered light wings that cause the deblender to work especially hard, as mentioned above. It also occurs for some of the lines-of-sight that include an open or globular cluster, where the deblender has difficulty separating stars from one another in the crowded field. Finally, it also occurs for SEGUE "SKY" spectra, which are pointed at a blank piece of sky, with no star or other imaging object underneath, for calibration purposes. For all of these spectroscopic observations, bestobjid is set to 0. There are about 12,500 stellar spectra that have no matching photometry.

One can still find the photometry for these objects by looking in the DR7 database and doing a position match. This requires a two-stage query, as follows:

  1. To extract spectra of objects with no photometry, search for targets with sppparams.bestobjid = 0 and sppparams.elodiervfinalerr > 0, while rejecting sky spectra by excluding objects with sp.sourcetype = 'SKY' or sp.sectarget = 16. The specobjid column is kept in the output because the second query joins on it:
    SELECT s.specobjid, s.plate, s.mjd, s.fiberid, sp.ra, sp.dec,
           s.elodiervfinal, s.elodiervfinalerr, s.fehadop, s.loggadop, sp.sourcetype
    INTO mydb.orphandr9spectra
    FROM sppparams s JOIN specobjall sp ON s.specobjid = sp.specobjid
    WHERE s.bestobjid = 0 AND s.scienceprimary = 1 AND s.elodiervfinalerr > 0
      AND sp.sourcetype != 'SKY' AND sp.sectarget != 16
  2. The PhotoObjDR7 and SpecDR7 tables match the objid from DR7 to those from later data releases. We can use these to extract DR7 photometry from PhotoObjAll for the table created in step 1:
    SELECT TOP 10 poa7.run, poa7.rerun, poa7.camcol, poa7.field, poa7.obj,
           poa7.ra AS pra, poa7.dec AS pdec, poa7.psfmag_g, poa7.psfmag_r, m.*
    FROM mydb.orphandr9spectra AS m
    JOIN specdr7 AS sdr7 ON m.specobjid = sdr7.specobjid
    JOIN dr7.photoobjall AS poa7 ON sdr7.dr7objid = poa7.objid

Not all of the "orphan" spectra have matching DR7 photometry; only around 5,300 do. Many of the lines of sight missing photometry come from crowded fields, such as the segcluster pointings.
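For readers who have downloaded both the orphan spectra and a list of DR7 photometric objects, a position match can also be done locally. The following is a minimal sketch under stated assumptions: the function names, the dictionary keys ra and dec (in degrees), and the 1 arcsecond default matching radius are illustrative choices, not values taken from the survey documentation.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine formula).

    All inputs are in degrees.
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0


def nearest_match(spec, photo_objects, max_sep_arcsec=1.0):
    """Return the closest photometric object within the radius, or None.

    max_sep_arcsec is an assumed tolerance; adjust it to the astrometric
    accuracy of the catalogs being compared.
    """
    best, best_sep = None, max_sep_arcsec
    for obj in photo_objects:
        sep = angular_sep_arcsec(spec["ra"], spec["dec"], obj["ra"], obj["dec"])
        if sep <= best_sep:
            best, best_sep = obj, sep
    return best
```

This brute-force loop compares every spectrum against every photometric object, which is adequate for a few thousand orphan spectra in a restricted field. In crowded fields, where most of the orphans lie, the nearest object is not guaranteed to be the right one, so a tight radius is advisable.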

Targeting photometry vs. matched photometry (BOSS)

The SDSS photometry version used when selecting targets for spectroscopy can be different from the DR8 version of the photometry used for matching observed spectra with photometric objects. The extreme case is ancillary programs, which may not have used SDSS photometry at all for their target selection.

  • plugMap information, in spPlate HDU 5, tracks the photometry used for targeting.
  • photoPos information, in photoPosPlate*.fits, tracks the match of the spectroscopic (RA, dec) with an object from DR8 photometry.

If the matching process identifies a different object from what was originally targeted, the following fields may disagree between the plugMap and the photoPos: RUN, RERUN, CAMCOL, FIELD, ID, RA, and DEC. In addition, plugMap.MAG may not match photoPos.FIBER2MAG. If the matching process fails to identify an object, then photoPos.THING_ID = -1, which is the same THING_ID value used for sky fibers.
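The comparison described above can be sketched as a small helper. This is an illustration only: it assumes each plugMap and photoPos row has already been read into a dictionary keyed by the field names listed above, and the function name and return labels are hypothetical. RA and DEC are left out of the comparison because floating-point coordinates would need a tolerance rather than an equality test.

```python
# Identifier fields that may differ between plugMap and photoPos when the
# match lands on a different object than the one originally targeted.
COMPARE_FIELDS = ("RUN", "RERUN", "CAMCOL", "FIELD", "ID")


def match_status(plugmap_row, photopos_row):
    """Classify one fiber from its plugMap and photoPos rows (dicts).

    Returns one of three illustrative labels:
      "unmatched_or_sky"  photoPos THING_ID is -1
      "different_object"  at least one identifier field disagrees
      "same_object"       all identifier fields agree
    """
    if photopos_row["THING_ID"] == -1:
        return "unmatched_or_sky"
    differing = [f for f in COMPARE_FIELDS
                 if plugmap_row[f] != photopos_row[f]]
    return "different_object" if differing else "same_object"
```

Because THING_ID = -1 is shared by failed matches and sky fibers, this helper cannot tell the two apart; separating them requires the targeting information for the fiber.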

BOSS Photometric Mismatches

There are main survey targets with THING_ID = -1 due to a mismatch between targeting on pre-DR8 photometry followed by matching to DR8 photometry.

Caveats that affect specific plates

SAS-only plates

If one browses the directory trees containing all of the spectra (see the spectroscopic data access page) one will find files associated with a certain number of plates not listed in the DR10 list of plates and not loaded into CAS. In essentially all cases, it is best to ignore such files and plates. We went through some effort to include all reasonably good plate observations; any plate observations found on SAS but not in CAS are likely to be disastrously bad.

Bad plates

A small number of plates suffered from a variety of problems, some more serious than others. Plates whose data we deem unreliable have had their platequality set to bad, with some terse comments recorded in the qualityComments field.

  • Plates with comments about collimation problems refer to a hardware problem that caused a mismatch between the flatfields and the science exposure instrumental profile shapes, in both the spatial and wavelength directions. This problem caused the optimal extraction process to reject an excessive number of pixels. This problem was fixed in software, and comparing overlapping objects from adjacent plates confirms that the redshifts from these problematic plates are unbiased. However, the spectra themselves should not be used for precision work or spectrophotometry.
  • Plates in the apbias program used multiple, very slightly offset pointings, but the reductions do not properly combine them. These spectra should have valid redshifts, but the spectrophotometry will be very inaccurate.
  • For some plates the software had issues with rejecting cosmic rays, because there was only a single exposure to work with. These are all marked as bad plates (though again in many cases the redshifts and spectrophotometry are fine, except for the cosmic rays).
  • Plates located in regions with extended diffuse Galactic emission (like in Orion or Taurus) often have sky-subtraction errors and issues, because there is no truly blank sky available. In these cases, the emission lines from the nebula are partially, but not wholly, subtracted and hard to interpret. Similar problems can occasionally happen if there is auroral activity while the spectrum was taken. If you suspect such problems, examine the spectra associated with the sky fibers.
  • Because of time-variability in the dichroic throughput, occasionally the spectrophotometry has "kinks" at the transition between the red and blue spectrographs; we have identified some, though perhaps not all, of the worst cases of these.
  • Occasionally the second spectrograph electronics caused serious issues for fibers 321 through 640 in SDSS.
  • One plate had substantial contamination from Pollux because of light scattered through clouds.
  • A number of other plates simply have low signal-to-noise ratio for a variety of reasons but, because they were special plates, needed to have their quality values set by hand. That is, they targeted deeper than we normally do, and so would have passed the survey's signal-to-noise criteria at the standard fiducial magnitudes.

Uncertain ZOFFSETs for some QSO targets (BOSS)

BOSS QSO targets at plate radius > 1.02 degrees generally have washers to offset their fibers in Z to optimize the signal-to-noise at 4000 Angstroms. ZOFFSET records the intended z-offset in microns, not the actual offset.

  • Prior to MJD 55442, washers were not used.
  • 55442 <= MJD <= 55474 was a transition period where washers were only sometimes used.
  • After MJD 55474 washers were regularly used for new plates.
  • Plates observed both before and after MJD 55474 may or may not have had washers for the later observations.

The actual washer state of a given plate/mjd is recorded in the yanny parameter file idlspec2d/opfiles/washers.par. Analyses which use ZOFFSET should consult that file to confirm the washer state or restrict themselves to plates which were first observed after MJD 55474.
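The MJD boundaries listed above can be encoded in a small helper as a first-pass filter. This is a sketch only, with hypothetical function names; it classifies an observation by date and does not replace a lookup in washers.par, which remains the authoritative record of the actual washer state.

```python
# MJD boundaries quoted in the text above.
WASHERS_FIRST_USED_MJD = 55442
WASHERS_REGULAR_AFTER_MJD = 55474


def washer_epoch(mjd):
    """Classify an observation date by the expected washer usage."""
    if mjd < WASHERS_FIRST_USED_MJD:
        return "none"          # washers were not used
    if mjd <= WASHERS_REGULAR_AFTER_MJD:
        return "transition"    # washers were only sometimes used
    return "regular"           # washers regularly used for new plates


def zoffset_safe_without_lookup(first_observed_mjd):
    """True if the plate was first observed after the transition period.

    For any other plate, consult washers.par before relying on ZOFFSET.
    """
    return first_observed_mjd > WASHERS_REGULAR_AFTER_MJD
```

Note that the second function takes the date of the plate's first observation, not the date of the spectrum in hand: a plate first observed before MJD 55474 may or may not have had washers in its later observations.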

Bad Sky Measurements for Some Plates

Plate 3770 MJD 55234 has bad sky measurements for fibers ≤ 500, due to being taken in marginal conditions.
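When building a sample, the affected fibers can be excluded with a simple lookup. The sketch below uses a hypothetical table and function name; only the plate, MJD and fiber limit come from the text above.

```python
# (plate, mjd) -> highest affected fiber number, from the caveat above.
BAD_SKY_FIBER_LIMIT = {(3770, 55234): 500}


def has_bad_sky(plate, mjd, fiberid):
    """True if this fiber falls in a range with known bad sky measurements."""
    limit = BAD_SKY_FIBER_LIMIT.get((plate, mjd))
    return limit is not None and fiberid <= limit
```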

Corrupt File Header for Plate 6138 MJD 56598 (BOSS)

This problem was fixed in DR16.

Although all its spectra were correctly observed, the FITS file for Plate 6138 MJD 56598 has a corrupted header. The workaround is to delete the keyword EXPID upon reading in the FITS header.

Second night of data missing for Plates 389 and 2338 (SDSS-I)

For plates 389 and 2338, data from their second night of observations was not included in the reductions. This was caused by a bug in the planning of reductions (not in the reductions themselves).

This causes the file spPlate-0389-51795.fits to contain exactly the same data as spPlate-0389-51794.fits, and the file spPlate-2338-53679.fits to contain exactly the same data as spPlate-2338-53676.fits.

SkyServer returns "response buffer limit exceeded"

A SQL Search query on SkyServer might, on very rare occasions, return the following error:

    SQL returned the following error message: 006~ASP 0251~Response Buffer Limit Exceeded~Execution of the ASP page caused the Response Buffer to exceed its configured limit. Your SQL command was:

This message is due to a default behavior of Microsoft Internet Information Services (IIS) version 6.0 and higher, in which responses from Active Server Pages (.asp) are limited to 4 MB in size. The workaround is simple: run the query in CasJobs instead.