Stripe 82 Images Tutorial

How do I download all the images from Stripe 82?

Stripe 82 is an equatorial region that was repeatedly imaged during 2005, 2006, and 2007. This tutorial explains how to identify the images that cover Stripe 82 and how to download them. See the Imaging Basics page for a description of SDSS imaging and how the images are organized.

NOTE: This is a large amount of data. If you really want all of the Stripe 82 data, it may be more efficient to email helpdesk@sdss.org to request an optimized custom data transfer between SDSS and your institution.

Identifying Images Covering Stripe 82

We want to download the corrected frame files as described in the data model. These are grouped in directories $BOSS_PHOTOOBJ/frames/[RERUN]/[RUN]/[CAMCOL]/, e.g. $BOSS_PHOTOOBJ/frames/301/4797/1/ for RERUN 301, RUN 4797, CAMCOL 1.

Note that a RUN is an imaging scan that produces a set of raw data; a RERUN is a version of the software processing of that data, which produces the corrected frame files.

Since we want all Stripe 82 images, we want all CAMCOLs (1-6) and all images in each directory, for every RUN that covered Stripe 82 and has a RERUN that processed it for DR10.
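To make the directory layout concrete, here is a minimal sketch that lists the six per-CAMCOL directories for one (RERUN, RUN) pair. The function name frame_directories and its base argument are illustrative only and are not part of any SDSS tool; the path pattern is the one given above.

```python
def frame_directories(rerun, run, base="$BOSS_PHOTOOBJ/frames"):
    """Return the six per-CAMCOL frame directories for one (RERUN, RUN) pair.

    Hypothetical helper: 'base' is left as a literal string here and is
    not expanded from the environment.
    """
    return ["%s/%s/%s/%d/" % (base, rerun, run, camcol)
            for camcol in range(1, 7)]      #- CAMCOLs run from 1 to 6


for path in frame_directories(301, 4797):
    print(path)
```

Every corrected frame file for that run lives in one of these six directories, so covering a run means covering all six.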

We'll use the CAS database to find which (RERUN, RUN) pairs cover Stripe 82. The SkyServer Schema Browser describes all tables in the CAS database:

  1. Click the Table link on left to see a list of tables.
  2. Click the Run link on right to see the details of the Run table.

From inspecting this Run table, we can see that the query can be done with:

SELECT rerun, run
FROM Run
WHERE stripe = 82

Since this is a simple query, we'll use the SQL Search tool of the SkyServer CAS database web interface:

  1. Click "Clear Query"
  2. Enter the query given above
  3. Select the "CSV" radio button under Output Format
  4. Click Submit

This will run the query and download a file like:

rerun,run
301,4797
301,2700
301,2703
301,2708
...
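Before turning this file into download commands, it can help to read it into a clean list of (rerun, run) pairs. The following is a small sketch using Python's standard csv module; the function name read_rerun_run_pairs is hypothetical, and the sketch assumes the file has exactly the two-column layout shown above. Rows that are not two numeric fields (the header, blank lines) are skipped, and repeated pairs are kept only once.

```python
import csv
import io


def read_rerun_run_pairs(lines):
    """Parse CAS CSV output into a list of unique (rerun, run) string pairs."""
    pairs = []
    seen = set()
    for row in csv.reader(lines):
        fields = [field.strip() for field in row]
        #- Skip the "rerun,run" header, blank lines, and anything malformed
        if len(fields) != 2 or not all(f.isdigit() for f in fields):
            continue
        pair = (fields[0], fields[1])
        if pair not in seen:                #- drop duplicate rows
            seen.add(pair)
            pairs.append(pair)
    return pairs


#- Small in-memory sample in the same format as the downloaded file
sample = io.StringIO("rerun,run\n301,4797\n301,2700\n\n301,4797\n")
print(read_rerun_run_pairs(sample))
```

For the real file, pass an open file object instead of the in-memory sample.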

Downloading the Data

We now need to convert that file into commands to download the data.

For RERUN 301, RUN 4797, CAMCOL 1, the files are visible on the web at:
https://data.sdss.org/sas/dr16/eboss/photoObj/frames/301/4797/1/

The files could be downloaded directly with wget, but it is more efficient to fetch them with rsync instead, e.g.

rsync -aLvz --prune-empty-dirs --progress \
--include "4797/" --include "?/" --include "frame*.fits.bz2" \
--exclude "*" \
rsync://data.sdss.org/dr16/eboss/photoObj/frames/301/ ./301/

Here is an example Python script to convert the above CSV file into the appropriate rsync commands and run them:

#!/usr/bin/env python

import os

#- Template rsync command to run; filled in with (run, rerun, rerun):
cmd_template = """rsync -aLvz --prune-empty-dirs --progress \
--include "%s/" --include "?/" --include "frame*.fits.bz2" \
--exclude "*" \
rsync://data.sdss.org/dr16/eboss/photoObj/frames/%s/ ./%s/"""

path_to_csv_file = '/Path/to/file.csv'

#- Loop over CSV file from CAS
with open(path_to_csv_file) as fx:
    fx.readline()                       #- skip "rerun,run" header
    for line in fx:
        line = line.strip()
        if not line:                    #- ignore blank lines
            continue
        rerun, run = line.split(',')
        cmd = cmd_template % (run, rerun, rerun)
        print(cmd)
        err = os.system(cmd)            #- nonzero status means rsync failed
        if err != 0:
            print("ERROR downloading RERUN %s RUN %s" % (rerun, run))
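As a possible variation on the script above, the runs that share a RERUN can be grouped so that rsync is called once per RERUN, with one --include per RUN. This is only a sketch: the helper names group_runs_by_rerun and build_rsync_args are hypothetical, the rsync flags and server path are copied from the command above, and we have not checked how the server behaves with a very long list of --include options. The sketch only prints the commands; nothing is downloaded.

```python
import shlex
from collections import OrderedDict

#- rsync source copied from the example command above
RSYNC_BASE = "rsync://data.sdss.org/dr16/eboss/photoObj/frames"


def group_runs_by_rerun(pairs):
    """Map each rerun to the list of its runs, preserving input order."""
    grouped = OrderedDict()
    for rerun, run in pairs:
        grouped.setdefault(rerun, []).append(run)
    return grouped


def build_rsync_args(rerun, runs, dest_root="."):
    """Build one rsync argument list covering several runs of a rerun."""
    args = ["rsync", "-aLvz", "--prune-empty-dirs", "--progress"]
    for run in runs:
        args += ["--include", "%s/" % run]          #- one RUN directory each
    args += ["--include", "?/",                     #- CAMCOL directories
             "--include", "frame*.fits.bz2",        #- corrected frame files
             "--exclude", "*"]                      #- skip everything else
    args += ["%s/%s/" % (RSYNC_BASE, rerun), "%s/%s/" % (dest_root, rerun)]
    return args


#- Example pairs; in practice these would come from the CSV file
pairs = [("301", "4797"), ("301", "2700")]
for rerun, runs in group_runs_by_rerun(pairs).items():
    args = build_rsync_args(rerun, runs)
    #- Dry run: print a shell-quoted command line for inspection
    print(" ".join(shlex.quote(arg) for arg in args))
```

To actually download, the argument list could be passed to subprocess.call(args), which avoids shell quoting issues because no shell is involved.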