Using PyVO to find and acquire HEASARC data#

Learning Goals#

By the end of this tutorial, you will be able to:

  • Access NuSTAR data using the VO Python client pyvo.

  • Find and download data for a specific object.

Introduction#

This notebook is a tutorial on accessing HEASARC data using the Virtual Observatory (VO) Python client pyvo.

We handle the case of a user searching for data on a specific astronomical object from a specific high-energy mission observation table.

We will find all NuSTAR observations of 3C 105 that have an exposure of less than 10 ks.

Inputs#

  • The name of the object whose observations we want to find, in this case 3C 105.

Outputs#

  • NuSTAR observation files for the selected object.

Runtime#

As of 9th February 2026, this notebook takes ~60 s to run to completion on Fornax using the ‘Default Astrophysics’ image and the ‘small’ server with 8 GB RAM / 2 cores.

Imports#

import glob
import os

import pyvo
from astropy.coordinates import SkyCoord

Global Setup#

Functions#

Constants#

# The name of the source of interest - we'll use Astropy to retrieve its coordinates
SRC_NAME = "3C 105"

Configuration#

# -------------- Set paths and create directories --------------
# Set up the path of the directory into which we will download NuSTAR data
if os.path.exists("../../../_data"):
    ROOT_DATA_DIR = os.path.join(os.path.abspath("../../../_data"), "NuSTAR", "")
else:
    ROOT_DATA_DIR = "NuSTAR/"

# Whatever the data directory is, make sure it is absolute.
ROOT_DATA_DIR = os.path.abspath(ROOT_DATA_DIR)

# Make sure the download directory exists.
os.makedirs(ROOT_DATA_DIR, exist_ok=True)
# --------------------------------------------------------------

1. Finding the observations#

This part assumes we know the ID of the VO service. Generally these are of the form: ivo://nasa.heasarc/{table_name}.

We assume that we already know the name of the NuSTAR ‘master’ table that lists all NuSTAR observations - ‘numaster’.
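As a small illustration of that naming pattern, the helper below builds a service ID from a table name. The function name heasarc_ivoid is hypothetical and not part of pyvo; it only formats the string in the form described above.

```python
def heasarc_ivoid(table_name: str) -> str:
    """Build a HEASARC VO service ID from a table name (hypothetical helper)."""
    # Follows the ivo://nasa.heasarc/{table_name} pattern described in the text
    return f"ivo://nasa.heasarc/{table_name.strip()}"


print(heasarc_ivoid("numaster"))
```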

If you don’t know the name of the table, you can search the VO registry using the pyvo.registry.search() function:

pyvo.registry.search("nustar master")
<DALResultsTable length=1>
           ivoid            ...
                            ...
           object           ...
--------------------------- ...
ivo://nasa.heasarc/numaster ...

The search service#

First, we look up the registry record for the NuSTAR master table using its VO service ID, and then retrieve the cone search service object from that record:

# First, set up the VO object we need to access the numaster table
nu_services = pyvo.regsearch(ivoid="ivo://nasa.heasarc/numaster")[0]

# Retrieve the cone search service object
cs_service = nu_services.get_service("conesearch")

We can examine the attributes and methods of the cone search service object using Python’s built-in dir() function:

dir(cs_service)
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_baseurl',
 '_capability_description',
 '_get_metadata',
 '_session',
 'baseurl',
 'capability_description',
 'columns',
 'create_query',
 'describe',
 'description',
 'search']

We can also view the docstring of the cone search service object, including the list of possible input parameters, using Python’s built-in help() function:

help(cs_service)
Help on SCSService in module pyvo.dal.scs object:

class SCSService(pyvo.dal.query.DALService)
 |  SCSService(baseurl, *, capability_description=None, session=None)
 |
 |  a representation of a Cone Search service
 |
 |  Method resolution order:
 |      SCSService
 |      pyvo.dal.query.DALService
 |      builtins.object
 |
 |  Methods defined here:
 |
 |  __init__(self, baseurl, *, capability_description=None, session=None)
 |      instantiate a Cone Search service
 |
 |      Parameters
 |      ----------
 |      baseurl : str
 |         the base URL for submitting search queries to the service.
 |      session : object
 |         optional session to use for network requests
 |
 |  create_query(self, pos=None, radius=None, *, verbosity=None, **keywords)
 |      create a query object that constraints can be added to and then
 |      executed.  The input arguments will initialize the query with the
 |      given values.
 |
 |      Parameters
 |      ----------
 |      pos : astropy.coordinates.SkyCoord
 |          a SkyCoord instance defining the position of the center of the
 |          circular search region.
 |          converted if it's a iterable containing scalars,
 |          assuming icrs degrees.
 |      radius : `~astropy.units.Quantity` or float
 |          a Quantity instance defining the radius of the circular search
 |          region, in degrees.
 |          converted if it is another unit.
 |      verbosity : int
 |          an integer value that indicates the volume of columns
 |          to return in the result table.  0 means the minimum
 |          set of columns, 3 means as many columns as are available.
 |      **keywords :
 |         additional case insensitive parameters can be given via arbitrary
 |         case insensitive keyword arguments. Where there is overlap
 |         with the parameters set by the other arguments to
 |         this function, these keywords will override.
 |
 |      Returns
 |      -------
 |      SCSQuery
 |          the query instance
 |
 |      See Also
 |      --------
 |      SCSQuery
 |
 |  describe(self)
 |      describe the general information about the DAL service
 |
 |  search(self, pos, radius=1.0, *, verbosity=2, **keywords)
 |      submit a simple Cone Search query that requests objects or observations
 |      whose positions fall within some distance from a search position.
 |
 |      Parameters
 |      ----------
 |      pos : astropy.coordinates.SkyCoord
 |          a SkyCoord instance defining the position of the center of the
 |          circular search region.
 |          converted if it's a iterable containing scalars,
 |          assuming icrs degrees.
 |      radius : `~astropy.units.Quantity` or float
 |          a Quantity instance defining the radius of the circular search
 |          region, in degrees.
 |          converted if it is another unit.
 |      verbosity : int
 |         an integer value that indicates the volume of columns
 |         to return in the result table.  0 means the minimum
 |         set of columns, 3 means as many columns as are available.
 |      **keywords :
 |         additional case insensitive parameters can be given via arbitrary
 |         case insensitive keyword arguments. Where there is overlap
 |         with the parameters set by the other arguments to
 |         this function, these keywords will override.
 |
 |      Returns
 |      -------
 |      SCSResults
 |          a container holding a table of matching catalog records
 |
 |      Raises
 |      ------
 |      DALServiceError
 |         for errors connecting to or communicating with the service
 |      DALQueryError
 |         if the service responds with an error,
 |         including a query syntax error.
 |
 |      See Also
 |      --------
 |      SCSResults
 |      pyvo.dal.DALServiceError
 |      pyvo.dal.DALQueryError
 |
 |  ----------------------------------------------------------------------
 |  Readonly properties defined here:
 |
 |  columns
 |      the available columns on this service
 |
 |  description
 |      the service description.
 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from pyvo.dal.query.DALService:
 |
 |  __repr__(self) -> str
 |      Return repr(self).
 |
 |  ----------------------------------------------------------------------
 |  Readonly properties inherited from pyvo.dal.query.DALService:
 |
 |  baseurl
 |      the base URL identifying the location of the service and where
 |      queries are submitted (read-only)
 |
 |  capability_description
 |      The service description.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from pyvo.dal.query.DALService:
 |
 |  __dict__
 |      dictionary for instance variables
 |
 |  __weakref__
 |      list of weak references to the object

Finding the data#

Next, we will use the search function in cs_service to search for observations around our source. We’ve already set up a constant for the source name, in the ‘Global Setup: Constants’ section:

SRC_NAME
'3C 105'

The search function takes the sky position as input, either as a list of [RA, DEC] or as an Astropy SkyCoord object.

# Find the coordinates of the source
pos = SkyCoord.from_name(SRC_NAME)

# Show the retrieved coordinates
pos
<SkyCoord (ICRS): (ra, dec) in deg
    (61.81861303, 3.7071705)>

Now we run a cone search on the NuSTAR observation summary table (numaster), centered on the position of our source:

search_result = cs_service.search(pos)
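The help() output earlier in this section lists a radius argument for search, given in degrees with a default of 1.0. A minimal sketch of passing a smaller radius explicitly is shown here. The wrapper name cone_search and the 0.2 degree value are illustrative assumptions, not recommendations from this tutorial.

```python
def cone_search(service, position, radius_deg=0.2, verbosity=2):
    """Run a cone search with an explicit radius in degrees.

    `service` is assumed to behave like the cs_service object used in this
    notebook, i.e. it offers search(pos, radius=..., verbosity=...).
    """
    return service.search(position, radius=radius_deg, verbosity=verbosity)
```

For example, cone_search(cs_service, pos, radius_deg=0.1) would request only matches within 0.1 degrees of the source position.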

We can quickly examine the output of the search by converting it to an Astropy table and displaying it by putting it at the end of the cell:

# Convert the result to an Astropy Table and render it
search_result.to_table()
Table length=5
__row  name   ra      dec     time       obsid       status   exposure_a observation_mode obs_type processing_date public_date issue_flag Search_Offset
              deg     deg     d                               s                                     d               d
object object float64 float64 float64    object      object   float64    object           object   float64         int32       int16      float64
------ ------ ------- ------- ---------- ----------- -------- ---------- ---------------- -------- --------------- ----------- ---------- -------------
3578   3C105  61.8022 3.6837  56338.0876 60061044002 archived 4807       SCIENCE          EGS      59168.6000      57112       0          1.7172
3579   3C105  61.8059 3.6858  56339.1640 60061044006 archived 5583       SCIENCE          EGS      59168.5000      57112       0          1.4912
3580   3C105  61.8034 3.6859  56338.6258 60061044004 archived 6208       SCIENCE          EGS      59168.6000      57112       0          1.5680
3581   3C105  61.8020 3.6872  57826.4661 60261003004 archived 20703      SCIENCE          ELS      59110.3000      57836       0          1.5573
3583   3C105  61.8410 3.7349  57621.2994 60261003002 archived 20737      SCIENCE          ELS      59113.7000      57625       0          2.1365
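The Search_Offset column can be cross-checked by hand. The sketch below computes the angular separation between the source position retrieved earlier and the pointing of the first row, using the haversine formula and only the standard library. The result agrees with the tabulated offset if that column is in arcminutes, which is an inference from the numbers rather than something stated in this notebook. The function name is hypothetical.

```python
import math


def angular_separation_arcmin(ra1, dec1, ra2, dec2):
    """Great-circle separation (arcmin) between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which is numerically stable for small separations
    hav = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    return math.degrees(2 * math.asin(math.sqrt(hav))) * 60


# Source position from SkyCoord.from_name, pointing from the first table row
sep = angular_separation_arcmin(61.81861303, 3.7071705, 61.8022, 3.6837)
print(f"{sep:.2f} arcmin")
```

The printed value should be close to the 1.7172 listed for that row; small differences are expected because the table coordinates are rounded.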

2. Applying observation selection criteria#

The search results table has several entries, each representing a different NuSTAR observation.

We can filter the results to only include observations that we’re interested in. As a slightly arbitrary example, we can select only those observations with an exposure less than 10 ks.

Due to the current design of the Python object returned by the cs_service.search(pos) call, we have to loop through the results to filter them, rather than applying a boolean mask as we might for Astropy Table or Pandas DataFrame objects:

obs_to_explore = [row for row in search_result if row["exposure_a"] <= 10000]
obs_to_explore
[('3578', '3C105', '61.8022', '3.6837', '56338.0876', '60061044002', 'archived', '4807.3775', 'SCIENCE', 'EGS', '59168.6', '57112', '0', '1.7172299582982673'),
 ('3579', '3C105', '61.8059', '3.6858', '56339.164', '60061044006', 'archived', '5582.6419', 'SCIENCE', 'EGS', '59168.5', '57112', '0', '1.4911511920888738'),
 ('3580', '3C105', '61.8034', '3.6859', '56338.6258', '60061044004', 'archived', '6208.4217', 'SCIENCE', 'EGS', '59168.6', '57112', '0', '1.5679511763430094')]
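The same list-comprehension pattern can be wrapped in a small function so that the exposure limit becomes a parameter. This is only a sketch: select_short_exposures is a hypothetical name, and the rows just need to support row["exposure_a"], as the search result records do. Plain dictionaries stand in for the records here so the example runs without a network connection.

```python
def select_short_exposures(rows, max_exposure_s=10_000):
    """Keep rows whose exposure_a value (seconds) is at most max_exposure_s."""
    return [row for row in rows if float(row["exposure_a"]) <= max_exposure_s]


# Stand-in rows; exposure values taken (rounded) from the search result table
demo_rows = [
    {"obsid": "60061044002", "exposure_a": 4807.3775},
    {"obsid": "60261003004", "exposure_a": 20703.0},
]
print([row["obsid"] for row in select_short_exposures(demo_rows)])
```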

3. Identifying where to download observation data files#
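The download step in this section expects a list named links that holds one data directory URL per selected observation. One possible way to build it, sketched here under assumptions, is to follow the datalinks attached to each search result row through pyvo’s getdatalink() method and keep the entries that point at directories. The content_type value "directory" and the helper names are assumptions for illustration; inspect the datalink table for your own results before relying on them.

```python
def directory_urls(datalink_records, content_type="directory"):
    """Return the access_url of every datalink record with a matching content_type."""
    return [
        rec["access_url"]
        for rec in datalink_records
        if rec["content_type"] == content_type
    ]


def collect_links(rows):
    """Gather directory URLs for each row that offers a getdatalink() method."""
    links = []
    for row in rows:
        # getdatalink() queries the datalink service associated with the row
        links.extend(directory_urls(row.getdatalink()))
    return links
```

With the filtered rows from Section 2, this would be called as links = collect_links(obs_to_explore).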

Downloading the observations#

We can download the data directories using wget (or curl):

# Use wget to download the data when outside SciServer.
# `links` is the list of data directory URLs identified in Section 3.
wget_cmd = (
    f"wget -q -nH --no-check-certificate --no-parent --cut-dirs=6 "
    f"-r -l0 -c -N -np -R 'index*' -erobots=off --retr-symlinks "
    f"-P {ROOT_DATA_DIR} {{}}"
)

for link in links:
    os.system(wget_cmd.format(link))
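As an alternative sketch, the same wget options can be passed as an argument list to subprocess.run, which avoids shell quoting problems if the download directory contains spaces. The flags are copied from the command above, except that the repeated -np is dropped because --no-parent is the same option. The function names are hypothetical, and wget is assumed to be installed.

```python
import subprocess


def build_wget_args(link, dest_dir):
    """Return the wget argument list for mirroring one data directory."""
    return [
        "wget", "-q", "-nH", "--no-check-certificate", "--no-parent",
        "--cut-dirs=6", "-r", "-l0", "-c", "-N", "-R", "index*",
        "-erobots=off", "--retr-symlinks", "-P", dest_dir, link,
    ]


def download_all(links, dest_dir):
    """Run wget once per link and return the exit code of each call."""
    return [
        subprocess.run(build_wget_args(link, dest_dir)).returncode
        for link in links
    ]
```

A call such as download_all(links, ROOT_DATA_DIR) would then return a list of exit codes, where 0 indicates that wget reported success.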

Note

All HEASARC data is available locally when working on SciServer, mounted at /FTP/, so you could replace this download step with a copy command. The data link strings could be split on ‘FTP’, and then have ‘/FTP/’ prepended, to get the SciServer local path.
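A minimal sketch of that string manipulation is given below. The function name and the example URL are illustrative assumptions; check the actual link strings returned for your observations.

```python
def sciserver_local_path(link):
    """Map a data link containing 'FTP' to the local /FTP/ mount on SciServer."""
    # Split once on the first 'FTP' and keep what follows it
    _, found, remainder = link.partition("FTP")
    if not found:
        raise ValueError(f"'FTP' not found in link: {link}")
    return "/FTP/" + remainder.lstrip("/")


# Hypothetical link, for illustration only
example = "https://heasarc.gsfc.nasa.gov/FTP/nustar/data/obs/00/6/60061044002/"
print(sciserver_local_path(example))
```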

We can now examine the directory containing the downloaded data:

glob.glob(os.path.join(ROOT_DATA_DIR, "**/**"))
['/home/jovyan/project/_data/NuSTAR/60061044002/auxil',
 '/home/jovyan/project/_data/NuSTAR/60061044002/event_cl',
 '/home/jovyan/project/_data/NuSTAR/60061044002/event_uf',
 '/home/jovyan/project/_data/NuSTAR/60061044002/hk',
 '/home/jovyan/project/_data/NuSTAR/60061044002/nu60061044002.cat.gz',
 '/home/jovyan/project/_data/NuSTAR/60061044002/pipe.log',
 '/home/jovyan/project/_data/NuSTAR/60061044006/auxil',
 '/home/jovyan/project/_data/NuSTAR/60061044006/event_cl',
 '/home/jovyan/project/_data/NuSTAR/60061044006/event_uf',
 '/home/jovyan/project/_data/NuSTAR/60061044006/hk',
 '/home/jovyan/project/_data/NuSTAR/60061044006/nu60061044006.cat.gz',
 '/home/jovyan/project/_data/NuSTAR/60061044006/pipe.log',
 '/home/jovyan/project/_data/NuSTAR/60061044004/auxil',
 '/home/jovyan/project/_data/NuSTAR/60061044004/event_cl',
 '/home/jovyan/project/_data/NuSTAR/60061044004/event_uf',
 '/home/jovyan/project/_data/NuSTAR/60061044004/hk',
 '/home/jovyan/project/_data/NuSTAR/60061044004/nu60061044004.cat.gz',
 '/home/jovyan/project/_data/NuSTAR/60061044004/pipe.log']

About this notebook#

Author: Abdu Zoghbi, HEASARC Staff Scientist

Author: David Turner, HEASARC Staff Scientist

Updated On: 2026-02-09

Additional Resources#

Contact the HEASARC helpdesk for further assistance.

SciServer Platform

Acknowledgements#

References#

Taghizadeh-Popp M., Kim J. W., Lemson G. et al. (2020) - SciServer: A science platform for astronomy and beyond