.. doctest-skip-all

##########
servicemon
##########

**servicemon** is a command line tool for measuring the timing of Virtual
Observatory (VO) queries. The features are also available via the
:doc:`servicemon-api`.

Code and issue tracker are on `GitHub `_.

************
Installation
************

The latest version of **servicemon** requires Python 3.8 or higher and can be
pip installed. If an environment already has an older version of
**servicemon** installed, add ``--upgrade`` to make sure the latest version is
installed. Be sure to use the ``--pre`` option since the latest version is a
pre-release.

.. code-block:: bash

    $ pip install --pre servicemon

********
Overview
********

The basic intent of **servicemon** is to execute multiple queries on one or
more VO services and record the timing of each query. The specific query
parameters (such as ra, dec and radius) are varied with each query. Those
query parameters can be listed in an input file, or randomly generated
according to run-time arguments.

The following commands are available for performing timed queries and
generating random input parameters:

`sm_query`_
    Executes and times queries over a specified list of template services
    whose parameters are filled by a specified list of input parameters.

`sm_run_all`_
    Runs multiple `sm_query`_ commands in parallel based on the
    `Service Files`_ found in a specified directory structure.

`sm_replay`_
    Replays the queries in a given ``csv`` result file from `sm_query`_,
    recording the timings into a new result file.

`sm_conegen`_
    Creates an input parameter file containing the specified number of random
    cone search parameters ``ra``, ``dec`` and ``radius``.

The services to query are specified in `Service Files`_, typically formatted
as a Python list of dictionaries that specify a template query to be filled
with the input parameters, along with the query endpoint and other metadata.

The input parameters are specified as a list of dictionaries, each of which
has a value for ``ra``, ``dec`` and ``radius``. (A future generalization will
support parameter names other than these cone search parameters.)

**************
Basic Examples
**************

Timing a Simple Cone Search (SCS) service
=========================================

In this example, we will execute and time 3 different queries to a
`Simple Cone Search (SCS) ` service hosted at the Cool Archive.

1. Create a cone search parameter input file for 3 random cones with radii
   from 0.1 to 0.2 degrees. (If desired, this step can be skipped by
   supplying the `--num_cones`, `--min_radius` and `--max_radius` arguments
   to the `sm_query`_ command in step 3. In that case the random cones will
   be generated internally and not stored in a separate file.)

   .. code-block:: bash

       $ sm_conegen three_cones.py --num_cones 3 --min_radius 0.1 --max_radius 0.2
       $ cat three_cones.py
       [
       {'dec': 27.663409836268887, 'ra': 101.18375952324169, 'radius': 0.11092016620136524},
       {'dec': 4.840997553935431, 'ra': 358.97280995896705, 'radius': 0.19181724441608894},
       {'dec': 3.284106996695529, 'ra': 312.34607454539434, 'radius': 0.13515755153293374}
       ]

2. Create a service file that defines the Simple Cone Search service to
   query. For SCS services, use `service_type` of `cone`. This causes the
   ``ra``, ``dec`` and ``radius`` values from the input file to be appended
   to the ``access_url`` at query time according to the SCS standard (see
   the sketch following this example).

   .. literalinclude:: files/cool_archive_cone_service.py
       :caption: cool_archive_cone_service.py

3. Run `sm_query`_. By default, all `The Output`_ will appear in the
   ``results`` directory.

   .. code-block:: bash

       $ sm_query cool_archive_cone_service.py --cone_file three_cones.py
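The SCS query construction is handled internally by **servicemon**; the
standalone sketch below (using a hypothetical access URL, not the one in
``cool_archive_cone_service.py``) only illustrates how one cone from
``three_cones.py`` maps onto the ``RA``, ``DEC`` and ``SR`` parameters of a
cone search request.

.. code-block:: python

    from urllib.parse import urlencode

    # Hypothetical SCS access_url; in practice this comes from the service file.
    access_url = 'https://cool.archive.example.edu/scs?'

    # One cone from three_cones.py.
    cone = {'dec': 27.663409836268887, 'ra': 101.18375952324169,
            'radius': 0.11092016620136524}

    # An SCS request carries the cone as RA, DEC and SR query parameters.
    params = {'RA': cone['ra'], 'DEC': cone['dec'], 'SR': cone['radius']}
    print(access_url + urlencode(params))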
Timing a Table Access Protocol (TAP) service
============================================

This example queries a `Table Access Protocol (TAP) ` service hosted by the
Great Archive 3 times with the same cones defined in the previous example.

1. Create a service file that describes the TAP service to be queried. Note
   that the ``service_type`` is ``tap``. The ``adql`` value is a template
   containing 3 ``{}``\ s into which the input ``ra``, ``dec`` and ``radius``
   values will be substituted (see the sketch following this example).

   .. literalinclude:: files/great_archive_tap_service.py
       :caption: great_archive_tap_service.py

2. Run `sm_query`_. By default, all `The Output`_ will appear in the
   ``results`` directory.

   .. code-block:: bash

       $ sm_query great_archive_tap_service.py --cone_file three_cones.py
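The substitution into the braces is positional, in the order ra, dec, radius.
As a rough illustration only (the ADQL below is made up and is not the
contents of ``great_archive_tap_service.py``), the fill step is equivalent to:

.. code-block:: python

    # Hypothetical ADQL template with three {}s, analogous to the 'adql'
    # value in a TAP service file.
    adql_template = ("SELECT * FROM some_table "
                     "WHERE CONTAINS(POINT('ICRS', ra, dec), "
                     "CIRCLE('ICRS', {}, {}, {})) = 1")

    # One cone from three_cones.py.
    cone = {'dec': 27.663409836268887, 'ra': 101.18375952324169,
            'radius': 0.11092016620136524}

    # ra, dec and radius are substituted for the three braces in order.
    adql = adql_template.format(cone['ra'], cone['dec'], cone['radius'])
    print(adql)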
Query multiple services in parallel
===================================

It can be efficient to query multiple service providers in parallel; however,
we may not want to execute multiple parallel queries on the same service
provider. `sm_run_all`_ provides an automated way to handle that situation.
For each subdirectory of the specified input directory, `sm_run_all`_ invokes
`sm_query`_ one at a time for each service definition file found in that
subdirectory. The subdirectories themselves (each perhaps representing a
single service provider) are handled in parallel.

This example uses `sm_run_all`_ to query the two services above in parallel.

1. Using the files from the previous examples, create an input directory
   structure to give to `sm_run_all`_.

   .. code-block:: bash

       $ mkdir -p input/cool_archive input/great_archive  # Create a subdirectory for each archive
       $ mv cool_archive_cone_service.py input/cool_archive/
       $ mv great_archive_tap_service.py input/great_archive/
       $ mv three_cones.py input
       $ ls -RF input
       cool_archive/   great_archive/  three_cones.py

       input/cool_archive:
       cool_archive_cone_service.py

       input/great_archive:
       great_archive_tap_service.py

2. For all the cones in `input/three_cones.py`, run all the services in
   `input/cool_archive` in parallel with those in `input/great_archive`. In
   addition to specifying the input directory created above, `sm_run_all`_
   requires that the result directory is explicitly specified.

   .. code-block:: bash

       $ sm_run_all input --cone_file input/three_cones.py --result_dir results

***************
Command Options
***************

sm_query
========

Documentation for this command:

.. program-output:: sm_query --help

sm_run_all
==========

.. program-output:: sm_run_all --help

sm_replay
=========

.. program-output:: sm_replay --help

sm_conegen
==========

.. program-output:: sm_conegen --help

*************
Service Files
*************

A service file contains a Python list of dictionaries. Each dictionary
defines a service endpoint, and must contain the keys defined below. All
services are assumed to return results as VOTables.

* **base_name** - The name of the service, which will be used in constructing
  the unique ids for each result row as well as the file names for the
  VOTable result files stored in the ``results`` subdirectory.

* **service_type** - One of ``cone``, ``xcone`` or ``tap``

  * ``cone``

    The query will be constructed as a VO standard Simple Cone Search, with
    the RA, DEC and SR parameters being automatically set per cone.

    .. literalinclude:: files/cool_archive_cone_service.py

  * ``xcone``

    A non-standard cone search. The **access_url** is assumed to contain
    three {}s (open/close braces). The RA, Dec and Radius for each cone will
    be substituted for those 3 braces in order.

    .. literalinclude:: files/sia_service.py
        :caption: service_type 'xcone' can be used for an SIA service

  * ``tap``

    A Table Access Protocol (TAP) service. The **adql** value is a template
    for the TAP query to be performed.

    .. literalinclude:: files/great_archive_tap_service.py
        :caption: Sample TAP service.

* **access_url** - The access URL for the service.

* **adql** - For the ``tap`` *service_type*, this is the ADQL query. For
  other types, this key must exist, but the value will be ignored. The ADQL
  query is assumed to contain three {}s (open/close braces). The ra, dec and
  radius for each cone will be substituted for those 3 braces in order.

.. literalinclude:: files/multiple_services.py
    :caption: **Multiple services** are allowed, but it is recommended that all service_type values are the same.

**********
The Output
**********

`sm_query`_ and `sm_run_all`_ write multiple output files, all to the result
directory specified using the ``--result_dir`` command argument.
`sm_run_all`_ requires that the result directory is explicitly specified,
while `sm_query`_ will use a default of ``results``.

`CSV files`
    By default, the output statistics for the queries are written to CSV
    files, one CSV file per service file. The CSV file names are ``_.csv``.
    These files are written by the default output writer plugin,
    ``csv_writer``. See `Plugins`_ for information on how to specify
    alternative or additional writers, or how to customize the CSV file name.

`VOTable subdirectories and files`
    If ``--save_results`` is specified on the command line, the VOTables
    returned from each query will be stored in subdirectories of the result
    directory. Those subdirectories are named for the `base_name` specified
    in the query's service file. The names of the VOTables are built from
    attributes of the service and the input: ``____.xml``

    Even when ``--save_results`` is not specified, the VOTables are written
    to those files temporarily. Empty VOTable subdirectories are an artifact
    of that process when ``--save_results`` is not specified.

`Log files from sm_run_all`
    For each subdirectory handled, `sm_run_all`_ creates a file that logs the
    commands run in that directory. The files are called ``-_comlog.txt``.
    In addition, for each service queried (i.e., each `sm_query`_ run), a
    file is created to collect any stdout or stderr generated by
    `sm_query`_. Those files are named ``_-_runlog.xml``.

*******
Plugins
*******

Using a plugin
==============

A plugin mechanism is provided to allow customization of the output from
`sm_query`_ (and `sm_run_all`_). By default, a plugin called `csv_writer`
writes the query statistics to the CSV files described above. Alternative or
additional plugins can be loaded via the ``--writer`` argument.

To use a new plugin called `my_writer`:

.. code-block:: bash
    :emphasize-lines: 2

    $ sm_query cool_archive_cone_service.py --cone_file three_cones.py \
        --writer my_writer

When multiple writers are specified, each writer will be invoked for each
result, and they will be executed in the order they were specified on the
command line.

To use a new plugin called `my_writer` along with the builtin `csv_writer`:

.. code-block:: bash
    :emphasize-lines: 2,3

    $ sm_query cool_archive_cone_service.py --cone_file three_cones.py \
        --writer my_writer \
        --writer csv_writer

Writing a plugin
================

A plugin is a Python class in a file that gets loaded at run time. The class
must be a subclass of `AbstractResultWriter` and must implement the abstract
methods defined there. A sketch of such a class follows the method
documentation below.

.. automethod:: servicemon.plugin_support.AbstractResultWriter.begin

.. automethod:: servicemon.plugin_support.AbstractResultWriter.one_result

.. automethod:: servicemon.plugin_support.AbstractResultWriter.end
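The skeleton below is only a rough sketch of a custom writer; the argument
names are illustrative assumptions, and the authoritative signatures are the
ones documented above.

.. code-block:: python

    from servicemon.plugin_support import AbstractResultWriter


    class MyWriter(AbstractResultWriter):
        """Hypothetical writer that simply prints each query result."""

        # The argument names below are assumptions for illustration; see the
        # method documentation above for the actual signatures.
        def begin(self, args, **kwargs):
            # Called once before any queries are run.
            print('Starting a servicemon run')

        def one_result(self, stats):
            # Called once per query with that query's timing statistics.
            print(stats)

        def end(self):
            # Called once after all queries have completed.
            print('Run complete')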
Loading a plugin at run time
============================

To load your plugin at run time, the plugin should be in a ``.py`` file. Then
it should either be placed in the ``plugins`` subdirectory of your working
directory, or its location should be specified on the command line (for
`sm_query`_ and `sm_replay`_) with the ``--load_plugins`` argument. The value
for ``--load_plugins`` can be either a directory (from which all ``.py``
files will be loaded) or an individual ``.py`` file.

Example: csv_writer
===================

The builtin plugin ``csv_writer`` is shown below. Note that its ``begin``
method accepts the keyword argument ``outfile``, which can be used to
override the default output file name. To specify ``outfile`` on the command
line, include it with the ``--writer`` value:

.. code-block:: bash
    :emphasize-lines: 2

    $ sm_query cool_archive_cone_service.py --cone_file three_cones.py \
        --writer csv_writer:outfile=my_override_filename.csv

.. literalinclude:: ../servicemon/builtin_plugins/csv_writer.py
    :language: python

Developer documentation
-----------------------

The :doc:`analysis-api` is for use in finding and analyzing data already
collected.

.. toctree::
    :maxdepth: 1

    analysis-api.rst