How to write a custom plugin module

A custom plugin module for spec2nexus.spec is a Python module (a Python source code file) that contains one subclass for each new control line to be supported. An exception will be raised if a custom plugin module tries to provide support for an existing control line.

Load a plugin module

Control line handling plugins for spec2nexus will automatically register themselves when their module is imported. Be sure that you call get_plugin_manager() before you import your plugin code. This step sets up the plugin manager to automatically register your new plugin.

import spec2nexus.plugin
import spec2nexus.spec

# get the plugin manager BEFORE you import any custom plugins
manager = spec2nexus.plugin.get_plugin_manager()

import MY_PLUGIN_MODULE
# ... more if needed ...

# read a SPEC data file, scan 5
spec_data_file = spec2nexus.spec.SpecDataFile("path/to/spec/datafile")
scan5 = spec_data_file.getScan(5)

Write a plugin module

Give the custom plugin module a name ending with .py. As with any Python module, the name must be unique within a directory. If the plugin is not in your working directory, there must be an __init__.py file in the same directory (even if that file is empty) so that your plugin module can be loaded with import <MODULE>.

Plugin module setup

See the existing plugins in spec2nexus.plugins.spec_common for examples. The custom plugin module should contain at least one subclass of spec2nexus.plugin.ControlLineHandler, decorated with @six.add_metaclass(spec2nexus.plugin.AutoRegister). The add_metaclass decorator allows our custom ControlLineHandlers to register themselves when their module is imported. A custom plugin module can contain many such handlers, as needs dictate.

These imports are necessary to write plugins for spec2nexus:

import six
from spec2nexus.plugin import AutoRegister
from spec2nexus.plugin import ControlLineHandler
from spec2nexus.utils import strip_first_word
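
The self-registration idea can be illustrated without spec2nexus installed. The following is a minimal sketch of a registering metaclass, not the spec2nexus implementation: the names registry, AutoRegisterSketch, HandlerBase and PV_Handler are hypothetical, and it uses the Python 3 metaclass= syntax in place of the six decorator.

```python
# Stand-in names only: this registry and metaclass are hypothetical,
# not the spec2nexus implementation.
registry = {}


class AutoRegisterSketch(type):
    """Metaclass that records every concrete handler class by its key."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        key = namespace.get("key")
        if key is not None:           # skip the abstract base class
            registry[key] = cls


class HandlerBase(metaclass=AutoRegisterSketch):
    key = None                        # subclasses supply a regular expression

    def process(self, text, spec_obj, *args, **kws):
        raise NotImplementedError(self.__class__.__name__)


class PV_Handler(HandlerBase):
    key = "#PV"

    def process(self, text, spec_obj, *args, **kws):
        pass


print(sorted(registry))   # ['#PV']
```

Defining the class is enough to register it, which is why importing a plugin module has the side effect of making its handlers known.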

Attribute: ``key`` (required)

Each subclass must define key as a regular expression that matches the control line key. It is possible to override any of the supplied plugins for scan control lines. Caution is advised to avoid introducing instability.
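
As a rough illustration of how a regular-expression key behaves, the sketch below applies re.match in the same way as the default match shown in the section Custom key match function. The helper default_match and the key #P\d+ are hypothetical and chosen only for this example.

```python
import re


def default_match(key, text):
    """Return True when the regular expression `key` matches at the start of `text`."""
    m = re.match(key, text)
    return m is not None and m.end() != 0


# hypothetical key for numbered lines such as #P0, #P1, ...
numbered_key = r"#P\d+"

print(default_match(numbered_key, "#P0 3.5425 6.795"))   # True
print(default_match(numbered_key, "#PV mr ioc:m1"))      # False
print(default_match(r"#PV", "#PV mr ioc:m1"))            # True
print(default_match(r"#P", "#PV mr ioc:m1"))             # True: a bare prefix also matches
```

Because re.match only anchors at the start of the line, a short key can match more lines than intended, so it is worth making the expression as specific as the control line allows.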

Attribute: ``scan_attributes_defined`` (optional)

If your plugin adds any attributes to the spec2nexus.spec.SpecDataFileScan object (such as the hypothetical scan.hdf5_path and scan.hdf5_file), declare the new attributes in the scan_attributes_defined list, like this:

scan_attributes_defined = ['hdf5_path', 'hdf5_file']

Method: ``process()`` (required)

Each subclass must also define a process() method to process the control line. A NotImplementedError exception is raised if process() is not defined.

Method: ``match_key()`` (optional)

For difficult regular expressions (or other situations), it is possible to replace the function that matches for a particular control line key. Override the handler’s match_key() method. For more details, see the section Custom key match function.

Method: ``postprocess()`` (optional)

For some types of control lines, processing can only be completed after all lines of the scan have been read. In such cases, add a line such as this to the process() method:

scan.addPostProcessor(self.key, self.postprocess)

(You could replace self.key here with some other text. If you do, make sure that text will be unique as it is used internally as a python dictionary key.) Then, define a postprocess() method in your handler:

def postprocess(self, scan, *args, **kws):
    # handle your custom info here

See section Postprocessing below for more details. See spec2nexus.plugins.spec_common for many examples.

Method: ``writer()`` (optional)

Writing a NeXus HDF5 data file is one of the main goals of the spec2nexus package. If you intend data from your custom control line handler to end up in the HDF5 data file, add a line such as this to either the process() or postprocess() method:

scan.addH5writer(self.key, self.writer)

Then, define a writer() method in your handler. Here’s an example:

def writer(self, h5parent, writer, scan, nxclass=None, *args, **kws):
    """Describe how to store this data in an HDF5 NeXus file"""
    desc='SPEC positioners (#P & #O lines)'
    group = makeGroup(h5parent, 'positioners', nxclass, description=desc)
    writer.save_dict(group, scan.positioner)

See section Custom HDF5 writer below for more details.

Full Example: #PV control line

Consider a SPEC data file (named pv_data.txt) with the contrived example of a #PV control line that associates a mnemonic with an EPICS process variable (PV). Suppose we take this control line content to be two words (text with no whitespace):

#F pv_data.txt
#E 1454539891
#D Wed Feb 03 16:51:31 2016
#C pv_data.txt  User = spec2nexus
#O0 USAXS.a2rp  USAXS.m2rp  USAXS.asrp  USAXS.msrp  mr  unused37  mst  ast
#O1 msr  asr  unused42  unused43  ar  ay  dy  un47

#S 1  ascan  mr 10.3467 10.3426  30 0.1
#D Wed Feb 03 16:52:03 2016
#T 0.1  (seconds)
#P0 3.5425 6.795 7.7025 5.005 10.34465 0 0 0
#P1 7.6 17.17188 -8.67896 -0.351 10.318091 0 18.475664 0
#C tuning USAXS motor mr
#PV mr ioc:m1
#PV ay ioc:m2
#PV dy ioc:m3
#N 18
#L mr    ay  dy  ar_enc  pd_range  pd_counts  pd_rate  pd_curent  I0_gain  I00_gain  Und_E  Epoch  seconds  I00  USAXS_PD  TR_diode  I0  I0
10.34665  0.000 18.476 10.318091 1 5 481662 0.000481658 1e+07 1e+09 18.172565 33.037 0.1 199 2 1 114 114
10.34652  0.000 18.476 10.318091 1 5 481662 0.000481658 1e+07 1e+09 18.172565 33.294 0.1 198 2 1 139 139
10.34638  0.000 18.476 10.318091 1 5 481662 0.000481658 1e+07 1e+09 18.172565 33.553 0.1 198 2 1 181 181
10.34625  0.000 18.476 10.318091 1 5 481662 0.000481658 1e+07 1e+09 18.172565 33.952 0.1 198 2 1 274 274
10.34278  0.000 18.476 10.318091 1 5 481662 0.000481658 1e+07 1e+09 18.172309 41.621 0.1 198 2 1 232 232
10.34265  0.000 18.476 10.318091 1 5 481662 0.000481658 1e+07 1e+09 18.172565 41.867 0.1 199 2 1 159 159
#C Wed Feb 03 16:52:14 2016.  removed many data rows for this example.

A plugin (named pv_plugin.py) to handle the #PV control lines could be written as:

from collections import OrderedDict
import six
from spec2nexus.plugin import AutoRegister
from spec2nexus.plugin import ControlLineHandler
from spec2nexus.utils import strip_first_word

@six.add_metaclass(AutoRegister)
class PV_ControlLine(ControlLineHandler):
    '''**#PV** -- EPICS PV associates mnemonic with PV'''
    
    key = '#PV'
    scan_attributes_defined = ['EPICS_PV']
    
    def process(self, text, spec_obj, *args, **kws):
        args = strip_first_word(text).split()
        mne = args[0]
        pv = args[1]
        if not hasattr(spec_obj, "EPICS_PV"):
            # use OrderedDict since it remembers the order we found these
            spec_obj.EPICS_PV = OrderedDict()
        spec_obj.EPICS_PV[mne] = pv

When the scan parser encounters the #PV lines in our SPEC data file, it will call this process() code with the full text of the line and the SPEC scan object where this data should be stored. We choose to store this (following the pattern of other data names in SpecDataFileScan) as spec_obj.EPICS_PV using a dictionary.

It is up to the user what to do with the EPICS_PV data. We will not consider the writer() method in this example. (We will not write this information to a NeXus HDF5 file.)
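
To see what the process() steps do to the three #PV lines without installing spec2nexus, here is a minimal sketch. It assumes that strip_first_word drops the leading control-line key and returns the rest of the line; strip_first_word_sketch, process_pv and the SimpleNamespace scan object are hypothetical stand-ins.

```python
from collections import OrderedDict
from types import SimpleNamespace


def strip_first_word_sketch(line):
    """Stand-in for spec2nexus.utils.strip_first_word (assumed behavior):
    drop the control-line key and return the rest of the line."""
    parts = line.strip().split(None, 1)
    return parts[1].strip() if len(parts) > 1 else ""


def process_pv(text, spec_obj):
    """Same steps as PV_ControlLine.process(), without the plugin machinery."""
    args = strip_first_word_sketch(text).split()
    mne, pv = args[0], args[1]
    if not hasattr(spec_obj, "EPICS_PV"):
        # OrderedDict remembers the order in which the lines were found
        spec_obj.EPICS_PV = OrderedDict()
    spec_obj.EPICS_PV[mne] = pv


scan = SimpleNamespace()          # hypothetical stand-in for the scan object
for line in ("#PV mr ioc:m1", "#PV ay ioc:m2", "#PV dy ioc:m3"):
    process_pv(line, scan)

print(list(scan.EPICS_PV.items()))
# [('mr', 'ioc:m1'), ('ay', 'ioc:m2'), ('dy', 'ioc:m3')]
```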

We can then write a python program (named pv_example.py) that will load the data file and interpret it using our custom plugin:

import spec2nexus.plugin
import spec2nexus.spec

# call get_plugin_manager() BEFORE you import any custom plugins
manager = spec2nexus.plugin.get_plugin_manager()

# show our plugin is not loaded
print("known: ", "#PV" in manager.registry) # expect False

import pv_plugin
# show that our plugin is registered
print("known: ", "#PV" in manager.registry) # expect True

# read a SPEC data file, scan 1
spec_data_file = spec2nexus.spec.SpecDataFile("pv_data.txt")
scan = spec_data_file.getScan(1)

# Do we have our PV data?
print(hasattr(scan, "EPICS_PV"))    # expect True
print(scan.EPICS_PV)

The output of our program:

known:  False
known:  True
False
True
OrderedDict([('mr', 'ioc:m1'), ('ay', 'ioc:m2'), ('dy', 'ioc:m3')])

Example to ignore a #Y control line

Suppose a control line in a SPEC data file must be ignored. For example, suppose a SPEC file contains this control line: #Y 1 2 3 4 5. Since there is no standard handler for this control line, we create one that ignores processing by doing nothing:

import six
from spec2nexus.plugin import AutoRegister
from spec2nexus.plugin import ControlLineHandler

@six.add_metaclass(AutoRegister)
class Ignore_Y_ControlLine(ControlLineHandler):
    '''
    **#Y** -- as in ``#Y 1 2 3 4 5``

    example: ignore any and all #Y control lines
    '''

    key = '#Y'

    def process(self, text, spec_obj, *args, **kws):
        pass # do nothing

Postprocessing

Sometimes, it is necessary to defer a step of processing until after the complete scan data has been read. One example is 2-D or 3-D data that has been acquired as a vector rather than as a matrix: the matrix can be constructed only after all the scan data has been read.

Such postprocessing is handled by a method in the plugin file. The handler registers the postprocessing method by calling the addPostProcessor() method of the spec_obj argument received by the handler’s process() method. A key name [1] is supplied when registering to avoid registering the same code more than once. The postprocessing function will be called with the instance of SpecDataFileScan as its only argument.

An important role of postprocessing is to store the result in the scan object. It is important not to modify other data in the scan object. Pick an attribute named similarly to the plugin (e.g., MCA configuration uses the MCA attribute, UNICAT metadata uses the metadata attribute, …). This attribute defines where and how the data from the plugin is available. The writer() method (see below) is one example of a user of this attribute.

Example postprocessing

Consider a #U control line that carries three numbers (handled by the User_ControlLine class in the summary example below). For some contrived reason, we wish to store the sum of the numbers as a separate value, but only after all the scan data has been read. This can be done with the simple expression:

spec_obj.U_sum = sum(spec_obj.U)

To build a postprocessing method, we write:

def contrived_summation(scan):
    '''
    add up all the numbers in the #U line

    :param SpecDataFileScan scan: data from a single SPEC scan
    '''
    scan.U_sum = sum(scan.U)

To register this postprocessing method, place this line in the process() of the handler:

spec_obj.addPostProcessor('contrived_summation', contrived_summation)
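
The role of the key name can be sketched with a stand-in scan class. This is an assumption-laden illustration, not the spec2nexus code: ScanSketch and run_postprocessors are hypothetical, and the real scan object may store or invoke its postprocessors differently.

```python
class ScanSketch:
    """Hypothetical stand-in for the scan object; not the spec2nexus class."""

    def __init__(self):
        self.postprocessors = {}      # key name -> function

    def addPostProcessor(self, label, func):
        # a dictionary keeps one entry per key name
        self.postprocessors.setdefault(label, func)

    def run_postprocessors(self):
        # called once, after every line of the scan has been read
        for func in self.postprocessors.values():
            func(self)


def contrived_summation(scan):
    scan.U_sum = sum(scan.U)


scan = ScanSketch()
scan.U = [1.0, 2.0, 3.0]
# registering twice under the same key name leaves a single entry
scan.addPostProcessor("contrived_summation", contrived_summation)
scan.addPostProcessor("contrived_summation", contrived_summation)
scan.run_postprocessors()

print(len(scan.postprocessors), scan.U_sum)   # 1 6.0
```

Keying the registration by name means a handler can safely register from every matching control line in a scan without the function running more than once.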

Summary Example Custom Plugin with postprocessing

Gathering all parts of the examples above, the custom plugin module is:

import six
from spec2nexus.plugin import AutoRegister
from spec2nexus.plugin import ControlLineHandler
from spec2nexus.utils import strip_first_word

@six.add_metaclass(AutoRegister)
class User_ControlLine(ControlLineHandler):
    '''**#U** -- User data (#U user1 user2 user3)'''

    key = '#U'

    def process(self, text, spec_obj, *args, **kws):
        args = strip_first_word(text).split()
        user1 = float(args[0])
        user2 = float(args[1])
        user3 = float(args[2])
        spec_obj.U = [user1, user2, user3]
        spec_obj.addPostProcessor('contrived_summation', contrived_summation)


def contrived_summation(scan):
    '''
    add up all the numbers in the #U line

    :param SpecDataFileScan scan: data from a single SPEC scan
    '''
    scan.U_sum = sum(scan.U)


@six.add_metaclass(AutoRegister)
class Ignore_Y_ControlLine(ControlLineHandler):
    '''**#Y** -- as in ``#Y 1 2 3 4 5``'''

    key = '#Y'

    def process(self, text, spec_obj, *args, **kws):
        pass

Custom HDF5 writer

A custom HDF5 writer method defines how the data from the plugin will be written to the HDF5+NeXus data file. The writer will be called with several arguments:

h5parent: obj : the HDF5 group that will hold this plugin’s data

writer: obj : instance of spec2nexus.writer.Writer that manages the content of the HDF5 file

scan: obj : instance of spec2nexus.spec.SpecDataFileScan containing this scan’s data

nxclass: str : (optional) name of NeXus base class to be created

Since the file is being written according to the NeXus data standard [2], use the NeXus base classes [3] as references for how to structure the data written by the custom HDF5 writer.

One responsibility of a custom HDF5 writer method is to create unique names for every object written in the h5parent group. Usually, this will be a NXentry [4] group. You can determine the NeXus base class of this group using code such as this:

>>> print(h5parent.attrs['NX_class'])
NXentry

If your custom HDF5 writer must create a group and you are uncertain which base class to select, use an NXcollection [5] (an unvalidated catch-all base class), which can store any content. However, you are encouraged to find one of the other NeXus base classes that best fits your data. Look at the source code of the supplied plugins for examples.

The writer uses the spec2nexus.eznx module to create and write the various parts of the HDF5 file.

Here is an example writer() method from the spec2nexus.plugins.unicat module:

 def writer(self, h5parent, writer, scan, nxclass=None, *args, **kws):
     '''Describe how to store this data in an HDF5 NeXus file'''
     if hasattr(scan, 'metadata') and len(scan.metadata) > 0:
         desc='SPEC metadata (UNICAT-style #H & #V lines)'
         group = eznx.makeGroup(h5parent, 'metadata', nxclass, description=desc)
         writer.save_dict(group, scan.metadata)

Custom key match function

The default test that a given line matches a specific spec2nexus.plugin.ControlLineHandler subclass is to use a regular expression match.

 def match_key(self, text):
     '''default regular expression match, based on self.key'''
     t = re.match(self.key, text)
     if t is not None:
         if t.regs[0][1] != 0:
             return True
     return False

In some cases, that may prove tedious or difficult, such as when testing for a floating point number with optional preceding white space at the start of a line. This is typical for data lines in a scan or continued lines from an MCA spectrum. In such cases, the handler can override the match_key() method. Here is an example from SPEC_DataLine:

 def match_key(self, text):
     '''
     Easier to try conversion to number than construct complicated regexp
     '''
     try:
         float( text.strip().split()[0] )
         return True
     except ValueError:
         return False
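
The try-the-conversion approach can be exercised on a few sample lines as a standalone function. This is a sketch only: looks_like_data_line is a hypothetical name, and catching IndexError (so that a blank line returns False instead of raising) is an addition that the method above does not make.

```python
def looks_like_data_line(text):
    """True when the first whitespace-separated word parses as a float."""
    try:
        float(text.strip().split()[0])
        return True
    except (ValueError, IndexError):
        # IndexError covers blank lines; catching it is an addition in this sketch
        return False


for line in ("10.34665  0.000 18.476", "  -1.5e3 2", "#S 1  ascan  mr", ""):
    print(repr(line), looks_like_data_line(line))
```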

Summary Requirements for custom plugin

  • file can go in your working directory or any directory that has __init__.py file
  • multiple control line handlers can go in a single file
  • for each control line:
    • subclass spec2nexus.plugin.ControlLineHandler
    • add @six.add_metaclass(AutoRegister) decorator to auto-register the plugin
    • import the module you defined, after calling get_plugin_manager()
    • identify the control line pattern
    • define key with a regular expression to match [6]
      • key is used to identify control line handlers
      • redefine existing supported control lines to replace supplied behavior (use caution!)
      • Note: key="scan data" is used to process the scan data: spec2nexus.plugins.spec_common.SPEC_DataLine()
    • define process() to handle the supplied text
    • (optional) define writer() to write the in-memory data structure from this plugin to the HDF5+NeXus data file
    • (optional) define match_key() to override the default regular expression to match the key
  • for each postprocessing function:
    • write the function
    • register the function with spec_obj.addPostProcessor(key_name, the_function) in the handler’s process()

Changes in plugin format with release 2021.0.0

With release 2021.0.0, the code to set up plugins has changed. The new code allows all plugins in a module to auto-register themselves as long as the module is imported. All custom plugins must be modified, and the code that imports them revised, to work with the new system. See the spec2nexus.plugins.spec_common source code for many examples.

  • SAME: The basics of writing the plugins remains the same.
  • CHANGED: The method of registering the plugins has changed.
  • CHANGED: The declaration of each plugin has changed.
  • CHANGED: The name of each plugin file has been relaxed.
  • CHANGED: Plugin files do not have to be in their own directory.
  • REMOVED: The SPEC2NEXUS_PLUGIN_PATH environment variable has been eliminated.

Footnotes

[1]The key name must be unique amongst all postprocessing functions. A good choice is the name of the postprocessing function itself.
[2]http://nexusformat.org
[3]http://download.nexusformat.org/doc/html/classes/base_classes/
[4]http://download.nexusformat.org/doc/html/classes/base_classes/NXentry.html
[5]http://download.nexusformat.org/doc/html/classes/base_classes/NXcollection.html
[6]It is possible to override the default regular expression match in the subclass with a custom match function. See the match_key() method for an example.