Guide for Instrument Scientists¶
Introduction¶
In theory, the instrument scientist is responsible for writing a
single function, scan, which will create the necessary scan
objects for the user. In practice, writing this scan function is
tricky and much of this project is about creating a generic scan
function that minimises the boilerplate required from the instrument
scientist.
The main point of entry will be the
Scans.Util.make_scan() function. Given a set of defaults
defined by the instrument scientist, the make_scan function will
create the necessary function. For example, on the Zoom instrument,
the scanning function is simply defined by:
>>> from .Util import make_scan #doctest: +SKIP
>>> scan = make_scan(Zoom()) #doctest: +SKIP
All that remains for the instrument scientist is to create a subclass
(such as the Zoom class in the example above)
of the Scans.Defaults.Defaults class to provide make_scan with
the information that it needs.
Defaults¶
The Defaults class requires the instrument scientist to implement
two class methods, detector and log_file. If either method is missing,
the class will immediately throw an error on the first attempt to
instantiate it. This helps to find errors quickly, instead of in the
middle of a measurement when the missing function is first needed.
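The fail-fast behaviour described above can be sketched with Python's abc module. This is only an illustration: whether Scans.Defaults.Defaults is really built on abc is an assumption, and SketchDefaults, Incomplete, Complete and can_instantiate are hypothetical names that are not part of the Scans package. Only the two methods documented in this guide are shown.

```python
from abc import ABC, abstractmethod


class SketchDefaults(ABC):
    """Hypothetical stand-in for Scans.Defaults.Defaults (assumption)."""

    @staticmethod
    @abstractmethod
    def detector(**kwargs):
        """Perform a measurement and return its result."""

    @staticmethod
    @abstractmethod
    def log_file():
        """Return the path of the log file for the current scan."""


class Incomplete(SketchDefaults):
    """Forgets to implement log_file."""

    @staticmethod
    def detector(**kwargs):
        return 0


class Complete(Incomplete):
    """Implements both methods, so it can be instantiated."""

    @staticmethod
    def log_file():
        return "scan.dat"


def can_instantiate(cls):
    """Report whether cls() succeeds without raising a TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

Here can_instantiate(Incomplete) is False because the error is raised as soon as the object is created, long before log_file would first be called during a scan.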
detector¶
The Scans.Defaults.Defaults.detector() function should return
the result of a measurement in a Monoid. This will most likely be
either a total number of counts on a detector or transmission monitor.
However, it is possible to provide more complicated measurements and
values, such as taking a flipping measurement and returning a
polarisation.
The value returned by the function should either be a raw count
represented by a number or an instance of the
Scans.Monoid.Monoid class. The Monoid class allows for
multiple measurements to be combined correctly.
log_file¶
The Scans.Defaults.Defaults.log_file() function returns the path to a
file where the results of the current scan should be stored. This
function should return a unique value for each scan, to ensure that
previous results are not overwritten. This can easily be achieved by
appending the current date and time onto the file name.
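One way to build such a file name is sketched below. The helper timestamped_log_file and its directory, prefix and now parameters are hypothetical and exist only for this illustration; a real log_file implementation would return the resulting path.

```python
from datetime import datetime
import os


def timestamped_log_file(directory=".", prefix="scan", now=None):
    """Build a log file path that embeds the date and time.

    All three parameters are hypothetical conveniences for this sketch.
    """
    if now is None:
        now = datetime.now()
    # Dashes instead of colons, since colons are not allowed in
    # Windows file names.
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return os.path.join(directory, "{}_{}.dat".format(prefix, stamp))
```

Note that this name is only unique to the nearest second, which is assumed to be sufficient because a scan takes longer than that.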
Monoid¶
Mathematically, a monoid is a collection with the following properties:
- There exists an operator ⊙ such that, for any two elements x and y in the collection, x ⊙ y is also an element of the collection.
- The operator is associative: a ⊙ (b ⊙ c) = (a ⊙ b) ⊙ c
- There exists a zero element Z such that a ⊙ Z = Z ⊙ a = a for every element a
The more intuitive explanation is that a monoid promises us that we can combine many elements together and get back a single element. Many common structures form monoids.
- Count
- 0 is the zero element and addition is the operator
- Lists
- The zero element is the empty list and concatenation is the operator
- Boolean
- False is the zero element and or is the operator
- Product
- 1 is the zero element and multiplication is the operator
- Sum
- 0 is the zero element and addition is the operator
- Unit Monoid
- The collection with only a single element is a monoid. The zero value is that element and the operator just returns its first value. For example, the set {🌲} is a monoid with zero element 🌲 and a combining operator 🌲 ⊙ 🌲 = 🌲.
- Minimum
- ∞ is the zero element and the ⊙ operator simply returns the smallest of its operands
- A pair of monoids (m, n)
- The zero element is (0ₘ, 0ₙ) and our ⊙ operator is defined so that (xₘ, xₙ) ⊙ (yₘ, yₙ) = (xₘ ⊙ yₘ, xₙ ⊙ yₙ)
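Several of the monoids listed above can be written down directly as a zero element and an operator. The sketch below uses plain Python values and functools.reduce; it is not how the Scans.Monoid classes are implemented, and the names MONOIDS and combine are hypothetical.

```python
from functools import reduce
import operator

# Each monoid is written as (zero element, combining operator).
MONOIDS = {
    "sum": (0, operator.add),
    "product": (1, operator.mul),
    "boolean": (False, operator.or_),
    "list": ([], operator.add),
    "minimum": (float("inf"), min),
}


def combine(name, values):
    """Fold any number of values into one element of the named monoid."""
    zero, op = MONOIDS[name]
    # Starting from the zero element means an empty input is still valid.
    return reduce(op, values, zero)
```

For example, combine("minimum", [5, 2, 9]) gives 2, and combine("sum", []) gives the zero element 0.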
The ability of a pair of monoids to form another monoid allows for the development of surprisingly deep structures. For example, since the Sum and Count are both monoids, then the combination (Sum, Count) is also a monoid. We know that dividing the sum by the count will give us the average. What the monoid convention provides, however, is a way to combine two averages to correctly get the new average. If I know that one set has an average of 6 and the other has an average of 4, I don’t know what the average of the combined sets should be. On the other hand, if I know that one set has a sum and count of (60, 10) and the other has (160, 40), I know that the combined set has a sum and count of (220, 50) and the total average is 4.4. In a similar fashion, it is also possible to express the standard deviation as a monoid, allowing for a standard deviation that can be live updated as each data point arrives.
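The averaging example can be checked with a few lines of Python. This is a sketch with plain tuples standing in for the (Sum, Count) pair; the names combine, ZERO and average are hypothetical and are not the Scans.Monoid API.

```python
def combine(a, b):
    """Combine two (sum, count) pairs element by element."""
    return (a[0] + b[0], a[1] + b[1])


ZERO = (0.0, 0)  # the zero element of the (Sum, Count) pair


def average(pair):
    """Collapse a (sum, count) pair into its average."""
    total, count = pair
    return total / count


first = (60.0, 10)    # average of 6
second = (160.0, 40)  # average of 4
both = combine(first, second)
```

Here both is (220.0, 50) and average(both) is 4.4, whereas naively averaging the two averages would give 5.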
Uncertainties¶
Although monoids do not natively contain a notion of uncertainty [1], the monoids used in this project could allow for the calculation of uncertainty. The design decision was that adding that uncertainty calculation into the monoid provided enough utility and simplified the value enough to warrant its inclusion, despite the mathematical issues. We may re-examine this issue in the future.
[1] Returning to the Unit monoid example, there is no obvious implementation of uncertainty for {🌲}.
Monoid Examples¶
Most of our monoids can be created fairly simply
>>> from Scans.Monoid import *
>>> s = Sum(2.0)
>>> x = Average(1.0)
>>> p = Polarisation(ups=100.0, downs=0.0)
>>> lst = MonoidList([p, x, s])
The first rule of monoids is that we can always add two values together
>>> s + 3
Sum(5.0)
>>> x + Average(5, count=2)
Average(6.0, count=3)
>>> p + Polarisation(ups=100, downs=400)
Polarisation(200.0, 400.0)
>>> lst + [300, 3, Sum(1)]
MonoidList([Polarisation(400.0, 0.0), Average(4.0, count=2), Sum(3.0)])
The second rule of monoids is that adding zero to something always returns the original value. This overrides other behaviours.
>>> s + 0
Sum(2.0)
>>> x + 0
Average(1.0, count=1)
>>> x + Average(0)
Average(1.0, count=2)
>>> sum([x, x, 0, 0, 0, 8, Average(0), Average(0)])
Average(10.0, count=5)
>>> p + 0
Polarisation(100.0, 0.0)
>>> lst + 0
MonoidList([Polarisation(100.0, 0.0), Average(1.0, count=1), Sum(2.0)])
Where appropriate, monoids can be cast into a float
>>> float(s)
2.0
>>> float(x)
1.0
>>> float(p)
1.0
Similarly, casting to a string is also available
>>> str(s)
'2.0'
>>> str(x)
'1.0'
>>> str(p)
'1.0'
>>> str(lst)
'[1.0, 1.0, 2.0]'
Every element has an associated uncertainty
>>> s.err()
1.4142135623730951
>>> lst.err()
[0.1414213562373095, 1.0, 1.4142135623730951]
>>> Polarisation(8.0, 8.0).err()
0.25
The MonoidList has some extra list-related functionality. It can be iterated, like a normal list.
>>> lst += [0, -3, 8]
>>> for l in lst:
... print(l)
1.0
-1.0
10.0
You can also find the minimum and maximum value
>>> lst.min()
Average(-2.0, count=2)
>>> lst.max()
Sum(10.0)
An example of a less intuitive but highly relevant monoid is the standard deviation.
>>> std = StdDev(3.0)
>>> float(std + StdDev(3.0))
0.0
>>> float(std + StdDev(4.0))
0.5
>>> float(sum(map(StdDev, [2, 4, 4, 4, 5, 5, 7, 9]), StdDev.zero()))
2.0
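To see why a standard deviation can form a monoid, it helps to carry a (count, sum, sum of squares) triple and only compute the deviation at the end. The sketch below is an assumption about one possible representation, not the actual internals of StdDev, and all of its names are hypothetical. It computes the population standard deviation, which is what the values in the examples above correspond to.

```python
import math


def stddev_element(x):
    """Lift one data point into a (count, sum, sum of squares) triple."""
    return (1, float(x), float(x) ** 2)


STDDEV_ZERO = (0, 0.0, 0.0)  # combining with this changes nothing


def stddev_combine(a, b):
    """Combine two triples by adding them component by component."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def stddev_value(triple):
    """Population standard deviation encoded by a non-empty triple."""
    count, total, squares = triple
    mean = total / count
    # max() guards against a tiny negative value caused by rounding.
    return math.sqrt(max(squares / count - mean ** 2, 0.0))


def stddev_of(points):
    """Fold the points one at a time, as they would arrive in a scan."""
    acc = STDDEV_ZERO
    for x in points:
        acc = stddev_combine(acc, stddev_element(x))
    return stddev_value(acc)
```

Because the triples combine by simple addition, the running value can be updated as each point arrives. The sum-of-squares form can lose precision for large values with a small spread, so a production implementation may prefer a more stable update.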
Models¶
All models for fitting should derive from the Scans.Fit.Fit
class. However, this class is likely too generic for common use, as
it expects the instrument scientist to implement their own fitting
procedures. While this is useful for implementing classes like
Scans.Fit.PolyFit, where we can take advantage of our
knowledge of the model to get an exact fitting procedure, most models
will not need this level of control. For this reason, there is a
subclass Scans.Fit.CurveFit which simplifies this work as
much as possible. Implementing a new model with CurveFit for fitting
requires implementing three functions.
- _model
- This function should take a list of x coordinates as its first parameter. The remaining function parameters should be the parameters of the model. This function should return the value of the model at those x coordinates for the given parameters.
- guess
- This function takes two parameters - the lists of x and y coordinates for the data set. The return value is a list of approximate values for the correct parameters to the _model function. This rough approximation is used as the starting point for the fitting procedure.
- readable
- This function operates on a list of parameter values like the kind returned by guess. It returns a dictionary with each parameter given a human readable name. The purpose is to make it easier for users to understand the results of the fit.
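The three functions described above might look as follows for a straight-line model. This is a sketch only: LinearSketch is a hypothetical class that deliberately does not derive from Scans.Fit.CurveFit, since the base class interface is not shown in this guide, and writing the functions as static methods on plain lists is an assumption. In a real fit the x coordinates would most likely arrive as arrays.

```python
class LinearSketch(object):
    """Hypothetical straight-line model (not a real Scans.Fit class)."""

    @staticmethod
    def _model(xs, gradient, intercept):
        """Value of the model at each x coordinate."""
        return [gradient * x + intercept for x in xs]

    @staticmethod
    def guess(xs, ys):
        """Rough starting parameters: the line through the end points.

        Assumes the first and last x coordinates differ.
        """
        gradient = (ys[-1] - ys[0]) / float(xs[-1] - xs[0])
        intercept = ys[0] - gradient * xs[0]
        return [gradient, intercept]

    @staticmethod
    def readable(fit):
        """Attach human readable names to the fitted parameters."""
        return {"gradient": fit[0], "intercept": fit[1]}


xs = [0, 1, 2, 3]
ys = LinearSketch._model(xs, 2.0, 1.0)
start = LinearSketch.guess(xs, ys)
```

The order of the values returned by guess matches the order of the parameters after xs in _model, and readable relies on that same order.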
As of the current version, there is a nasty bug with CurveFit. Specifically, CurveFit relies on scipy.optimize, which loads the Intel Math Kernel Library. This library adds an operating system hook that crashes when the user presses Ctrl-C. Since the hook is at a much lower level than Python, there is nothing that can be done at the Python level to handle the issue. The result is that, while the fitting functions run properly, the Python session will be permanently tainted so that Ctrl-C will now crash Python. The system environment variable FOR_DISABLE_CONSOLE_CTRL_HANDLER is the official way of bypassing this issue, but I have not had luck with getting this to work within the genie-python environment.
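For reference, the usual way of setting this variable from inside Python is sketched below. The value "1" is an assumption, and, as noted above, this approach has not been confirmed to work within genie-python. One possible reason is that the variable only takes effect if it is set before the Intel runtime is first loaded into the process.

```python
import os

# Must run before the first import of scipy (or of anything that
# imports it) in this process. The value "1" is an assumption.
os.environ["FOR_DISABLE_CONSOLE_CTRL_HANDLER"] = "1"

# import scipy.optimize  # only import after the variable has been set
```

If scipy has already been imported by the time this code runs, setting the variable system-wide before Python starts is the alternative to try.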