
PM & alarms  c/o Niall

I also provided an alarms example SDDS file to Alessandro last week. He will at some point verify that this format is acceptable for the analysis.

Here are the column descriptions...
& column name=timespec.seconds, type=long, &end
& column name=timespec.nanoseconds, type=long, &end
& column name=alarmid, type=string, &end
& column name=status, type=character, &end
& column name=System, type=string, &end
& column name=Identifier, type=string, &end
& column name=Description, type=string, &end

The System, Identifier and Description columns are what normally appears on the
alarm console; the alarmid is internal but useful to us at the moment. The
status is either A(ctivated) or T(erminated).
For time, I used a format similar to the POSIX C timespec struct to hold
the UTC seconds since 1970 and the nanoseconds from the top of the last
second. However, I am not really sure how time should best be presented in
SDDS format, as it is not a first-class type like int/long/char.
The Identifier (the LHC entity) is easier, obviously, being a string.
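As a sanity check of the timespec-style split, here is a minimal sketch (the record layout and function names are hypothetical, following the SDDS columns above) that converts a UTC timestamp into the seconds/nanoseconds pair and formats one alarm event as a data row:

```python
# Sketch only: field layout follows the SDDS columns above
# (timespec.seconds, timespec.nanoseconds, alarmid, status,
#  System, Identifier, Description); names are hypothetical.

def to_timespec(utc_timestamp):
    """Split a UTC timestamp (float seconds since 1970) into whole
    seconds and nanoseconds from the top of the last second."""
    seconds = int(utc_timestamp)
    nanoseconds = int(round((utc_timestamp - seconds) * 1e9))
    return seconds, nanoseconds

def format_alarm_row(utc_timestamp, alarmid, status,
                     system, identifier, description):
    """Format one alarm event as a whitespace-separated data row.
    status is 'A' (Activated) or 'T' (Terminated)."""
    sec, nsec = to_timespec(utc_timestamp)
    return f"{sec} {nsec} {alarmid} {status} {system} {identifier} {description}"

row = format_alarm_row(1100000000.25, "LASER-1234", "A",
                       "CRYO", "LHC.SECTOR45", "Temperature high")
```

This keeps time as two plain long columns, sidestepping the lack of a first-class time type in SDDS.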


I think the most beneficial solution (on a postmortem event occurring) is to provide the current list of active alarms, plus subsequent events over some duration, as a file. The format of this file could be a known standard format based on a known standard meta-format such as SDDS, XML or CSV.
The advantages for all of us are:
* For alarms, we provide associated alarms once only :)
* For the postmortem system, once captured, all associated data for a
particular postmortem event is permanently available for everyone and
their tools.
* Data can be kept/moved/copied/modified/deleted together as a single set.
* For both, there would be no requirement for, or dependence on, "live"
services (logging, alarms) being available every time an analysis is
done. Longer term, the cyclic activities on both sides of a service
interface -- namely specification, provision, deployment and
maintenance -- would be avoided.

In the AB/CO Review, it was mentioned that PM data could be given to others to analyse, such as PhD students. This would be difficult if they had to connect (to CERN) to retrieve the alarms from the alarm archive. With a self-contained file, one could also just "zip up" the PM event data and send it to people with a specific interest or expertise.

File/alarms size estimates. Based on:
* Our current average input of 170'000 alarm events per day
* Assuming that _all_ alarms are required
* Time duration of five minutes before and after a PM event is required
...suggests a file containing around 7'000-10'000 alarms. This could be
further reduced if the postmortem trigger also communicated which accelerator, system (or set of systems) or location the event was focused on. "Focused" or "localised" postmortem events would also allow individual PM processes to be tested without disrupting others.
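The "focused" reduction described above could be as simple as filtering the captured alarm list by time window and by the systems named in the trigger. A rough sketch (field names are hypothetical, following the SDDS columns above; the five-minutes-either-side window is the 300-second default):

```python
# Sketch: select alarm events within a window around a PM event,
# optionally restricted to the systems the PM trigger identifies.
# Field names ("seconds", "System") are hypothetical, following
# the SDDS columns above.

def select_alarms(alarms, pm_time, window_s=300, systems=None):
    """Keep alarms within +/- window_s seconds of pm_time;
    if systems is given, keep only alarms from those systems."""
    selected = []
    for a in alarms:
        if abs(a["seconds"] - pm_time) > window_s:
            continue
        if systems is not None and a["System"] not in systems:
            continue
        selected.append(a)
    return selected

alarms = [
    {"seconds": 1000, "System": "CRYO"},
    {"seconds": 1100, "System": "VAC"},
    {"seconds": 2000, "System": "CRYO"},
]
subset = select_alarms(alarms, pm_time=1050, window_s=300,
                       systems={"CRYO"})
```

Run once at capture time, a filter like this would shrink the file while keeping it self-contained for later analysis.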