- CPV
- EMC
- [QC on FLP - plot integrated over the run] Raw Data Error
- [QC on FLP - plot integrated over the run] Payload Size/events
- [QC on FLP - plot integrated over the run] Bunch minimum amplitude EMCAL+DCAL
- [QC on EPN - plot integrated over the run] Cell Occupancy plots (PHYS) for E>0.2 GeV and E<0.2 GeV and Cell Occupancy plots (CALIB) for E>0.5 GeV
- [QC on EPN - plot integrated over the run] Cell Amplitude
- [QC on EPN - plot integrated over the run] Cell Time
- Number of triggers per Time Frame
- Known issues
- FDD
- FT0
- FV0
- HMP
- ITS
- General considerations
- Quality summary
- Error count vs Error id
- Trigger count vs TriggerID and FeeID
- Lane Status Flag: Not OK (NOK)
- Number of QC cycles without data from stave
- Lane Status Global
- ITS Misconfiguration plot
- Cluster Occupancy overview
- Angular Distribution
- Number of clusters per track
- Known issues
- ITS KNOWN ISSUES
- MCH
- MFT
- Expendable MFT tasks
- Quality summary
- Chips in Error/Fault/Warning
- Digit Occupancy Summary
- Cluster Occupancy Summary
- Real-time Cluster Occupancy Summary (last ~2 mins window)
- Track phi distribution, track eta distribution
- Tracks X-Y distribution
- Distribution of the #clusters per ROF
- Distribution of the #tracks per ROF
- Known issues
- MID
- PHS
- TOF
- TPC
- TRD
- ZDC
- VTX
CPV#
Physics PP runs#
Monitor if plots are not empty and timestamps are updated.
If plots are empty or not updated please inform the on-call.
If you see "Number of entries has not changed in the past cycle" while the run is still ongoing, the most probable reason is that PHS has become busy. Check with the ECS shifter and inform the on-call if the expert has not been informed yet. If PHS is not busy but you still see this message, inform the on-call.
If you see other red messages please check known issues. If issue is not known please inform the on-call.
[QC on QC merger node] Global CPV quality#
Please check known issues
Good plot:
The plot summarizes CPV quality. If the global quality is not good, check the details below.
- Number of digits increases: assures that digits are being produced. If bad, follow the procedure above for the message "Number of entries has not changed in the past cycle".
- Number of clusters increases: assures that clusters are being produced. If bad, follow the procedure above for the message "Number of entries has not changed in the past cycle".
- Digit occupancy check: assures that digit occupancy is good. If bad, check the messages on the Digit Map in M2, 3, 4 plots and follow the instructions.
- Cluster size check: assures that the mean cluster size is within the allowed limits. If bad, inform the on-call. If medium, put a log entry and inform the on-call during the morning and afternoon shifts. No need to call during the night.
- CalibDigit amplitude check: assures that the observed amplitude spectra are good. If bad, inform the on-call. Medium means lack of statistics: check again later.
- Errors presence check: assures that the number of errors that occurred is within limits. If not good, inform the on-call. If medium, put a log entry and inform the on-call during the morning and afternoon shifts. No need to call at night if there are no other issues.
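The per-check rules above can be sketched as a small lookup. This is only an illustration of the decision logic: the function name, the constants and the returned strings are hypothetical and are not part of the QC framework. Combinations that the instructions do not describe are returned as "not covered".

```python
# Illustrative only: all names here are hypothetical, not part of the QC framework.
ENTRY_CHECKS = {"Number of digits increases", "Number of clusters increases"}
LOG_AND_DAYTIME_CALL = {"Cluster size check", "Errors presence check"}


def cpv_check_action(check: str, quality: str, night_shift: bool = False) -> str:
    """Return the shifter action for one line of the Global CPV quality plot."""
    quality = quality.lower()
    if quality == "good":
        return "no action"
    if quality == "bad":
        if check in ENTRY_CHECKS:
            return "follow the 'Number of entries has not changed' procedure"
        if check == "Digit occupancy check":
            return "check the messages on the Digit Map plots"
        return "inform the on-call"
    if quality == "medium":
        if check in LOG_AND_DAYTIME_CALL:
            # Medium never triggers a night call for these two checks.
            return "log entry only" if night_shift else "log entry and inform the on-call"
        if check == "CalibDigit amplitude check":
            return "lack of statistics, check again later"
    return "not covered by these instructions"
```

For example, `cpv_check_action("Cluster size check", "medium", night_shift=True)` yields a log entry without a call, in line with the rule that medium quality does not justify calling at night.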
[QC on QC merger node] Global quality trend#
Good plot:
The trend shows the evolution of the global CPV quality. It can become not good for a small portion of cycles due to fluctuations. In this case the run is to be considered good.
If the trend is not good for a significant part of the run, then the run quality must be set to bad. Put the bad flag even if the issue is known. The expert will have a look and adjust the quality later if needed.
[QC on EPN] Error occurance#
Good plot:
If plot is not good then please inform the oncall. If medium then put a log entry and inform the oncall during morning and afternoon shift. No need to call at night if there are no other issues. No need to put bad flag for run if there are no other issues.
[QC on EPN] Digit Map in M2, 3, 4#
Please check known issues
Good plots: The plots represent number of digits seen in each channel. It should be more or less uniform.
If the red messages report Hot 3G Cards (N1/N2) or Cold 3G Cards (N1/N2), put a log entry and ask the on-call to check the plots.
Bad example: HV Trip in module M2
Bad example: Number of entries has not changed in the past cycle
Actions to be taken by QC shifters#
- Minimal duration after SOR before taking any action required by these instructions: 5 min at 500 kHz.
- Inform on-call when quality is bad and it is not a known issue.
Known issues#
- In PEDESTAL runs the QC plot "Pedestal sigma distribution M3" is bad. According to the expert, noise conditions in CPV depend on the general environment in ALICE; therefore, from time to time CPV pedestals can become wider, which is reflected in the QC-CPV calibration run plot with the message "Number of bad pedestal sigmas in module M3 (sometimes in M2) is larger than the upper limit". This is a known issue and can be ignored, but make a log entry when it happens. See: https://ali-bookkeeping.cern.ch/?page=log-detail&id=45015
- In PHYSICS runs there are red messages on the QC plot "Digit Map in M4". A problem with the high voltage is preventing this module from running normally. Experts are trying to recover it. No need to report. The run quality must be set to "good" if there are no problems with other modules. See: https://ali-bookkeeping.cern.ch/?page=log-detail&id=53523. The quality aggregator is also expected to show bad quality for Digit occupancy check and CalibDigit amplitude check.
EMC#
[QC on FLP - plot integrated over the run] Raw Data Error#
Green: good quality | Red: bad quality |
---|---|
- If there are no entries, a green message "No Error: OK" informs you that everything is working properly.
- In case of errors, a red message will appear: call EMCAL oncall if EMCAL is included in global runs. Take note of the error-type in the y axis.
[QC on FLP - plot integrated over the run] Payload Size/events#
good quality | bad quality | empty |
---|---|---|
- If "Data OK" is shown, everything is fine.
- If some of the DDLs show entries that are larger than the others, a red message will appear: please call EMCAL oncall.
- If the plot is empty, and EMCAL is included in the data taking, call EMCAL oncall.
[QC on FLP - plot integrated over the run] Bunch minimum amplitude EMCAL+DCAL#
good quality | bad quality |
---|---|
- One peak should be visible around 0 for all supermodules if EMCAL is in data taking.
- If the peak moves to the range 20-50 "Min raw amplitude (ADC)" for any supermodule, or a second peak appears in that range for any supermodule, call EMCAL oncall.
- If the plot is empty, and EMCAL is included in the data taking, call EMCAL oncall.
[QC on EPN - plot integrated over the run] Cell Occupancy plots (PHYS) for E>0.2 GeV and E<0.2 GeV and Cell Occupancy plots (CALIB) for E>0.5 GeV#
good quality (high-E) | good quality (high-E) | bad quality |
---|---|---|
In case of good data the peak is at 0; if the plot is peaked at different values call EMC oncall
Occupancy plots: Acceptance losses#
Missing acceptance (low-E) | Missing acceptance (high-E) |
---|---|
If the counts in a certain detector segment stop increasing, the corresponding segment has stopped sending data. The affected area must be visible in both plots. In this case call EMCAL on-call.
[QC on EPN - plot integrated over the run] Cell Amplitude#
- no instructions for the moment.
- If the plot is empty, and EMCAL is included in the data taking, call EMCAL oncall.
[QC on EPN - plot integrated over the run] Cell Time#
good quality | bad quality |
---|---|
- We expect a Gaussian peak roughly centred at 0 and a structure of single lines with a spacing of 100 ns from noisy channels. Depending on the filling scheme, smaller Gaussian-like peaks from pileup could appear. In such cases the data is OK.
- If a second peak appears which is either of the same magnitude as the main peak or of about 1/3 of its magnitude, and the peaks are separated by 100 ns, call the EMCAL on-call.
- If the main peak deviates from 0 by more than 100 ns, call the EMCAL on-call.
- If no Gaussian-like peak is present, call the EMCAL on-call.
- If the plot is empty, and EMCAL is included in the data taking, call EMCAL on-call.
Number of triggers per Time Frame#
good quality | bad quality |
---|---|
Trigger rate approx. constant with minor fluctuations | Trigger rate dropping by several events / timeframe or to almost 0 |
The number of physics triggers per Time Frame must stay constant while the run is ongoing. If the number of triggers decreases, call the EMCAL on-call. The expected value depends on the interaction rate and a checker will be implemented soon; for the moment check the "EMC KNOWN ISSUES" for the expected value under current conditions.
Known issues#
- Do not call EMC oncall during the night for QC plots related errors for non PHYSICS runs, unless a PHYSICS run is expected next.
- For the plot "Number of Physics triggers per timeframe" the expected value is currently approx 20. If value is below 15 for at least 2 consecutive timestamps, call on-call.
- Payload Size/Event can be ignored in all runs till further notice.
- In plot "Raw data errors" (error rate) quality is bad if the rate is above 100 errors / minute for several minutes, and good if it is below.
- An empty region is expected near the bottom-right corner for CAL (Calibration) triggers.
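The rule for "Number of Physics triggers per timeframe" (call if the value is below 15 for at least 2 consecutive timestamps) can be sketched as follows. The function name and the input format (one value per timestamp, in time order) are assumptions made for illustration; the limits are the ones quoted in the known issues and may change with running conditions.

```python
from typing import Iterable


def should_call_emcal(triggers_per_tf: Iterable[float],
                      low_limit: float = 15.0,
                      consecutive: int = 2) -> bool:
    """True if the value stays below low_limit for `consecutive` timestamps in a row."""
    run_length = 0
    for value in triggers_per_tf:
        # Count the current streak of low values; any normal value resets it.
        run_length = run_length + 1 if value < low_limit else 0
        if run_length >= consecutive:
            return True
    return False
```

A single low point followed by a recovery does not trigger a call; only an uninterrupted streak does.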
FDD#
FDD: General#
All QC plots are generated in 5 min. cycles. The histogram contents are reset after each cycle. Please wait at least 5 min. from the start of the run before judging the quality of the collected data.
If a non-critical (i.e. expendable) QC task fails, please add a bookkeeping entry about it.
FDD: Quality summary#
The left panel summarizes all FDD checks for the last QC cycle of the selected run. The top part shows a global aggregated quality with a text message suggesting actions for the QC shifter:
- Quality: Bad: Inform the FIT on-call immediately
- Quality: Medium: Follow individual check instructions
- Quality: Good: All checks are OK
- Quality: NULL: Some histograms are empty, inform the FIT on-call immediately
The bottom part of the summary shows individual check results and reasons for qualities Bad, Medium and NULL.
The right panel shows a time trend of the global quality for the selected run.
FDD: Out of bunch collisions#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
BC vs trigger correlation for the events which were detected but are not aligned (out-of-bunch) with LHC filling scheme. The number of out-of-bunch events depends on the trigger settings. The check is performed for the Vertex trigger.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
FDD: Fraction of events with CFD in ADC gate#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
Fraction of events with CFD in ADC gate in each detector channel. Horizontal lines show the levels below which the warnings/errors are raised.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
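The "levels below which the warnings/errors are raised" logic used by the FIT per-channel plots can be illustrated with a short sketch. The function name and the example levels are hypothetical; the real levels are the horizontal lines drawn on each plot.

```python
from typing import Sequence

# Actions as listed in the instructions above.
ACTIONS = {
    "GOOD": "no action",
    "WARNING": "add bookkeeping entry (once per run)",
    "ERROR": "call FIT on-call",
}


def fit_channel_status(fractions: Sequence[float],
                       warning_level: float,
                       error_level: float) -> str:
    """Classify a per-channel fraction plot; lower values are worse."""
    status = "GOOD"
    for fraction in fractions:
        if fraction < error_level:
            return "ERROR"          # one channel below the error line is enough
        if fraction < warning_level:
            status = "WARNING"
    return status
```

With hypothetical levels of 0.9 (warning) and 0.8 (error), `ACTIONS[fit_channel_status([0.99, 0.85], 0.9, 0.8)]` gives the bookkeeping-entry action.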
FDD: Fraction of events with the CFD in time gate#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
Fraction of events with CFD in time gate in each detector channel. Horizontal lines show the levels below which the warnings/errors are raised.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
FDD: Fraction of charge in ADC range#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
Fraction of charge in ADC range in each detector channel. Horizontal lines show the levels below which the warnings/errors are raised.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
FDD: Validation of hardware (HW) triggers in software (SW)#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
Fraction of only software or hardware (SW + HW) triggers. In the ideal case either both the HW and SW triggers or neither of them should be present in a given event. Horizontal lines show the levels below which the warnings/errors are raised.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
Known issues#
- The Out of bunch collisions plot can be ignored in SYNTHETIC runs at the moment (i.e. no need to take any actions in case of not GOOD quality). The BC distribution is not simulated properly in MC and will always cause problems in this plot.
FT0#
FT0: General#
All QC plots are generated in 5 min. cycles. The histogram contents are reset after each cycle. Please wait at least 10 min. from the start of the run before judging the quality of the collected data.
If a non-critical (i.e. expendable) QC task fails, please add a bookkeeping entry about it.
FT0: Quality summary#
The left panel summarizes all FT0 checks for the last QC cycle of the selected run. The top part shows a global aggregated quality with a text message suggesting actions for the QC shifter:
- Quality: Bad: Inform the FIT on-call immediately
- Quality: Medium: Follow individual check instructions
- Quality: Good: All checks are OK
- Quality: NULL: Some histograms are empty, inform the FIT on-call immediately
The bottom part of the summary shows individual check results and reasons for qualities Bad, Medium and NULL.
The right panel shows a time trend of the global quality for the selected run.
FT0: Out of bunch collisions#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
BC vs trigger correlation for the events which were detected but are not aligned (out-of-bunch) with LHC filling scheme. The number of out-of-bunch events depends on the trigger settings. The check is performed for the Vertex trigger.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
FT0: Fraction of events with CFD in ADC gate#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
Fraction of events with CFD in ADC gate in each detector channel. Horizontal lines show the levels below which the warnings/errors are raised.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
FT0: Fraction of events with the CFD in time gate#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
Fraction of events with CFD in time gate in each detector channel. Horizontal lines show the levels below which the warnings/errors are raised.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
FT0: Fraction of channels out of colliding BCs#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
Fraction of channels fired out of colliding BCs. Horizontal lines show the levels below which the warnings/errors are raised.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
FT0: Validation of hardware (HW) triggers in software (SW)#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
Fraction of only software or hardware (SW + HW) triggers. In the ideal case either both the HW and SW triggers or neither of them should be present in a given event. Horizontal lines show the levels below which the warnings/errors are raised.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
Known issues#
- channels 60-63 (FT0A side) are OFF in PHYSICS and COSMICS runs
- channels 139 and 176-179 (FT0C side) are OFF in PHYSICS and COSMICS runs
FV0#
FV0: General#
All QC plots are generated in 5 min. cycles. The histogram contents are reset after each cycle. Please wait at least 10 min. from the start of the run before judging the quality of the collected data.
If a non-critical (i.e. expendable) QC task fails, please add a bookkeeping entry about it.
FV0: Quality summary#
The left panel summarizes all FV0 checks for the last QC cycle of the selected run. The top part shows a global aggregated quality with a text message suggesting actions for the QC shifter:
- Quality: Bad: Inform the FIT on-call immediately
- Quality: Medium: Follow individual check instructions
- Quality: Good: All checks are OK
- Quality: NULL: Some histograms are empty, inform the FIT on-call immediately
The bottom part of the summary shows individual check results and reasons for qualities Bad, Medium and NULL.
The right panel shows a time trend of the global quality for the selected run.
FV0: Out of bunch collisions#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
BC vs trigger correlation for the events which were detected but are not aligned (out-of-bunch) with LHC filling scheme. The number of out-of-bunch events depends on the trigger settings. The check is performed for the TrgNchan trigger.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
FV0: Fraction of events with CFD in ADC gate#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
Fraction of events with CFD in ADC gate in each detector channel. Horizontal lines show the levels below which the warnings/errors are raised.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
FV0: Fraction of events with the CFD in time gate#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
Fraction of events with CFD in time gate in each detector channel. Horizontal lines show the levels below which the warnings/errors are raised.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
FV0: Validation of hardware (HW) triggers in software (SW)#
[QC on EPN/QC nodes - plot integrated over the QC cycle]
Fraction of only software or hardware (SW + HW) triggers. In the ideal case either both the HW and SW triggers or neither of them should be present in a given event. Horizontal lines show the levels below which the warnings/errors are raised.
Actions:
- If WARNING - Add bookkeeping entry (once per run).
- If ERROR - call FIT-on-call.
Known issues#
HMP#
Busy time#
[QC on FLP]
The plot shows the busy time for each detector DDL. If more than three equipments exceed 120 microseconds, or the plot is empty, call the HMP on-call.
Event size#
[QC on FLP]
The plot shows the event size for each detector DDL. If more than three equipments exceed 13 kB, or the plot is empty, call the HMP on-call.
Sum Q maps#
[QC on FLP]
The plot shows the charge of all the detector channels in only one 2D map. It looks like the example plot shown above. In case it is completely empty call HMP on-call.
Charge vs HV sector#
[QC on FLP]
The plot shows the charge of all the detector HV sectors in a single 2D map. It looks like the example plot shown above. The white bands correspond to the faulty HV sectors that are off. If more than two additional bands (with respect to those shown here) become white, call the HMP on-call.
Occupancy#
[QC on FLP]
The plot shows the occupancy for each detector DDL. If more than three equipments exceed 3%, or the plot is empty, call the HMP on-call.
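The Busy time, Event size and Occupancy plots share the same rule: call the on-call if more than three equipments exceed the limit or if the plot is empty. A minimal sketch of that rule, with hypothetical function and key names and the limits quoted above:

```python
from typing import Sequence

# Limits from the instructions; the dictionary keys are made up for this sketch.
HMP_LIMITS = {
    "busy_time_us": 120.0,      # microseconds
    "event_size_kB": 13.0,      # kB
    "occupancy_percent": 3.0,   # %
}


def call_hmp_on_call(plot: str, values_per_equipment: Sequence[float]) -> bool:
    """True if the plot is empty or more than three equipments exceed the limit."""
    if len(values_per_equipment) == 0:
        return True
    limit = HMP_LIMITS[plot]
    n_over = sum(1 for value in values_per_equipment if value > limit)
    return n_over > 3
```

Exactly three equipments above the limit does not trigger a call; the fourth one does.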
Known issues#
For the moment links 3 and 12 are excluded from data taking.
ITS#
General considerations#
If any of the plots listed on this page remains empty during a run, please call the ITS on-call. The only exception, i.e., where empty means good quality, is for the four plots on the lane status.
Quality summary#
This figure shows the time trend of aggregated ITS quality across all ITS QC plots, along with a separate quality trend for the "Fraction of QC Cycles with data" plot:
Please create a logbook entry and contact the ITS on-call if:
- Any of these plots turns BAD, even if the quality flag later returns to GOOD.
- Any of these plots remains Medium for longer than 10 minutes.
Additionally, the layout includes a text summary of all ITS QC checks for the latest QC cycle, as shown in the text box:
The top line provides the aggregated quality status along with a text message suggesting actions for the QC shifter.
- Quality: BAD: contact ITS on-call expert
- Quality: Medium: Contact the ITS on-call expert if the quality remains in this state for longer than 10 minutes.
- Quality: NULL: The plots are empty. Check in DCS if ITS is in STANDBY. If not, inform the ITS on-call expert.
In cases of BAD or Medium quality, this canvas will display the error message from the QC plot that triggered the issue, in the format "Flag: Unknown: ERROR MESSAGE."
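The escalation rule for the ITS quality trend (any BAD point, or Medium lasting longer than 10 minutes) can be sketched like this. The function name and the input format, a time-ordered list of `(minutes, quality)` pairs, are assumptions for illustration. NULL quality is handled by the separate DCS check described above and is not escalated by this sketch.

```python
from typing import Iterable, Optional, Tuple


def its_needs_on_call(samples: Iterable[Tuple[float, str]],
                      medium_limit_min: float = 10.0) -> bool:
    """samples: time-ordered (minutes since start of run, quality) pairs."""
    medium_since: Optional[float] = None
    for minutes, quality in samples:
        quality = quality.upper()
        if quality == "BAD":
            return True  # BAD counts even if the flag later returns to GOOD
        if quality == "MEDIUM":
            if medium_since is None:
                medium_since = minutes
            if minutes - medium_since > medium_limit_min:
                return True
        else:
            medium_since = None  # any other state ends the Medium streak
    return False
```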
Error count vs Error id#
This plot provides information about issues that occurred during the decoding of ITS data. The QC distinguishes 24 possible problems with the data, filling the Error ID into the Y-bin and the FEE ID (from where the problematic data originated) into the X-bin. A few important considerations for this plot:
- Errors with IDs 19 and 24 (last bin) are for expert purposes and do not indicate problems with the data. Therefore, they never trigger a bad quality flag in QC checks.
- In the case of corrupted data from ITS, new bins will be filled. Please note that BAD data quality is flagged by QC checks only when ITS encounters decoding issues during the most recent QC cycle.
Please call ITS on-call if you see new entries in this plot or in case of BAD Quality message.
Trigger count vs TriggerID and FeeID#
This plot summarizes the trigger flags from the last QC cycle. The X-axis corresponds to the Front-End Electronics (FEE) ID, and the Y-axis shows the list of all possible triggers that can be received by the FEE. There are a few possible scenarios:
- Left plot (GOOD): Each FEE receives PHYS, HB, ORBIT, SOC, and TF trigger signals. Color variation between different parts of the detector, coming from different half-layers, is possible.
- Middle plot (BAD): Corrupted data from ITS, where unexpected triggers (not PHYS, HB, ORBIT, SOC, or TF) can be observed.
- Right plot (BAD): One of the ITS FEEs stopped sending data, resulting in no triggers being filled for that FEE ID.
In both cases of bad quality call ITS on-call
Lane Status Flag: Not OK (NOK)#
This plot indicates the fraction of lanes (color scale) in a "Not OK" status for each ITS stave, represented by a triangle. A blue triangle corresponds to a stave without any problematic lanes, while staves with errors will be represented by different colors. The GOOD run quality is shown in the left plot, while other quality levels (right plot) may show the following messages:
- Quality: MEDIUM: Middle Layer (ML) or Outer Layer (OL) have staves in ERROR.
- Quality: MEDIUM: The Inner Barrel has a stave with more than two chips in ERROR/FAULT/WARNING.
- Quality: BAD: Layers 0–6 have more than 25% of staves with lanes/chips in ERROR/FAULT/WARNING.
If the BAD quality message is printed, call the ITS on-call.
Number of QC cycles without data from stave#
This plot indicates the fraction of consecutive QC cycles without data (out of 3) for each ITS stave, represented by a triangle. A blue triangle corresponds to a stave without any problematic lanes, while staves with readout issues are represented by different colors. The GOOD run quality is shown in the left plot, while other quality levels (right plot) may show the following messages:
- Quality: GOOD:
- Quality: BAD: Stave LX_XX is empty
Please, call the ITS on-call in case of BAD quality
Lane Status Global#
[QC on FLP]
This plot shows the fraction of lanes in ERROR, FAULT, and WARNING statuses. The TOTAL bin gives the total fraction of lanes in any not OK status. The BAD quality will be triggered when the bin value exceeds the 10% threshold.
The following Quality messages can appear:
- Quality::GOOD
- Quality::BAD: >10% of the lanes are bad.
In case of BAD quality messages call the ITS on-call.
ITS Misconfiguration plot#
This plot shows the estimated readout rate for each FEE component of the ITS. All FEEs should display the same estimated readout frequency, as shown in the example figure. Please note that color variations along the Z-axis are acceptable and do not indicate any issues.
The following Quality messages can appear:
- Quality::GOOD
- Quality::BAD: MISCONFIGURATION. CALL EXPERTS. This bad quality may also be triggered by corrupted data from ITS. In such cases, you will see random bins filled across the entire plot (e.g., see the right plot).
In case of BAD quality messages call the ITS on-call.
Cluster Occupancy overview#
[QC on EPN]
Overview of the cluster occupancy, i.e., number of clusters per event, for each stave (1 bin in the plot). The left figure gives an example of a GOOD run, while the right one shows a problematic distribution. Check the general trend, i.e. occupancy decreasing when going from the innermost to the outermost layers. This MO can have the following quality messages:
- Quality:: MEDIUM: Layer_Stave has large cluster occupancy
- Quality:: BAD: Layer_Stave has empty stave
Call the ITS on-call in case of anomalies in the plot or if the BAD quality message appears.
Angular Distribution#
[QC on EPN]
Angular distribution of online reconstructed ITS tracks as a function of phi vs. eta (2D plot).
Possible Quality messages that can appear on the plot during a run:
- Quality::GOOD: plot might still be bad! See example above and look at the arrows in the plot.
- BAD: Asymmetric Phi distribution (OK if there are disabled ITS sectors)
- BAD: Asymmetric Eta distribution (OK if there are disabled ITS sectors)
- BAD: NO ITS TRACKS
Please try to reconfigure the detector (set run type again) if the quality is BAD or if the plot shows several holes (blue regions in between the yellow parts, as shown in the plot above); call the ITS on-call if the issue persists. Please call the ITS on-call also if the plot remains empty during the run.
Number of clusters per track#
[QC on EPN]
Distribution of the number of clusters per track. The plot shows a GOOD example of pp collisions run. The following messages can appear:
- Quality::GOOD
- Quality::Medium Mean is outside 5.2-6.2, ignore for COSMICS and TECHNICALS
- Quality::BAD: 0 tracks with 4/5/6/7 clusters (OK if it's synthetic run)
- Quality::BAD: NO ITS TRACKS
Call the ITS on-call in case of BAD quality messages, if a completely different plot is obtained, or if the plot stays empty after 5 min of data taking. In case of MEDIUM status, create log-book entry.
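The Medium condition on the mean number of clusters per track (mean outside 5.2-6.2, ignored for COSMICS and TECHNICAL runs) can be reproduced by hand from the histogram contents. The sketch below assumes the histogram is available as a mapping from number of clusters to number of tracks; the function names and this input format are hypothetical, and the BAD conditions are left out.

```python
from typing import Mapping


def mean_clusters_per_track(hist: Mapping[int, int]) -> float:
    """Mean of a histogram given as {number of clusters: number of tracks}."""
    n_tracks = sum(hist.values())
    if n_tracks == 0:
        raise ValueError("no ITS tracks in the histogram")
    return sum(n_clusters * count for n_clusters, count in hist.items()) / n_tracks


def mean_is_medium(hist: Mapping[int, int], run_type: str,
                   low: float = 5.2, high: float = 6.2) -> bool:
    """True if the mean is outside [low, high]; ignored for COSMICS/TECHNICAL runs."""
    if run_type.upper() in ("COSMICS", "TECHNICAL"):
        return False
    return not (low <= mean_clusters_per_track(hist) <= high)
```

If this returns `True` in a PHYSICS run, the instruction above applies: create a log-book entry.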
Known issues#
ITS KNOWN ISSUES#
MCH#
Quality Summary#
[QC on EPN]
The left panel shows a summary of the automated checks on the MCH data, in a human-readable format. The top line describes the aggregated quality status, followed by a message suggesting the appropriate action according to the quality level:
- Bad: immediately inform the MCH on-call
- Medium: write a logbook entry, tagging MCH
- Null: the plots are completely empty. Check in DCS if MCH is in STANDBY. If not, inform the MCH on-call.
The right panel shows a trending plot of the aggregated quality. The message in the left panel always corresponds to the most recent point in the trending plot.
If the quality in the trend plot is Bad for the whole duration of a run, MCH should be marked as Bad in the Bookkeeping flags for the run.
Quality Plots - Decoding and Pre-clustering#
The following plots show the distribution of various estimators of the MCH data quality. Each horizontal bin shows the value of the monitored quantity, averaged over one Detection Element (DE). The vertical dashed lines show the boundaries between each of the 10 MCH chambers. A horizontal dashed line shows the threshold used by the checker to decide if a given detection element is considered good or bad.
The checker assigns an overall Good (green), Medium (orange) or Bad (red) quality flag to the plot, depending on the number and pattern of bad DEs. In general, the quality is still considered Good if only a few DEs are bad. The quality is set to Medium if several DEs are bad, but no significant impact on the detector acceptance is expected. If the number and pattern of bad DEs is such that the acceptance will be degraded, the quality is set to Bad.
The overall aggregated MCH quality is the logical AND of the qualities of the individual plots.
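One way to read "logical AND" for three quality levels is that the aggregated quality is Good only if every plot is Good, and otherwise equals the worst individual quality. The sketch below implements that reading; it is an assumption about the aggregator's behaviour, not a description of its actual code, and the names are hypothetical.

```python
from typing import Sequence

# Ordering from best to worst; treating Medium as an intermediate level is an assumption.
SEVERITY = {"Good": 0, "Medium": 1, "Bad": 2}


def aggregate_quality(plot_qualities: Sequence[str]) -> str:
    """Return the worst quality among the individual plots.

    Raises ValueError on an empty input, since there is nothing to aggregate.
    """
    return max(plot_qualities, key=SEVERITY.__getitem__)
```

Under this reading a single Bad plot is enough to turn the aggregated MCH quality Bad.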
Fraction of Synchronized Boards#
[QC on EPN - plot from the last QC cycle]
Green plot: good quality | Red plot: bad quality |
---|---|
The plot shows, for each Detection Element, the fraction of FEC boards that are properly synchronized. A given DE is considered bad if the corresponding fraction is below the horizontal dashed line. The overall plot is considered Bad if the number and position of bad DEs is such that it significantly impacts the tracking performance.
Fraction of Boards not in Error#
[QC on EPN - plot from the last QC cycle]
Green plot: good quality | Red plot: bad quality |
---|---|
The plot shows, for each Detection Element, the fraction of FEC boards that do not have decoding errors. A given DE is considered bad if the corresponding fraction is below the horizontal dashed line. The overall plot is considered Bad if the number and position of bad DEs is such that it significantly impacts the tracking performance.
Fraction of Boards with Good Rate#
[QC on EPN - plot from the last QC cycle]
Green plot: good quality | Red plot: bad quality |
---|---|
The plot shows, for each Detection Element, the fraction of FEC boards that have a correct hit rate. A given DE is considered bad if the corresponding fraction is below the horizontal dashed line. The overall plot is considered Bad if the number and position of bad DEs is such that it significantly impacts the tracking performance.
Average Hit Rate#
[QC on EPN - plot from the last QC cycle]
Green plot: good quality |
---|
The plot shows the average hit rate (in kHz) for each detection element. A given DE is considered bad if the corresponding rate is below the horizontal dashed line. The overall plot is considered Bad if the number and position of bad DEs is such that it significantly impacts the tracking performance.
Average Pseudo-efficiency#
[QC on EPN - plot from the last QC cycle]
Green plot: good quality | Red plot: bad quality |
---|---|
The plot shows the average pseudo-efficiency for each detection element. The detection efficiency is estimated from the correlation between the pre-clusters reconstructed in either cathode of each DE. A given DE is considered bad if the corresponding efficiency is below the horizontal dashed line. The overall plot is considered Bad if the number and position of bad DEs is such that it significantly impacts the tracking performance.
Quality Plots - Tracking#
The following plots provide an estimation of the MCH tracking quality using the output of the online event processing.
Track phi#
[QC on EPN - plot integrated over the run]
Green panel: good quality | Red panel: bad quality |
---|---|
The plot shows the track distribution in phi. An automatic checker verifies the uniformity of the distribution.
In case of Bad or Null quality call the MCH oncall.
Track q/pt#
[QC on EPN - plot integrated over the run]
Green panel: good quality | Red panel: bad quality |
---|---|
The plot shows the charge over pt track distribution. An automatic checker verifies that the difference between the positive and negative tracks is below a certain threshold.
In case of Bad or Null quality call the MCH oncall.
Number of clusters per track#
[QC on EPN - plot integrated over the run]
Green panel: good quality | Red panel: bad quality |
---|---|
The plot shows the number of clusters associated to each track. An automatic checker verifies that the average of the distribution is between the limits represented by the vertical dashed lines.
In case of Bad or Null quality call the MCH oncall.
Clusters size per chamber#
[QC on EPN - plot integrated over the run]
Green panel: good quality | Red panel: bad quality |
---|---|
The plot shows the clusters size per chamber. An automatic checker verifies that the cluster size in each chamber is above a threshold represented by the horizontal dashed line.
In case of Bad or Null quality call the MCH oncall.
Number of clusters per chamber#
[QC on EPN - plot integrated over the run]
Green panel: good quality |
---|
The plot shows the number of clusters per chamber. An automatic checker verifies that the number of clusters for each chamber is above a threshold represented by the horizontal dashed line.
In case of Bad or Null quality call the MCH oncall.
Known issues#
- In SYNTHETIC runs, MCH global quality may depend on the replay configuration:
    - in p-p replay (500 kHz): MCH run global quality is expected to be good. If quality is bad or medium, notify the on-call via a logbook entry.
    - in PbPb replay: MCH run global quality is expected to be good. If quality is bad or medium, notify the on-call via a logbook entry.
    - other replay settings: MCH run global quality may oscillate between good and bad. If quality is bad or medium, notify the on-call via a logbook entry.
- In TECHNICAL runs, MCH global quality depends on the DCS state of MCH:
    - if MCH state is READY: MCH global quality should be good. If MCH global quality is bad, a notification via a logbook entry is enough.
    - if MCH state is BEAM_TUNING (BEAM_TU) or STANDBY_CONFIGURED (STDB_CO): MCH global quality is expected to be bad. Only the "Decoding errors" is expected to have a good quality; if this is not the case, a notification via a logbook entry is enough.
- In COSMICS runs, the low number of tracks makes it difficult to compute the efficiency for all detection elements in each cycle. Hence oscillations between good and bad status should be expected.
    - if the Preclusters quality is Bad: Mean Efficiency vs DE(B) and Mean Efficiency vs DE(NB) show multiple bins below the threshold value. This is a known issue, no need to notify.
    - if the global quality is bad continuously for more than 1 hour, please make a notification via a logbook entry.
- FLP Infologger
    - QC plots have been disabled and will generate some errors about not-found plots that can be ignored, for instance:
        - Requested resource does not exist: ali-qcdb.cern.ch:8083/qc/MCH/QO/DecodingCheck/1709911754085/PeriodName=LHC24aa/RunNumber=548050/
        - Requested resource does not exist: ali-qcdb.cern.ch:8083/qc/MCH/QO/PreclustersCheck/1709911754085/PeriodName=LHC24aa/RunNumber=548050/
MFT#
Expendable MFT tasks#
The MFT currently has 4 post-processing tasks that are marked as non-critical (= they are allowed to crash while a run is ongoing):
- MFTReadoutTrend
- MFTOccupancyTrend
- MFTTrendSlices
- RefComp
If such tasks crash during a PHYSICS run, please immediately call the MFT on-call and create a log entry tagging MFT. If a crash occurs in COSMICS/SYNTHETIC/NOISE/..., a log entry is sufficient (no need to call).
Quality summary#
The left panel summarizes all MFT QC checks for the last QC cycle. The top row provides a summary of the quality status with a text message suggesting actions to be taken:
- Quality: Bad - call the MFT on-call immediately
- Quality: Medium - create a log entry tagging MFT
- Quality: NULL - QC objects were not created
The right panel provides a time trend of the MFT quality summary.
Good quality example:
Bad quality example (triggered by the real-time cluster occupancy):
Chips in Error/Fault/Warning#
- Description: this plot is created on the FLPs and shows the number of MFT chips in Error/Fault/Warning. Up to the first 20 chips are explicitly listed.
- Checks to be done: there is an automatic checker on the number of chips in E/F/W.
- Actions to be taken: follow the instructions in the plot. The MFT has an automatic chip recovery that is triggered once a certain number of chips in E/F is reached, so call the on-call only if the quality remains Bad for more than 2 minutes.
Digit Occupancy Summary#
- Description: this plot is created on the FLPs and shows the number of digits per MFT zone per LHC orbit. It contains all data since SOR.
- Checks to be done: there is an automatic checker for empty ladders (each MFT zone is composed of multiple ladders).
- Actions to be taken: follow the instructions in the plot:
- The quality turns Medium if some individual ladders are empty: create a log entry tagging MFT.
- If at least two adjacent ladders are empty, the quality turns Bad: call the MFT on-call immediately.
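The empty-ladder rule above can be sketched as a small function. This is only an illustration of the decision logic, not the actual MFT checker: the function name, the `empty` input list, and the assumption that neighbouring list indices correspond to physically adjacent ladders are all hypothetical.

```python
from typing import Sequence


def ladder_quality(empty: Sequence[bool]) -> str:
    """Classify a group of ladders as 'Good', 'Medium' or 'Bad'.

    empty[i] is True when ladder i has no entries.
    Assumption: consecutive indices are adjacent ladders.
    """
    if not any(empty):
        return "Good"
    # At least two adjacent empty ladders: Bad (call the MFT on-call).
    for left, right in zip(empty, empty[1:]):
        if left and right:
            return "Bad"
    # Only isolated empty ladders: Medium (log entry tagging MFT).
    return "Medium"


print(ladder_quality([False, True, False, True]))  # isolated empty ladders
print(ladder_quality([False, True, True, False]))  # two adjacent empty ladders
```

The same rule applies to the Cluster Occupancy Summary, which uses an identical checker on clusters instead of digits.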
Cluster Occupancy Summary#
- Description: this plot is created on the EPNs and shows the number of clusters per MFT zone per LHC orbit. It contains all data since SOR.
- Checks to be done: there is an automatic checker for empty ladders (each MFT zone is composed of multiple ladders).
- Actions to be taken: follow the instructions in the plot:
- The quality turns Medium if some individual ladders are empty: create a log entry tagging MFT.
- If at least two adjacent ladders are empty, the quality turns Bad: call the MFT on-call immediately.
Real-time Cluster Occupancy Summary (last ~2 mins window)#
- Description: this is the same plot as Cluster Occupancy Summary, but it only contains data from the last time window (duration approximately 2 mins).
- Checks to be done: there is also an automatic checker for empty ladders with the same settings. The output of this checker corresponds to the "Real-time cluster occupancy" shown in the "Quality Summary" at the top.
- Actions to be taken: follow the usual instructions given in the plot.
Track phi distribution, track eta distribution#
- Description: these plots are created on the EPNs and show the track phi and eta distributions (normalized by the number of LHC orbits).
- Checks to be done: data from the current run are black, while the reference histogram plotted in the background is blue. Both histograms should look similar.
- Actions to be taken: if the black histogram significantly deviates from the blue reference (the ratio panel shows a deviation larger than 20% for some bins), call the MFT on-call. Deviations are allowed if they are present only at the tails of the distribution (as in the case of the eta distribution shown above).
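As a rough illustration of the 20% criterion, the sketch below flags bins whose data-to-reference ratio deviates from 1 by more than the tolerance while ignoring a few bins at each end of the histogram. The function name, the plain-list inputs, and the `n_tail_bins` parameter (how many bins count as "tails") are assumptions made for this example; the documentation does not define the tails numerically.

```python
def bins_out_of_tolerance(data, reference, tolerance=0.20, n_tail_bins=2):
    """Return indices of non-tail bins where |data/reference - 1| > tolerance.

    data, reference: bin contents of the current run and of the reference.
    n_tail_bins: bins ignored at each end (hypothetical choice).
    """
    if len(data) != len(reference):
        raise ValueError("data and reference must have the same binning")
    flagged = []
    for i in range(n_tail_bins, len(data) - n_tail_bins):
        ref = reference[i]
        if ref == 0:
            continue  # ratio undefined; skipped in this sketch
        if abs(data[i] / ref - 1.0) > tolerance:
            flagged.append(i)
    return flagged


reference = [1.0] * 7
current = [0.1, 1.0, 1.0, 1.3, 1.0, 1.0, 5.0]
# Only the central bin is flagged; the deviating first and last bins are tails.
print(bins_out_of_tolerance(current, reference, n_tail_bins=1))
```

A non-empty result would correspond to the case in which the MFT on-call should be called.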
Tracks X-Y distribution#
- Description: this plot is created on the EPNs and shows the track position in the X-Y plane (normalized by the number of LHC orbits).
- Checks to be done: the histogram should look similar to the reference (left) shown in this documentation.
- Actions to be taken: if the plot looks significantly different from the reference, call the MFT on-call.
Distribution of the #clusters per ROF#
- Description: this plot is created on the EPNs and shows the number of clusters per MFT ROF (readout frame). The data are normalized by the number of LHC orbits.
- Checks to be done: the histogram should look similar to the reference (left) shown in this documentation.
- Actions to be taken: if the plot looks significantly different from the reference, call the MFT on-call.
Distribution of the #tracks per ROF#
- Description: this plot is created on the EPNs and shows the number of tracks per MFT ROF (readout frame). The data are normalized by the number of LHC orbits.
- Checks to be done: the histogram should look similar to the reference (left) shown in this documentation.
- Actions to be taken: if the plot looks significantly different from the good reference, call the MFT on-call.
Known issues#
MID#
Local boards occupancy map (DigitsQC)#
[QC on EPN]
The plot shows the fired local boards.
In case of:
- an empty column, call the on-call
- an empty or very high rate on several neighboring boards of the detector, call the on-call
Hits multiplicity (DigitsQC)#
[QC on EPN]
These plots show the hit multiplicity per plane, for the bending and non-bending planes.
- If the mean value is > 100, call the on-call.
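For reference, the mean of a binned distribution and the comparison with the limit of 100 can be written as below. This is a minimal sketch that assumes the histogram is available as lists of bin centres and bin contents; the function names are hypothetical, and the handling of an empty histogram is a choice made for this example only.

```python
def histogram_mean(bin_centers, counts):
    """Weighted mean of a histogram, or None if the histogram is empty."""
    total = sum(counts)
    if total == 0:
        return None
    return sum(c * n for c, n in zip(bin_centers, counts)) / total


def multiplicity_needs_call(bin_centers, counts, limit=100.0):
    """True when the mean hit multiplicity exceeds the limit."""
    mean = histogram_mean(bin_centers, counts)
    return mean is not None and mean > limit


# Mean = (50*1 + 150*3) / 4 = 125, above the limit of 100.
print(multiplicity_needs_call([50.0, 150.0], [1, 3]))
```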
Known issues#
- MID has some empty bins in SYNTHETIC runs on the "Local boards Occupancy Map" plot. Ignore them until this is fixed.
PHS#
Physics PP runs#
Monitor if plots are not empty and timestamps are updated.
If plots are empty or not updated please inform the on-call.
If you see "Number of entries has not changed in the past cycle" while the run is still ongoing, the most probable reason is that PHS has become busy. Check with the ECS shifter and inform the on-call if the expert has not been informed yet. If PHS is not busy but you see this message, inform the on-call.
If you see other red messages please check known issues. If issue is not known please inform the on-call.
[QC on QC merger node] Global PHS quality#
Good plot:
The plot summarizes the PHS quality. If the global quality is not good, please check the details below.
- Number of cells increases: assures that digits are being produced. If bad, follow the procedure described above under "Number of entries has not changed in the past cycle".
- Number of clusters increases: assures that clusters are being produced. If bad, follow the procedure described above under "Number of entries has not changed in the past cycle".
- Cells check: assures that the cell occupancy is good. If not good, check the Cell HG occupancy plots and follow the instructions.
- Clusters check: assures that the mean cluster energy size is within the allowed limits. If not good, please inform the on-call. If medium, put a log entry and inform the on-call during the morning and afternoon shifts. No need to call during the night.
- Errors check: assures that the number of errors that occurred is within limits. If not good, please inform the on-call. If medium, put a log entry and inform the on-call during the morning and afternoon shifts. No need to call at night if there are no other issues.
[QC on QC merger node] Global quality trend#
Good plot:
The trend shows the evolution of the global PHS quality. It can become medium for a small portion of cycles due to a failure of the fit procedure. In this case the run is to be considered good.
If the trend is not good for a significant part of the run, the run quality must be set to bad. Put the bad flag even if the issue is known. An expert will have a look and adjust the quality later if needed.
[QC on EPN] Error occurrence#
Good plot:
If the plot is not good, please inform the on-call. If it is medium, put a log entry and inform the on-call during the morning and afternoon shifts. There is no need to call at night, and no need to put a bad flag for the run, if there are no other issues.
[QC on EPN] Cell HG occupancy in M1, 2, 3, 4#
Please check known issues
Good plots:
The plots show the number of cells seen in each channel. The distribution should be more or less uniform.
If red messages say Not OK, put a log entry and inform the on-call. Note that the messages can be false positives. If you think the plots do not differ much from the reference, just put a log entry; there is no need to call during the night. If they differ too much (big empty or hot regions, etc.), call the on-call. Please do not hesitate to call the on-call if you have any doubts: it is better to wake us up at night than to lose data!
Actions to be taken by QC shifters#
- Minimal duration after SOR before taking any action required by these instructions: 5 min at 500 kHz.
- Follow the instructions above.
Known issues#
- COSMIC, TECHNICAL, PHYSICS_PP runs, Cell HG occupancy, mod[1-4]: white horizontal stripes that differ from the reference plots can be seen. The QC shifter can ignore these patterns, because they are caused by the dynamic FEE mask, which is changed by the PHOS experts without prior notice. Only completely empty plots should be reported by the QC shifter to the PHOS/CPV on-call.
- COSMIC, TECHNICAL, PHYSICS_PP runs, Cell HG occupancy, mod1: data are missing in the area x=(32,47);z=(0,27) since the end of March 2023. This is a known issue and is being investigated by the PHS experts. There is no need to report it in the QC EOS reports, and no calls to PHS/CPV shifters are needed.
- COSMIC, TECHNICAL, PHYSICS_PP runs: QC shifters sometimes report the error message "Trailer decoding error: Last RCU trailer word not found" in the EPN InfoLogger. These errors are persistent: they have appeared in all COSMIC, TECHNICAL and PHYSICS runs since the beginning of Run 3 and are caused by a bug in the SRU firmware. All these errors should be ignored.
TOF#
Ignore alarms if TOF is not READY
Readout map (Slot Participating)#
[QC on FLP - plot integrated over the run]
Green panel: good quality | Red panel: bad quality |
---|---|
The plot shows a map of TOF readout slots per crate. The checker verifies that enough crates are in the readout; if it detects fewer than expected, the quality is set to BAD. In case of a red alarm, please call the TOF on-call.
Hit Multiplicity#
[QC on EPN - plot integrated over the run]
The plot shows the number of hits detected by TOF. A checker provides instructions for the shifter based on the measured counts. In case of a yellow alarm, please contact the TOF on-call via email; in case of a red alarm, call the TOF expert.
Known issues#
Slot Participating will not update in REPLAY runs. This is a known issue; do not call the on-call.
TPC#
TPC_Physics#
Quality Observer / Number of Clusters / Quality Trending / Drift velocity trending#
[QC on EPN]
To be checked - General:
- The time stamp at the bottom should update every few minutes during running.
- The aggregator summary on the top left might take a bit longer to appear at the beginning of each run.
- When a run is ongoing, all qualities in the list should be Good after about 10 minutes.
- If, in the quality trending, the quality is constantly Bad for multiple cycles (>30 minutes), call the on-call.
- Please also check the N Clusters 2D plots; if there are holes, call the on-call, unless the issue is already known.
- Note that the checker of the drift velocity of the gas updates every 10 minutes.
- Note that the summary quality may occasionally fluctuate due to rapid changes in luminosity, which affect the checker of the IDCs (Integrated Digital Currents).
Known issues#
Occasionally, the 'Calib' QO may go red; this is an issue with the interplay of the validity of two objects. Please ignore if this behaviour is not persistent (work in progress).
TRD#
Layout for physics runs in pp#
Note:
- all QC tasks for TRD are running on the EPNs
Data size per sector#
The TRD has 18 sectors which should all produce a similar data size per TF. There can be two bands forming for each sector as is the case in the example above, because the number of calibration triggers per TF (which are very large in data size) depends on the readout trigger rate and is not fixed. This is OK, please respect the info box on the plot. If it is not green, please write a bookkeeping entry tagging TRD.
Tracklet distribution in half-chambers#
This plot shows the number of tracklets per half-chamber. The x-axis is the sector number. If you see one column completely empty please notify the TRD oncall.
The crosses on the plot are from a static half-chamber status map which needs to be replaced by a dynamic one to correctly cross out half-chambers where no data is expected because of hardware issues.
Eta-phi distribution of ITS-TPC-TRD tracks#
This plot will be empty if either ITS or TPC is missing in the run. This is expected and not a problem.
Eta-phi distribution for ITS-TPC tracks matched to at least 3 TRD tracklets. The PHOS-hole from abs(eta) < 0.2 and phi ~ 5 leads to almost no tracks in that region. No need to call TRD expert if plot does not look as example above. We are currently adding automatic checks.
Pulse height plot based on ITS-TPC-TRD matched tracks#
Ignore this plot if the eta-phi distribution plot is empty or very sparsely filled.
This plot will be empty if either ITS or TPC is missing in the run. This is expected and not a problem.
A clear peak should be visible between time bins 0-4, followed by a plateau and a falling edge around 20. If plot does not look approximately as in the example above (and the track eta-phi plot is filled) please create a bookkeeping entry tagging TRD.
Pulse height plot based on TRD-only data#
In case the pulse height plot based on matched tracks is filled you can ignore this plot.
A peak should be visible between time bins 0-5. Call the TRD oncall if that is not the case, as there might be an issue with the trigger settings. The plateau region at larger time bins might not look as smooth as in the above picture in case there is a lot of pile-up.
Number of tracklets per event and TF#
We would like to be notified via a bookkeeping entry if more than one distinct peak appears in the distributions, or if there are entries in the underflow bin of either of the two histograms.
Raw data statistics#
This plot summarizes statistics on the raw data, such as the number of collected timeframes nTF, the number of triggers nTrig, the number of calibration triggers nCalTrig, the number of tracklets nTrkklts and the number of digits nDigits. Furthermore, the readout rate and the calibration trigger rate are determined from the collected data and shown in the plot. Please note that it is normal for the readout rate to be lower than the interaction rate, due to TRD dead times. The on-call should only be called if the plot is not filled at all.
Known issues#
QC plots for COSMICS are not consistent with the documentation.
- The "Data sizes from HalfCRU header" plot shows deviations due to noisy sectors (values around 2500). This issue can be ignored until the documentation is up to date.
- The "Number of Tracklets per timeframe" plot has a double-peak structure in COSMICS and many entries in the underflow bin (around 90% of tracklets). The experts have been notified. The relevant log can be found here: https://ali-bookkeeping.cern.ch/?page=log-detail&id=80122. The issue may be fixed in the future, or the documentation will be updated.
ZDC#
Baseline [QC on FLP]#
| Green panel: good quality |
|:--------------------------:|
The plot shows the baseline mean values of each ZDC channel. The quality of the histogram is determined by verifying the deviation from the expected average value. If the plot is empty or the message starts with Warning or Error, call the on-call.
Align Plot [QC on FLP]#
The plot shows whether all channels are aligned. Most of the channels should be centered on sample 6. A channel that deviates by plus or minus one sample is still considered good. Otherwise, call the on-call.
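The alignment criterion can be sketched as follows. This is an illustration only: the function name, the dictionary input, and the channel names used in the example are hypothetical, and the actual ZDC checker may apply additional conditions.

```python
def misaligned_channels(sample_by_channel, expected=6, tolerance=1):
    """Return the sorted names of channels further than `tolerance`
    samples away from the expected sample."""
    return sorted(
        name
        for name, sample in sample_by_channel.items()
        if abs(sample - expected) > tolerance
    )


# Hypothetical channel names, for illustration only.
samples = {"CH_A": 6, "CH_B": 7, "CH_C": 4}
print(misaligned_channels(samples))  # CH_C is two samples away from 6
```

An empty list corresponds to a good plot; any listed channel would be a reason to call the on-call.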
Bc Overflow [QC on FLP]#
The Bc Overflow plot must be empty. If it fills up, there are corrupted data: call the ZDC on-call.
Bc Data Loss [QC on FLP]#
The Data_loss plot must be empty. If it fills up, there is a data loss problem: call the ZDC on-call.
Known issues#
FLP InfoLogger, Detector ZDC: the messages "Baseline Error in PED_Z%" and "Rec Error in ADC_Z%" can be ignored during SYNTHETIC runs.
VTX#
Vertex distributions from matched central barrel tracks#
[QC on EPN]
The left plot shows the transverse x-y distribution of the reconstructed vertices, while the right plot shows the longitudinal vertex distribution.
The mean value of the z vertex distribution usually fluctuates by +/- 0.3 cm around the central value. The mean values in the x-y directions are usually very stable.
The RM/RC must be informed immediately if:
- the mean x and y values are outside [-0.1 cm, +0.1 cm]
- the mean z value is outside [-1.0 cm, +1.0 cm]
- the standard deviation in z is larger than 6.0 cm
A bookkeeping entry tagging RC should be added if the mean z value is outside [-0.5 cm, +0.5 cm].
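The limits above can be summarized in a short decision function. This is a sketch under stated assumptions: the function name and the returned strings are hypothetical, all values are in cm, and when several conditions apply the function returns only the most urgent action.

```python
def vertex_action(mean_x, mean_y, mean_z, sigma_z):
    """Return the action suggested by the vertex limits (values in cm)."""
    inform_immediately = (
        abs(mean_x) > 0.1      # mean x outside [-0.1, +0.1]
        or abs(mean_y) > 0.1   # mean y outside [-0.1, +0.1]
        or abs(mean_z) > 1.0   # mean z outside [-1.0, +1.0]
        or sigma_z > 6.0       # standard deviation in z above 6.0
    )
    if inform_immediately:
        return "inform RM/RC immediately"
    if abs(mean_z) > 0.5:      # mean z outside [-0.5, +0.5]
        return "bookkeeping entry tagging RC"
    return "no action"


print(vertex_action(0.02, -0.01, 0.3, 5.2))
print(vertex_action(0.02, -0.01, 0.7, 5.2))
print(vertex_action(0.15, -0.01, 0.3, 5.2))
```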
[QC on EPN]
The RM/RC must be informed if the values are outside the limits