Voting Logic in Safety Instrumented Functions (SIF)
In a Safety Instrumented Function (SIF), voting logic determines how many sensors or devices must agree before a protective action is taken. Choosing the right voting architecture directly affects both safety availability and process availability.
What Is Voting?
Voting is a rule that decides when to trip (take protective action) based on the number of input signals that detect a hazardous condition. It is written as MooN: M out of N devices must agree.
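The MooN rule can be sketched in a few lines of Python. This is a minimal illustration, assuming each input is a boolean "this sensor sees the hazard" vote; the function name `moon_trip` is hypothetical and not taken from any SIS vendor library.

```python
def moon_trip(m: int, votes: list[bool]) -> bool:
    """Return True when at least m of the N inputs vote to trip (MooN)."""
    n = len(votes)
    if not 1 <= m <= n:
        raise ValueError(f"invalid voting {m}oo{n}")
    return sum(votes) >= m


# 2oo3: two of three sensors see the hazard, so the SIF trips.
print(moon_trip(2, [True, True, False]))   # True
# 2oo3: a single vote is not enough.
print(moon_trip(2, [True, False, False]))  # False
```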
Why Do We Use Voting?
To balance two competing goals: act fast enough to prevent a hazard, but not so fast that we cause unnecessary process shutdowns (spurious trips).
Hardware Fault Tolerance (HFT)
HFT defines how many failures can be tolerated without losing the safety function. IEC 61508 requires a minimum HFT based on the target SIL and the Safe Failure Fraction (SFF).
The Two Competing Goals
Safety Availability
The SIF trips when the hazardous condition is real. More sensitive voting (e.g., 1oo2) increases the chance of detecting a true hazard.
Process Availability
The SIF does not trip on false signals. Less sensitive voting (e.g., 2oo2) reduces spurious trips but reduces fault tolerance.
The Fault Tolerance Concept
Voting is not static. Every time a device fails, is bypassed, or goes into test mode, the effective voting changes. This is called Dynamic Voting Degradation.
A healthy SIF has the voting architecture you designed. But the moment a device fails, is bypassed, or goes into proof test, the effective voting degrades silently: risk increases, and most operators do not see it.
The Degradation Path
What Causes Degradation?
A bypass is often applied manually without a formal risk assessment. The SIF continues to operate, but its risk reduction is already lower than designed. The process may be running with reduced or no protection for hours or days without anyone knowing.
In a 1oo1 (1 out of 1) architecture, a single device makes the trip decision. If it detects the hazard — it trips. There is no backup.
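HFT = 0. One failure, dangerous or spurious, directly impacts either safety or process availability. Under IEC 61508 Route 1H, a 1oo1 architecture is typically limited to SIL 1 unless the device's Safe Failure Fraction (SFF) is high enough to permit a higher SIL at HFT = 0.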
✅ Advantages
Simple design · Low cost · Minimal wiring · Easy maintenance
❌ Disadvantages
No fault tolerance · Single point of failure · SIL limited · High PFDavg
1oo1 Signal Chain: Single path from sensor to final element
Dynamic Degradation Table — 1oo1
| State | Condition | Effective Voting | HFT | Protection? | Recommended Action |
|---|---|---|---|---|---|
| Healthy | All devices operational | 1oo1 | 0 | ✅ Yes | Normal operation |
| Dangerous | Sensor fails dangerous (DU) | None | −1 | ❌ Lost | 🚨 Emergency repair · Compensating measure |
| Spurious | Sensor fails safe (spurious trip) | Tripped | — | Process down | Restore / investigate cause |
| Bypassed | Sensor bypassed for maintenance | None | −1 | ❌ Lost | 🚨 Operator watch · Time limit bypass |
| Testing | Proof test in progress | None | −1 | ❌ Lost | Minimize test duration · Compensating measure |
| LS Failed | Logic solver channel failed | None | −1 | ❌ Lost | 🚨 Emergency procedure · Immediate repair |
1oo2: a trip is triggered when either sensor A or sensor B detects the hazard. This maximises safety availability but increases the chance of a spurious trip if one sensor fails safe.
✅ Advantages
HFT = 1 · High safety availability · Detects hazard even with one sensor failed dangerous · SIL 2–3 capable
⚠️ Disadvantages
Higher spurious trip rate · One safe-fail sensor trips process · Both sensors must agree to NOT trip
Degradation Path
1oo2: Either sensor triggers a trip — OR logic
Dynamic Degradation Table — 1oo2
| State | Condition | Effective Voting | HFT | Spurious Trip Risk | Recommended Action |
|---|---|---|---|---|---|
| Healthy | Both sensors operational | 1oo2 | 1 | Medium | Normal operation |
| Degraded | Sensor A fails dangerous (DU) | 1oo1 (B only) | 0 | Lower | Repair A · Flag to operations |
| Degraded | Sensor A fails safe (spurious) | Tripped | — | Tripped | Investigate · Restore quickly |
| Vulnerable | Sensor A bypassed for maintenance | 1oo1 (B only) | 0 | Lower | Time limit bypass · Compensating measure |
| Critical | Sensor A bypassed + B fails DU | None | −1 | N/A (protection lost) | 🚨 Emergency procedure · Shutdown if required |
| Testing | One sensor in proof test | 1oo1 | 0 | Lower | Minimize test window · One at a time |
2oo2: Both sensors must agree to trip. Used where a spurious shutdown is extremely costly. But there is no fault tolerance — one failure kills the safety function.
HFT = 0. If one sensor fails dangerous, the SIF cannot trip. Use 2oo2 only when both the availability requirement and a specific failure mode analysis justify it.
✅ Advantages
Very low spurious trip rate · High process availability · Suitable where false trips are costly
❌ Disadvantages
HFT = 0 (same as 1oo1 from safety perspective) · One DU failure = no trip possible · Not suitable for high-demand SIFs
Degradation Path
2oo2: Both sensors must agree to trip — AND logic
Dynamic Degradation Table — 2oo2
| State | Condition | Effective Voting | HFT | Protection? | Recommended Action |
|---|---|---|---|---|---|
| Healthy | Both sensors operational | 2oo2 | 0 | ✅ Yes | Normal operation |
| Critical | Sensor A fails dangerous (DU) | Cannot Trip | −1 | ❌ Lost | 🚨 Repair immediately · Apply compensating measure |
| Degraded | Sensor A fails safe (votes trip) | 1oo1 (B only) | 0 | ✅ Yes, no trip unless B also votes | Investigate · Restore |
| Degraded | Sensor A bypassed | 1oo1 (B only) | 0 | ✅ Partial | Restore A urgently · Time limit bypass |
| Lost | A bypassed + B fails DU | None | −1 | ❌ Lost | 🚨 Emergency · Consider process shutdown |
2oo3 is the most widely used voting architecture in the Oil & Gas industry. It balances safety availability and process availability better than the simpler architectures above. HFT = 1: one failure is tolerated without either losing protection or causing a spurious trip.
✅ Advantages
HFT = 1 · SIL 2/3 capable · No spurious trip on single failure · Continues to protect on single DU failure · Online repair possible
⚠️ Considerations
3× sensor cost · Common cause failure risk · More complex wiring · Diagnostics required · CCF beta factor critical
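The single-failure behaviour claimed for 1oo2, 2oo2 and 2oo3 can be checked with a small fault-injection sketch. It assumes an idealised model in which a sensor that fails safe always votes trip and a sensor that fails dangerous never votes trip; the function names are illustrative only.

```python
def trips(m: int, votes: list[bool]) -> bool:
    """MooN voter: trip when at least m inputs vote to trip."""
    return sum(votes) >= m


def single_fault_behaviour(m: int, n: int) -> tuple[bool, bool]:
    """Return (spurious_trip, still_protected) for one failed sensor.

    Assumptions: a sensor that fails safe always votes trip, a sensor that
    fails dangerous never votes trip, and healthy sensors follow the process.
    """
    # No real hazard, one sensor stuck voting trip.
    spurious_trip = trips(m, [True] + [False] * (n - 1))
    # Real hazard, one sensor stuck silent, the rest detect it.
    still_protected = trips(m, [False] + [True] * (n - 1))
    return spurious_trip, still_protected


for m, n in [(1, 2), (2, 2), (2, 3)]:
    spurious, protected = single_fault_behaviour(m, n)
    print(f"{m}oo{n}: spurious trip={spurious}, still protected={protected}")
```

Under these assumptions 1oo2 trips spuriously on one safe failure but stays protected, 2oo2 avoids the spurious trip but loses protection on one dangerous failure, and 2oo3 does neither.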
Full Degradation Path
2oo3: 2 of 3 sensors must agree to trip
Degradation Matrix — 2oo3
| Sensors Available | Failed/Bypassed | Effective Voting | HFT | Spurious Risk | Safety Risk | Action |
|---|---|---|---|---|---|---|
| A, B, C all OK | 0 | 2oo3 | 1 | Low | Low | Normal |
| A and B only (C failed) | 1 (DU, undetected) | 2oo2 | 0 | Low | Higher | Repair C promptly |
| A and B only (C bypassed) | 1 (bypass) | 1oo2 | 1 | Medium | Medium | Time limit · Return quickly |
| A only (B + C unavail.) | 2 | 1oo1 | 0 | Higher | Higher | 🚨 Compensating measure |
| None available | 3 | None | −1 | N/A | Lost | 🚨 Emergency procedure |
3oo5 uses 5 sensors with a majority vote of 3. It provides HFT = 2 — the highest fault tolerance available in standard industrial practice. Used for turbomachinery, HIPPS, and critical offshore shutdowns.
🎯 Typical Applications
Gas turbine overspeed protection · Compressor anti-surge · HIPPS (High Integrity Pressure Protection System) · Critical fired heater BMS · Offshore ESD
Up to 2 devices can fail without losing the safety function AND without causing a spurious trip. This is the gold standard for availability and safety combined.
Degradation Path — 3oo5
| Active / Available | Unavailable | Effective Vote | HFT | Status |
|---|---|---|---|---|
| 5 of 5 | 0 | 3oo5 | 2 | Healthy |
| 4 of 5 | 1 | 2oo4 or 3oo4 | 1–2 | Good |
| 3 of 5 | 2 | 2oo3 | 1 | Degraded |
| 2 of 5 | 3 | 1oo2 | 1 | Vulnerable |
| 1 of 5 | 4 | 1oo1 | 0 | Critical |
| 0 of 5 | 5 | None | −1 | Lost |
3oo5: 3 of 5 sensors must agree — highest fault tolerance
This master table covers the five voting architectures discussed above. Use it during SIL verification reviews, bypass approvals, and proof test planning.
Complete Architecture Comparison
| Architecture | Failed/Bypassed | Effective Voting | HFT Remaining | Protection? | Spurious Risk | Recommended Action |
|---|---|---|---|---|---|---|
| 1oo1 ARCHITECTURE | ||||||
| 1oo1 | 0 | 1oo1 | 0 | ✅ Yes | Low | Normal |
| 1oo1 | 1 (any) | None | −1 | ❌ Lost | N/A | 🚨 Emergency action |
| 1oo2 ARCHITECTURE | ||||||
| 1oo2 | 0 | 1oo2 | 1 | ✅ Yes | Med | Normal |
| 1oo2 | 1 (DU) | 1oo1 | 0 | ✅ Yes | Lower | Repair promptly |
| 1oo2 | 1 (bypass) | 1oo1 | 0 | ✅ Yes | Lower | Time-limit bypass |
| 1oo2 | 2 | None | −1 | ❌ Lost | N/A | 🚨 Emergency action |
| 2oo2 ARCHITECTURE | ||||||
| 2oo2 | 0 | 2oo2 | 0 | ✅ Yes | Very Low | Normal |
| 2oo2 | 1 (DU) | Cannot trip | −1 | ❌ Lost | N/A | 🚨 Immediate repair |
| 2oo2 | 1 (bypass) | 1oo1 | 0 | ✅ Partial | Med | Restore urgently |
| 2oo2 | 2 | None | −1 | ❌ Lost | N/A | 🚨 Emergency action |
| 2oo3 ARCHITECTURE | ||||||
| 2oo3 | 0 | 2oo3 | 1 | ✅ Yes | Low | Normal |
| 2oo3 | 1 (DU) | 2oo2 | 0 | ✅ Yes | Low | Repair promptly |
| 2oo3 | 1 (bypass) | 1oo2 | 1 | ✅ Yes | Med | Restore quickly · Time-limit bypass |
| 2oo3 | 2 | 1oo1 | 0 | ✅ Yes | Higher | 🔶 Compensating measure |
| 2oo3 | 3 | None | −1 | ❌ Lost | N/A | 🚨 Emergency action |
| 3oo5 ARCHITECTURE | ||||||
| 3oo5 | 0 | 3oo5 | 2 | ✅ Yes | Very Low | Normal |
| 3oo5 | 1 | 3oo4 / 2oo4 | 1–2 | ✅ Yes | Low | Repair/monitor |
| 3oo5 | 2 | 2oo3 | 1 | ✅ Yes | Med | Repair promptly |
| 3oo5 | 3 | 1oo2 | 1 | ✅ Yes | Higher | 🔶 Compensating measure |
| 3oo5 | 4 | 1oo1 | 0 | ✅ Yes | High | 🚨 Emergency procedure |
| 3oo5 | 5 | None | −1 | ❌ Lost | N/A | 🚨 Emergency action |
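When a device is bypassed, it is removed from the voting logic. A 2oo3 system with one bypass becomes a 1oo2 system, not a 2oo3 with a disabled channel (this assumes the logic solver is configured to degrade on bypass; an undetected dangerous failure instead leaves the two healthy channels voting 2oo2). This has a direct impact on the PFDavg calculation and should trigger a Management of Change (MoC) review.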
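For a bypass approval check, the effective voting column of the comparison table can be held as a simple lookup. The sketch below transcribes only the bypass rows of that table and assumes the degrade-on-bypass behaviour the text describes; `EFFECTIVE_VOTING` and `effective_voting` are hypothetical names.

```python
from typing import Optional

# Effective voting after k channels are bypassed, indexed by k.
# None means no channel is left and protection is lost.
EFFECTIVE_VOTING = {
    "1oo1": ["1oo1", None],
    "1oo2": ["1oo2", "1oo1", None],
    "2oo2": ["2oo2", "1oo1", None],
    "2oo3": ["2oo3", "1oo2", "1oo1", None],
    "3oo5": ["3oo5", "3oo4 / 2oo4", "2oo3", "1oo2", "1oo1", None],
}


def effective_voting(architecture: str, bypassed: int) -> Optional[str]:
    """Look up the effective voting for a design with `bypassed` channels out."""
    states = EFFECTIVE_VOTING[architecture]
    if not 0 <= bypassed < len(states):
        raise ValueError(f"{architecture} cannot have {bypassed} channels bypassed")
    return states[bypassed]


print(effective_voting("2oo3", 1))  # 1oo2
print(effective_voting("2oo3", 3))  # None: protection lost
```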
Hardware Fault Tolerance (HFT) is the number of dangerous failures a subsystem can tolerate while still performing its safety function. IEC 61508 defines minimum HFT requirements based on target SIL and Safe Failure Fraction (SFF).
HFT = 0
No failures tolerated before protection is lost. System still functions with all devices healthy.
HFT = 1
One failure tolerated. Protection maintained with one device failed. Most common for SIL 2 systems.
HFT = 2
Two failures tolerated. Protection maintained with two devices failed. High-integrity critical systems.
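For a MooN trip vote, the HFT values quoted throughout this article follow from HFT = N − M: the function still works as long as M channels can vote. A minimal sketch, with `hft` as an illustrative helper name:

```python
import re


def hft(architecture: str) -> int:
    """Hardware fault tolerance of a MooN architecture, computed as N - M."""
    match = re.fullmatch(r"(\d+)oo(\d+)", architecture)
    if match is None:
        raise ValueError(f"not a MooN string: {architecture!r}")
    m, n = int(match.group(1)), int(match.group(2))
    if not 1 <= m <= n:
        raise ValueError(f"invalid voting {architecture}")
    return n - m


for arch in ["1oo1", "1oo2", "2oo2", "2oo3", "3oo5"]:
    print(arch, "HFT =", hft(arch))
```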
IEC 61508 Route 1H — Minimum HFT Requirements
| Target SIL | SFF < 60% | SFF 60–90% | SFF 90–99% | SFF ≥ 99% | Common Architecture |
|---|---|---|---|---|---|
| SIL 1 | 1 | 0 | 0 | 0 | 1oo1 (if SFF high enough) |
| SIL 2 | 2 | 1 | 1 | 0 | 1oo2 or 2oo3 |
| SIL 3 | 3 | 2 | 1 | 1 | 2oo3 or 3oo5 |
| SIL 4 | 4 | 3 | 2 | 2 | Special design required |
A sensor with high Safe Failure Fraction (SFF ≥ 90%) — typically a modern smart transmitter with diagnostics — requires lower HFT to achieve the same SIL. This allows simpler architectures to qualify for SIL 2 without full 2oo3 redundancy.
HFT vs Architecture Summary
HFT = 0
Single point risk
HFT = 1
One failure tolerated
HFT = 2
Two failures tolerated
Higher HFT
Nuclear / critical
IEC 61511 Clause 11.9 requires that a formal bypass management system is in place. Every bypass degrades the SIF — and must be managed as a temporary increase in risk.
Bypass Workflow
IEC 61511 Requirements
- ✅ Authorized bypass only: a named individual must approve every bypass
- ✅ Bypass register: all active bypasses documented with start time and reason
- ✅ Time limit: maximum bypass duration defined in the Safety Requirement Specification (SRS)
- ✅ Operator alarm: DCS/SIS must show a continuous bypass alarm on the operator console
- ✅ Compensating measures: documented and actioned before the bypass is applied
- ✅ Shift handover: active bypasses communicated at every shift change
- ✅ Management of Change: MoC required if a bypass extends beyond the defined limit
- ✅ Restoration check: functional test after repair before removing the bypass
- ✅ Audit trail: full electronic record in SIS event log / historian
- ✅ No simultaneous bypasses on the same SIF without FSE review
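Several of the requirements above (named approver, register entry with start time and reason, time limit, no second bypass on the same SIF without FSE review) can be enforced in software. The following is a hedged sketch of such a register, not a representation of any real SIS product; the class names, fields, and tag numbers are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Bypass:
    sif: str
    channel: str
    approved_by: str          # named individual, must not be empty
    reason: str
    start: datetime
    max_duration: timedelta   # assumed to come from the SRS

    def overdue(self, now: datetime) -> bool:
        return now - self.start > self.max_duration


class BypassRegister:
    def __init__(self) -> None:
        self.active: list[Bypass] = []

    def apply(self, bypass: Bypass, fse_reviewed: bool = False) -> None:
        if not bypass.approved_by:
            raise PermissionError("bypass needs a named approver")
        if any(b.sif == bypass.sif for b in self.active) and not fse_reviewed:
            raise PermissionError(f"{bypass.sif}: second bypass needs FSE review")
        self.active.append(bypass)

    def overdue(self, now: datetime) -> list[Bypass]:
        """Bypasses that have exceeded their time limit (candidates for MoC)."""
        return [b for b in self.active if b.overdue(now)]


register = BypassRegister()
start = datetime(2024, 1, 1, 8, 0)
register.apply(Bypass("SIF-101", "PT-101A", "J. Smith", "transmitter swap",
                      start, timedelta(hours=8)))
print(len(register.overdue(start + timedelta(hours=9))))  # 1
```

A second `apply` call for another channel of `SIF-101` raises `PermissionError` unless `fse_reviewed=True` is passed.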
A common failure: bypasses are left in place for days or weeks without compensating measures. The SIF is effectively disabled, and nobody is watching the process manually. This has caused major incidents in the process industry.
Every proof test temporarily removes a channel from the voting logic. A 2oo3 system becomes a 1oo2. Test the second channel without restoring the first — and you're down to 1oo1. Plan your proof testing carefully.
2oo3 During Staggered Testing
Never take a second channel into test while the first is still in test. Doing so on a 2oo3 system degrades it to 1oo1 — just one remaining sensor stands between you and a potential major event.
PFDavg Impact During Testing
| Architecture | Channels in Test | Effective Vote | PFDavg Impact |
|---|---|---|---|
| 2oo3 | 0 | 2oo3 | Normal |
| 2oo3 | 1 | 1oo2 | Increases |
| 2oo3 | 2 | 1oo1 | Significant ↑ |
| 2oo3 | 3 | None | Protection lost |
| 1oo2 | 0 | 1oo2 | Normal |
| 1oo2 | 1 | 1oo1 | Increases |
| 1oo2 | 2 | None | Protection lost |
| 3oo5 | 1 | 3oo4 | Minor ↑ |
| 3oo5 | 2 | 2oo3 | Moderate ↑ |
Proof Testing Best Practices
- ✅ Test one channel at a time: never two simultaneously on the same SIF
- ✅ Define maximum test window in SRS (typically 4–8 hours per channel)
- ✅ Notify operations before starting: shift supervisor sign-off required
- ✅ Apply compensating measures during test window
- ✅ Restore and verify channel before starting next one
- ✅ Record the proof test result: the as-found failure rate feeds into the PFDavg calculation
- ✅ Proof test coverage (PTC) target ≥ 90% for SIL 2 applications
Many engineers believe that a 2oo3 or 1oo2 architecture guarantees protection. It does NOT if all the redundant sensors share a common failure cause. Redundancy only helps when failures are independent.
What Causes Common Cause Failures?
Shared Process Connection
All three sensors tapped from the same impulse line or manifold. Blockage, plugging, or freeze affects all simultaneously.
Shared Power Supply
All sensors powered from the same 24V DC bus. A power supply fault disables all channels at once.
Environmental Effects
High temperature, humidity, vibration, or corrosive atmosphere affecting all sensors in the same location equally.
Incorrect Calibration
Same technician calibrates all three sensors incorrectly using the same faulty reference. All fail together at the same set point.
Systematic Failures
Software error, firmware bug, or design error that affects all channels — not caught because all channels show the same wrong value.
Shared Instrument Air
All pneumatic transmitters or control valves sharing the same instrument air supply. Air failure affects all simultaneously.
The Beta Factor — Quantifying CCF
IEC 61508 introduces the β factor to quantify the fraction of dangerous failures that are common cause. Typical β values in oil and gas:
| Separation Level | Typical β (DU) | Design Measures | CCF Risk |
|---|---|---|---|
| No separation (shared impulse lines) | 0.10 – 0.20 | None | High |
| Physical separation only | 0.05 – 0.10 | Separate tapping points | Medium |
| Separation + diverse technology | 0.02 – 0.05 | Different sensor types | Lower |
| Full diversity + separation + procedures | 0.01 – 0.02 | Full IEC 61508 Table D.4 | Low |
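To see why β matters, a commonly quoted simplified approximation for a 1oo2 pair splits PFDavg into an independent term, ((1 − β)·λDU·TI)² / 3, and a common-cause term, β·λDU·TI / 2. The sketch below uses that approximation and ignores diagnostics, repair time, and proof test coverage; the failure rate and test interval are assumed values chosen only for illustration.

```python
def pfd_1oo2(lambda_du: float, ti_hours: float, beta: float) -> tuple[float, float]:
    """Return (independent, common_cause) PFDavg terms for a 1oo2 pair.

    Simplified approximation: no diagnostics, repair time, or imperfect
    proof testing are modelled.
    """
    independent = ((1 - beta) * lambda_du * ti_hours) ** 2 / 3
    common_cause = beta * lambda_du * ti_hours / 2
    return independent, common_cause


LAMBDA_DU = 1e-6  # assumed dangerous undetected failure rate, per hour
TI = 8760         # assumed proof test interval: one year, in hours

for beta in (0.01, 0.05, 0.20):
    ind, ccf = pfd_1oo2(LAMBDA_DU, TI, beta)
    print(f"beta={beta:.2f}: independent={ind:.2e}, "
          f"common cause={ccf:.2e}, total={ind + ccf:.2e}")
```

With these illustrative inputs the common-cause term is larger than the independent term at every β shown, which is why the separation measures in the table have such a large effect on the achieved PFDavg.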
CCF Defence Measures
- ✅ Separate tapping points (min 300 mm apart)
- ✅ Independent power supplies per channel
- ✅ Diverse sensor technologies (DP + guided radar)
- ✅ Separate cable routes / cable trays
- ✅ Separate calibration teams per channel
- ✅ Environmental protection (heat tracing per channel)
Use this checklist during SIS design, SIL verification review, and Functional Safety Assessment (FSA) to ensure dynamic voting degradation risks are properly managed.
⚙️ Sensor Selection
- ✅ Select SIL-rated instruments with IEC 61508 certification
- ✅ Use smart transmitters with HART diagnostics (improves SFF)
- ✅ Specify minimum SFF and DC (Diagnostic Coverage)
- ✅ Apply physical separation between redundant sensors
- ✅ Consider diverse technologies to reduce CCF β factor
- ✅ Define proof test interval based on SIL target PFDavg
🖥️ Logic Solver Design
- ✅ Use SIL-certified SIS logic solver (not standard PLC)
- ✅ Implement diagnostic coverage ≥ 90% for SIL 2
- ✅ Configure bypass indication alarms on DCS operator station
- ✅ Implement electronic bypass register with time limit
- ✅ Enable voting reconfiguration under authorized control only
🔩 Final Element Design
- ✅ Use fail-safe action (spring-return / fail-close / fail-open)
- ✅ Apply partial stroke testing (PST) for large ESD valves
- ✅ Separate instrument air supply per valve where possible
- ✅ Confirm valve SIL data (PFDavg, B10, DC) from manufacturer
🗳️ Voting Selection
- ✅ Document voting selection rationale in SRS
- ✅ Verify HFT meets IEC 61508 Route 1H requirements
- ✅ Define degraded voting response procedures for each failure mode
- ✅ Consider 2oo3 as default for SIL 2: best balance of safety and availability
- ✅ Avoid 2oo2 unless detailed failure mode justification exists
🔑 Bypass Philosophy
- ✅ Define bypass philosophy in Operations & Maintenance Manual
- ✅ Specify maximum bypass duration in SRS
- ✅ Require FSE or Operations Manager approval for bypass
- ✅ Prohibit simultaneous bypasses without FSE review and MoC
🔬 Proof Testing
- ✅ Prepare detailed proof test procedures per IEC 61511-1 Clause 14
- ✅ Test one channel at a time: never concurrent testing
- ✅ Achieve ≥ 90% proof test coverage (PTC) for SIL 2
- ✅ Record the as-found failure rate and update the PFDavg calculation
- ✅ Review the test interval if the as-found failure rate exceeds the assumed DU failure rate
🔐 Cybersecurity (IEC 62443)
- ✅ Separate SIS network from DCS (air gap or unidirectional gateway)
- ✅ Protect remote bypass capability with multi-factor authentication
- ✅ Log all bypass and override events with user ID and timestamp
Save this page. Share it with your team. Use it in every SIS design review.
| Architecture | HFT | Max SIL | Safety Avail. | Process Avail. | Typical Application | Key Risk |
|---|---|---|---|---|---|---|
| 1oo1 | 0 | SIL 1 | Medium | Medium | Low-risk alarms, auxiliary shutdown | Single point of failure |
| 1oo2 | 1 | SIL 2/3 | High | Lower | High-demand SIF, fire and gas, ESD | Higher spurious trip rate |
| 2oo2 | 0 | SIL 1 | Lower | High | Process availability-critical, low risk | Zero fault tolerance — DU = no trip |
| 2oo3 ★ | 1 | SIL 2/3 | High | High | ESD, HIPPS, separator, heater BMS | CCF if sensors not separated |
| 3oo5 | 2 | SIL 3+ | Highest | Highest | Turbomachinery, HIPPS, critical offshore | Complexity, cost, CCF |
Dynamic Voting — The Golden Rules
Every bypass = voting change
A bypassed device is removed from the voting logic. 2oo3 with one bypass = 1oo2. Never forget this.
One test at a time
Never test two channels simultaneously on the same SIF. You push the architecture to 1oo1 or worse.
Redundancy ≠ CCF protection
Three sensors on the same impulse line can all fail together. Physical separation is mandatory.
Time-limit every bypass
Define maximum bypass duration in SRS. No open-ended bypasses. Ever.
Operators must be aware
The DCS operator station must show every active bypass. Shift handover must include bypass status.
Document degraded state
Every degraded voting state must have a defined response procedure — in writing, in the SRS.