Brody
July 17, 2022

The Swiss Cheese model is a patient safety standard that has been around for decades and is also used in multiple industries. In healthcare it is used to mitigate patient safety events. The theory is that the pathway to a bad outcome passes through multiple protective layers, each layer having a 'hole' in it; the more layers of cheese, the more likely the holes will be covered. Just as there is rarely a single cause of an adverse event, there are multiple opportunities to prevent the event.

 

You can think of the Root Cause Analysis tool: the fishbone, or Ishikawa, diagram. Many attributes lead to the undesired outcome; stop any of those attributes from being actualized and the adverse event becomes a near miss, or doesn't happen at all. The Swiss Cheese model is the model frequently cited for preventing those events.

The standard Ishikawa with 5 core causality areas

In order for there to be a layer in the process you must understand that process: the humans, machines, supplies, processes, configurations, etc., and indeed both the desired and undesired outcomes. If you truly understand those elements of the workflow, then you also understand the holes in those workflows.

For patient safety, my Ishikawa has two core facets, or sides of the fish: the system/solution and the user/using entity. My role in patient safety was as a vendor, so this is what makes sense to me: is the system safe as built, and is the system safe as used? My walkthrough on the system side progresses through feature, requirements, architecture, design, build, configuration, deployment, and interconnection. My walkthrough of the use side involves the training, people, use, purpose, administration, and interconnection. Yes, I put interconnection on both sides of the fish, as there is what the vendor knows about and what the using entity knows about.

This diagram is shortened for clarity

Each of those elements, or 'bones', has multiple sub-elements, and if you ask questions of those elements you will eliminate or confirm that element's participation in the adverse event. From that you can understand the holes in your cheese and potentially identify how to create the next layer of cheese that compensates for each hole.

When is Swiss cheese just air?
So what happens when the holes in the Swiss cheese are so large that a layer is effectively more hole than cheese? Or when the layer that is supposed to be there isn't there at all? Or when the layer that should prevent the event is bypassed entirely?

 

That is the purpose of this paper. When we do Root Cause Analysis we are looking at multiple aspects, and the point is not to say this model is incorrect or shouldn't be used, but rather to advise that care should be taken to understand the true state of each protective layer in the process.

 

The first question: what if the hole in the cheese is greater than the footprint of the cheese itself? When we look at these individual layers and the holes in them, are we considering the magnitude of those holes? If we are looking at the personnel involved in an action, are we looking at the training, culture, and institutional standards that are in place? Is there an expectation of excellence, with the idea that errors will happen but will be used as a training aid, or is the administration on a witch hunt to demonize or hide errors? Is the staffing so low that the load makes it almost impossible to perform the expected level of care? From the vendor side, is the feature robust and well understood, or is it novel and part of someone's sales pitch? Is the installation fully qualified, so that you know the solution is present and correctly configured for its intended purpose? Is the solution kept up to date, or is it allowed to slowly degrade over time? Is it even on, and how do you know that the safety feature has not died a silent death?

These are all questions that should be asked of the different aspects of the solution that can contribute to the event itself. They can reveal the nature of the gaps present in the system, and some of these gaps should be treated as risk alerts: when they occur, a mitigation plan is enacted.

The second question: what if the layer of protection doesn't exist? Say your DUR (Drug Utilization Review) protection fails silently and, as a result, the system is throwing far fewer alerts, or none at all. Many users certainly wouldn't mind a few fewer alerts, and might even consider that a feature, but the alerts are there to keep us from doing something without due thought. Or you were told you had a DUR, but it turns out the vendor had hard coded the results to pass certification. Or your solution loses referential integrity without turning the entire screen flashing red and requiring a reload. Referential integrity failures are when patient A is in context but data from patient B is displayed to the clinician. If a system isn't set up to detect these conditions, how does someone know it has happened? And it does happen.
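As an illustration only, here is a minimal sketch of the kind of referential-integrity check that paragraph calls for. The function name, payload shape, and patient identifiers are hypothetical placeholders, not any particular vendor's API:

```python
# Hypothetical guard: verify that every payload rendered to the clinician
# belongs to the patient currently in context. Names and payload shape are
# illustrative assumptions, not a specific vendor's interface.

class ReferentialIntegrityError(Exception):
    """Raised when displayed data does not belong to the patient in context."""

def assert_patient_context(context_patient_id: str, payload: dict) -> dict:
    payload_patient_id = payload.get("patient_id")
    if payload_patient_id != context_patient_id:
        # Fail loudly: the point above is that this condition must never be silent.
        raise ReferentialIntegrityError(
            f"Patient in context is {context_patient_id!r} but payload "
            f"references {payload_patient_id!r}"
        )
    return payload

# Usage idea: wrap every data fetch that feeds the screen, e.g.
# chart_data = assert_patient_context(current_patient_id, fetch_medication_list(...))
```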

The third question: what happens when the layer is intentionally bypassed, whether through disregard or deliberate configuration? There are any number of reasons for a segment of the safety system to be bypassed: performance, usability, upgrade derailments, lack of training, or a lack of understanding of the relationship between the system and safety. When a layer is bypassed, we should consider it the same as the layer not being there at all.

 

Real world examples
There are several that come to mind. We can talk about vendors who deliberately misled their testing authority by hard coding for the tests. We can talk about institutions that knew there was a problem with the dispensing system and advised nursing staff to bypass the warnings, which, along with a very large number of other factors, led to a patient's death and the subsequent conviction of the nurse involved; that should be a case study in multiple causality failures in itself. But I think I would rather look at the VA's rollout of the Cerner replacement for VistA. All of the facts are not in, and we probably won't know everything involved in this latest report, but the OIG preliminary report indicates that the ordering system lost approximately 11,000 orders. The loss was silent, and facility managers were apparently unaware. These orders included imaging, follow-ups, referrals, etc.

 

What to do
The model is a good model, and it generally works, but we do need to be aware that sometimes the things we rely on to be part of the layered safety net may not be there, or may be degraded to the point that they might as well not be there at all. The basic solution is to validate your assumptions on a regular basis: heartbeat monitors that confirm a sub-component is still alive and functioning, and fake patients that exercise specific parts of the system such as DUR, Population Health, or prescription exchange. Many systems have the concept of a test patient built in; some even exclude them from reports.
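A minimal sketch of what such a scheduled heartbeat might look like, assuming a hypothetical ehr_api interface and test patient identifier (neither is a real vendor API; the drug pair is just a well-known interaction used as a probe):

```python
# Scheduled "heartbeat" check that exercises DUR with a built-in test patient.
# The ehr_api object, patient identifier, and alert fields are hypothetical
# placeholders for whatever your system actually exposes.
import logging

TEST_PATIENT_ID = "ZZTEST-DUR-001"   # a test patient, ideally excluded from reports

def dur_heartbeat(ehr_api) -> bool:
    """Order a known-interacting drug pair for the test patient and confirm
    that the DUR layer raises at least one interaction alert."""
    ehr_api.add_medication(TEST_PATIENT_ID, "warfarin")
    alerts = ehr_api.propose_order(TEST_PATIENT_ID, "trimethoprim-sulfamethoxazole")

    if not any(a.type == "drug-drug-interaction" for a in alerts):
        logging.critical("DUR heartbeat failed: no interaction alert returned")
        return False

    logging.info("DUR heartbeat passed: %d alert(s) returned", len(alerts))
    return True

# Run this on a schedule and page someone on failure, so a silent DUR outage
# becomes a loud one instead of a hole nobody knows about.
```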

Education is also important: the more people understand the system, the better they will be able to recognize when something isn't behaving the way it should, or shouldn't. Clinicians didn't go through all that it takes to get their license in order to become programmers, but basic system education will go a long way toward protecting the system from unrecognized errors.

Monitoring is also important. My go-to classic documentation error is a medication allergy posted to the Problems and History list but not to the Allergy/Intolerance list. A problem or history entry of "History of Allergy to Penicillin" is great in that section, but it must also be followed up with an addition to the Allergy/Intolerance domain. Why? Because an ICD-10 code does not process in DUR/PAR during the medication prescription process, and alerts will not be displayed.
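As a sketch of the kind of monitoring query that could catch this gap, assuming hypothetical fetch_problems and fetch_allergies data-access functions (the record layout is an assumption, not a specific product's schema):

```python
# Flag the documentation gap described above: a drug allergy recorded as a
# problem-list ICD-10 code with no matching entry in the Allergy/Intolerance
# domain. The fetch_* callables and their return shapes are hypothetical.

# ICD-10-CM Z88.x codes record drug allergy status on the problem/history list.
DRUG_ALLERGY_STATUS_PREFIX = "Z88"

def patients_with_allergy_gap(fetch_problems, fetch_allergies, patient_ids):
    """Return patient IDs that carry a Z88.x problem entry but have an empty
    Allergy/Intolerance list, i.e. DUR will never see the allergy."""
    flagged = []
    for pid in patient_ids:
        problem_codes = fetch_problems(pid)   # -> iterable of ICD-10 code strings
        allergies = list(fetch_allergies(pid))  # -> iterable of allergy records
        has_coded_allergy = any(code.startswith(DRUG_ALLERGY_STATUS_PREFIX)
                                for code in problem_codes)
        if has_coded_allergy and not allergies:
            flagged.append(pid)
    return flagged

# Run periodically and route the flagged list to the clinical documentation
# team so the Allergy/Intolerance domain gets updated and DUR can fire.
```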

 

The layers are there; we just need to be wary of assuming that they are functioning as we expect, and we need to trust but verify. Document the layers of your "cheese" and find ways to make sure they are present and functioning. The documentation should also include the holes in those layers. The cost of failure in these cases is too high to ignore.

 

Thanks, and I hope this helped advance your understanding of SoS and Patient Safety.

© 2022 Adapttest Consulting