
Psych Safety: But Really, Why? An informal case history

Intro

This post goes beyond the definitions and motivations for fostering psychological safety in the workplace, and aims to provide an historical argument for the practice, linking it to other industrial process innovations of the last 80 years.

I will work backwards, starting with the best known study involving the practice, at Google in 2012, and ending 80 years ago, at D-Day in World War II. It’s supposed to be a light (although long) read, so I’ll tend towards lightness on detail. If the claims seem thin, that’s a weakness of my writing (or, perhaps, the medium), not the history!

A lot of the meat of this post comes from Deming’s Journey to Profound Knowledge, which I recommend.

Brief Definition

It is not uncommon for someone to say they have never heard of the term when I mention it. But, invariably, when I give them a rough definition, they say something like, “Oh, yeah, my company did that. We got this new CEO a couple of years ago and we all went to a team retreat where our department head said we should bring our whole selves to work and we’d get a bonus if we came up with innovative ideas.”

Since it’s a poorly understood term, here is a definition, straight from Google’s AI Overview:

Psychological safety is a shared belief among team members that it’s safe to take risks, express ideas, ask questions, and admit mistakes without fear of negative consequences. It’s about fostering an environment where people feel comfortable being themselves, contributing their ideas, and challenging others without worrying about being judged, embarrassed, or punished.

Psychological safety is an inherently appealing idea to both employees and managers. It appeals to our nobler instincts. In a sea of survival-of-the-fittest management doctrine, it is a lone raft of ‘better together’ thinking. HR departments rejoice! You don’t have to be the bad guy! We will thrive if we cultivate curiosity, mutual respect, tolerance and trust.

I like this idea. I really want it to be true. But I am a skeptical person, and years of conditioning activate my doublespeak hackles whenever I hear a company tell me we can all just get along.

History

For this particular subject, my skepticism would not be satisfied by force of argument, correspondence to other beliefs I hold, or an historical understanding of how the term evolved in management parlance. I didn’t care how to foster it, nor how some people get it wrong.

What I wanted was results. Business results. The way I see it, if this idea actually produces the results it claims to, then history should be dripping with examples of success.

I think I have learnt enough to put my skepticism aside. Here is the high level summary of my findings.

2012: Google’s Project Aristotle

This is the ‘skin deep’ level of the psychological safety historical record. Initiated at Google in 2012, Project Aristotle aimed to investigate what factors contributed to team effectiveness. The idea was to give researchers access to a large population of teams with good variance in attributes (large - small; colocated - remote; senior - junior; highly paid - low paid, etc.) and a team performance label (low, mid, high performing), to see if they could discern any patterns that emerged.

Informal lore is that Google expected the most effective teams to be a combination of high performers, an experienced manager, and unlimited resources. It turned out that none of those were particularly important.

Here’s what did matter, as copy pasted from Google’s reWork site, which is worth a read:

  • Psychological safety: Team members feel safe to take risks and be vulnerable in front of each other.
  • Dependability: Members reliably complete quality work on time (vs the opposite - shirking responsibilities).
  • Structure and clarity: An individual’s understanding of job expectations, the process for fulfilling these expectations, and the consequences of one’s performance are important for team effectiveness. Goals can be set at the individual or group level, and must be specific, challenging, and attainable. Google often uses Objectives and Key Results (OKRs) to help set and communicate short and long term goals.
  • Meaning: Finding a sense of purpose in either the work itself or the output is important for team effectiveness. The meaning of work is personal and can vary: financial security, supporting family, helping the team succeed, or self-expression for each individual, for example.
  • Impact: The results of one’s work, the subjective judgment that your work is making a difference, is important for teams. Seeing that one’s work is contributing to the organization’s goals can help reveal impact.

It seems compulsory for blog posts to now say what didn’t work, so here they are, copy pasted as well:

  • Colocation of teammates (sitting together in the same office)
  • Consensus-driven decision making
  • Extroversion of team members
  • Individual performance of team members
  • Workload size
  • Seniority
  • Team size
  • Tenure

Ok, fine. Psychological safety is a key contributor to high team performance at Google. But that’s one company, and a really unusual company at that. And what about the other important factors? Dependability, structure, meaning, impact. Those, surely, are as important.

2003-2012: Site Reliability Engineering and DevOps

Consumers think of Google as a search company. Developers think of Google as a scale company. Their stellar growth in the early 2000s was driven by their ability to run things at a bigger scale, with greater reliability, while delivering better quality, than anyone else. They elevated developers to engineers and, in the process, created a new kind of engineer - the Site Reliability Engineer. Their core competence is obvious - it’s in the job title - but the methods they employ to achieve it are, perhaps, not. From Wikipedia:

Common definitions of the practices include (but are not limited to):

  • Automation of repetitive tasks for cost-effectiveness.
  • Defining reliability goals to prevent endless effort.
  • Design of systems with a goal to reduce risks to availability, latency, and efficiency.
  • Observability, the ability to ask arbitrary questions about a system without having to know ahead of time what to ask.

Google actually published a book about SRE, which is incredibly useful - I highly recommend a read.

These practices are very familiar to modern software developers, but in the early 2000s they were completely novel. They are familiar to developers because SRE is now considered a specialised subset of a near-universal software development approach called DevOps. Here are some concepts that appear in almost any organisation that practices DevOps:

  • You build it, you maintain it: The team that develops a feature also maintains it in production. Tight coupling of all parts of the production chain.
  • Kanban: Use of issue trackers to define small, self contained tasks, which are then picked up by developers and completed in a short time window.
  • Continuous integration and delivery (CI/CD): Deployment of code (arising from the tasks above) nearly continuously to production, resulting in continuous refinement of a product, as opposed to big bang, infrequent releases.
  • Automation: Use of automation in building, testing and deployment to reduce human error, increase velocity, and ensure comprehensive review of all changes.
  • Monitoring: Continuous monitoring of development processes to help identify areas for optimisation.
  • Short feedback loops: Maintaining continual feedback to help developers adapt and optimise on a continuous basis.

Ok, fine. But how does SRE relate to DevOps and what does any of this have to do with psychological safety?

Here is my claim. DevOps and SRE have these things in common:

  • Individuals need to think about systems in order to do their jobs well.
  • Processes need to be broken up into well defined, simple, observable tasks, and continual effort is expended in automating as much of the overall process as possible.
  • Teams need to measure the statistical properties of those processes to identify variation. The variation needs to be analysed to identify areas for optimisation.
  • Individuals are encouraged to solve problems they identify on their own initiative in a structured manner, recording the outcome so the organisation can learn from their experiments.

Well, I think that it is almost impossible to do those things well if you are not in a psychologically safe environment. In a psychologically unsafe environment, it is against the interests of an individual to engage meaningfully in any of the above practices. Systems thinking leads to sticking your nose in other people’s business, process simplification and monitoring lead to management by quota, and individual problem solving gets your manager promoted and you fired.

The game theory dynamics make the likelihood of delivering quality software reliably and at scale almost zero if you don’t foster the sort of conditions found in highly effective teams at Google in 2012.
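To make the “measure variation” point concrete, here is a minimal sketch of the idea that control charts formalise. The metric (deployment duration), the numbers, and the function names are all my own illustration, not from any of the sources above: a sample outside the mean ± 3σ “control limits” derived from a stable baseline signals special-cause variation worth a structured investigation; everything inside is common-cause noise.

```python
# Hypothetical illustration: a Shewhart-style check over a process metric
# (deployment durations, in minutes). All numbers are made up.

import statistics

def control_limits(samples, sigmas=3):
    """Return (lower, upper) control limits derived from a baseline sample."""
    mean = statistics.mean(samples)
    sd = statistics.stdev(samples)
    return mean - sigmas * sd, mean + sigmas * sd

def special_causes(samples, history):
    """Flag samples that fall outside the limits implied by history."""
    lo, hi = control_limits(history)
    return [s for s in samples if s < lo or s > hi]

# A stable baseline of deployment durations...
history = [11.2, 12.1, 10.8, 11.9, 12.4, 11.5, 10.9, 12.0, 11.7, 11.3]
# ...and this week's deployments, one of which took 25 minutes.
this_week = [11.6, 12.2, 25.0, 11.1]

print(special_causes(this_week, history))  # → [25.0]
```

The point of the exercise is what happens next: the 25-minute deployment warrants a root-cause investigation, not blame, and - per the argument above - people will only volunteer the information that makes such an investigation possible in a psychologically safe environment.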

1950-1980: Japan

But where did DevOps come from? Did software developers come up with this stuff out of thin air? No! A lot of this thinking actually came from the production lines of big industrial businesses in America, especially the automotive and aerospace industries. When the dot com boom really got underway in the late ’90s, a lot of nascent tech companies found that, while they were sitting on very large untapped markets, they didn’t really know how to manage their production in an efficient manner. And they had a lot of money, so they went shopping for production line and management skill in corporate America.

The best of these managers were just coming off a successful overhaul of American industry following the rise of Japanese industry in the ’70s and ’80s. Innovative companies like Toyota and Sony were completely decimating US domestic competitors in the ’70s, and a large driver of their success was their ability to build things cheaper, to a higher quality, more reliably than their US counterparts. In the ’80s and early ’90s, US management practices went through a quality revolution, adopting management philosophies like Lean and Six Sigma. These philosophies emphasised the following:

  • Process improvement through statistical process control
  • A data driven approach to monitoring
  • A focus on defect reduction
  • A focus on meeting customer needs, and adjusting rapidly as those needs evolve

Can you see how DevOps is actually an industry-specific variant of this management philosophy? Note the heavy focus on abstractions of the production line in things like Lean and Six Sigma. One tends to imagine robots, rather than humans, performing these tasks, and bad implementations tended to fall into that trap. The Japanese management philosophies which influenced these management styles are much more explicit about the role of humans in production. The Toyota Way’s 14 Principles, probably the most famous of these, cover all of the process parts above, but also have this to say about the psychological environment of the production floor:

  • Build a culture of stopping to fix problems, to get quality right the first time. Quality takes precedence (jidoka). Any employee can stop the process to signal a quality issue.
  • Standardized tasks and processes are the foundation for continuous improvement and employee empowerment. Remember Kanban in DevOps?
  • Grow leaders who thoroughly understand the work, live the philosophy, and teach it to others. This principle argues that training and ingrained perspective are necessary for maintaining the organization.
  • Become a learning organization through relentless reflection (hansei) and continuous improvement (kaizen). The general problem-solving technique to determine the root cause of a problem includes initial problem perception, clarification of the trouble, locating the cause, root cause analysis, applying countermeasures, reevaluating, and standardizing.

It is simply impossible for me to imagine a shop floor worker pressing a button to shut down an entire factory floor because they spot a manufacturing defect if they do not feel psychologically safe. Why would they? Sure, you prevent a defective car rolling off the line, but nobody would know you saw it, and your life wouldn’t be improved if you brought it up. You might even be fired, since the financial impact of selling one bad car is far outstripped by the financial impact of stopping a production line.

These management practices, then, depend entirely on the presence of psychological safety.

1939-1945: World War II: Why the good guys won

There is a poetic link between this section and the original reason for this blog post. We tend to believe that the Allies won World War II because of some sort of inherent capacity that democracy harnesses. That people fight harder when they are free, perhaps. But this is not necessarily the case; in fact, I’m not sure any pattern before, say, 1750 bears it out. We like to think that free, open societies are better, and if that is the case, there ought to be an historical record. We also like to think that psychologically safe workplaces are better, and if that is the case, there should be an historical record.

So. Google ate the world in the 2010s. DevOps ate the world in the 2000s. Lean ate the world in the 1990s. Japan ate the world in the ’50s, ’60s, ’70s and ’80s. Where did the Japanese get these ideas? Why didn’t they help them win the war? Well, it turns out they didn’t have these ideas during the war. They were introduced to Japan by an American statistician named W. Edwards Deming, who first travelled there in 1947, contracted by the US Army to consult with the Japanese census office, and returned in 1950 to give seminars on a US Army production method called Statistical Process Control (SPC) to the newly formed Union of Japanese Scientists and Engineers (JUSE). These ideas were wildly successful in Japan - to the point that the most coveted national prize for production excellence in Japan is called the Deming Prize. The core ideas of SPC form the bedrock of the statistical component of management styles like the Toyota Way.

The reason that SPC is so process- and statistics-focused is that it came from a WWII US Army contractor specification document. However, in the training seminars Deming gave young engineers before and after the war, not much time would be spent on those methods. Rather, he would emphasise that a poorly performing factory was never the fault of the workers. He would claim that worker behaviour is driven by the system they find themselves in. If a worker doesn’t care about their work, it is because they exist in a system that penalises them for caring about their work. If workers do not take initiative, it is because the system they find themselves in discourages initiative taking. And the system is defined by management. Therefore, the cause of poor factory performance is almost always the fault of management. Ding ding ding! SPC is not viable without psychological safety.

The American War Standards defined what US Army contractors needed to adhere to in order to qualify as suppliers to the US Army during World War II. Deming was on the committee that drafted the standards, which in effect required all contractors to implement SPC in their production processes. Since the entire US economy was reconfigured to support the war effort for the whole Allied side, this amounted to the implementation of SPC across virtually the entire manufacturing base of the US.

The results were spectacular. It is often claimed that the Allies did not win the war because they fought harder, or because they lost more - they won because they outproduced the Axis powers. US industry produced more tanks, guns, boots, food and airplanes than anyone else, and that meant the Allies were better able to project force in remote theaters than the Axis. And this was done with a workforce that was almost completely untrained in 1939. The US military had over 8 million personnel at the peak of the war, and the vast majority were physically fit men who had been the core of the US industrial workforce before 1939. Their vacant jobs were filled by women and people of colour, who were barred from serving. They were trained minimally and put on the job almost immediately, because the demand was so great. But the system they operated in was governed by the American War Standards, and, by extension, they operated under statistical process control.

Here are two possibly apocryphal stories to illustrate the point. The first was that the Nazi rank and file became convinced of defeat before the leadership because, as early as 1944, when Nazi counteroffensives would retake Allied positions, infantry would frequently discover American beer in the Allied dugouts. The fact that the Allies could supply their troops with an abundant luxury from halfway across the world while the Nazi infantry frequently went days without bullets or food meant that defeat was inevitable.

The second actually has a component of truth. During the war the US Navy operated a fleet of ice cream barges to supply their men with ice cream. This was because alcohol had been banned on US Navy ships since before WWI, and they thought ice cream could serve as a morale boosting substitute. That part is true, and here is the source. It is said that when US warships started to routinely enter Japanese domestic waters, the Japanese military refused to consider the possibility of defeat. But, when the Ice Cream Barges arrived, they knew it was a lost cause.

I forgot about D-Day!

Oh, I forgot about D-Day! Well, the largest amphibious invasion in human history was conducted with ships, tanks, boots, helmets, rifles, water bottles, radios and food all made under an industrial system governed by the American War Standards; those standards relied on Statistical Process Control, and SPC does not work without psychological safety.

Wrapping Up

If you made it this far, you probably found the post interesting. I am satisfied that psychological safety is an essential part of an effective work culture. Not because it is highly compatible with prevailing beliefs about human flourishing, but because it is a necessary condition for the success of the most effective production systems ever devised.

What I take from this is that psychological safety is not an end in and of itself. It is not even sufficient. It is a core component of a larger system of management practices that have a long lineage, and that still stand as the most effective way to consistently deliver quality at scale.

Deming had a very long career after Japan, and continually refined his ideas. Shortly before he died in 1993, he settled on four fundamental pillars of knowledge, which he thought were universally applicable to the management and improvement of any system. They are:

  • Appreciation for a system: Understanding that organizations are complex, interdependent systems, and that actions taken in one part of the system can have unintended consequences in other parts.
  • Knowledge of variation: Recognizing that variation is inherent in all processes and learning to distinguish between common cause variation (natural, stable variation) and special cause variation (unusual variation indicating an issue).
  • Theory of knowledge: Applying a scientific approach to problem-solving, including understanding that theories are tentative and should be tested and revised based on evidence.
  • Psychology: Understanding that individuals are motivated by intrinsic factors and that effective management systems should foster these intrinsic motivators, rather than relying solely on extrinsic rewards.