What makes one AI testing strategy succeed where another fails? Is it the technology, the people, or the funding behind it, or is it something deeper: something about how we think about strategy itself? In today’s fast-moving landscape of AI integration, the distinction between good and bad strategies is not just a matter of preference. It is a matter of survival.
But what truly defines a good strategy in AI testing? And how do we recognise a bad one before it’s too late? Drawing inspiration from Richard Rumelt’s ideas, we will begin by clarifying what the word “strategy” really means, beyond the buzzwords and lofty goals. Many organisations mistake ambition for direction and confuse lists of actions or aspirations with coherent strategic thinking. The result is often what Rumelt calls a “bad strategy”: one that hides behind slogans, ignores the real problem, and substitutes wishful thinking for diagnosis.
So why do even smart, capable teams fall into this trap? What happens when companies chase “AI transformation” without first understanding the systems, data, and operational realities they are dealing with? And what are the consequences when testing becomes a vibe check rather than a means of discovery?
In contrast, what does a good AI testing strategy look like when applied to a complex system? How can it align with enterprise objectives, foster accountability, and build operational resilience? Most importantly, how can it balance the need for speed with the discipline of evidence-based verification, especially in a world where AI’s behaviour is probabilistic and constantly evolving?
Throughout the session, we will explore these questions through the layers of verification that define large-scale AI implementations. I will share concrete examples from practice that illustrate both the pitfalls of flawed strategies and the power of sound, principle-driven ones.
By the end, the challenge will be clear: is your AI testing strategy built to impress, or is it built to endure?
(This topic fits either the AI Governance & Trust or the Human-centered AI Testing stream.)