As an engineer, I want to stay in the game in the post-hype Gen AI world. This technology, as with all others, will go through a boom and bust cycle, also known as the hype cycle (I spoke about this at the AWS Generative AI Deep Dive Days in July this year).
In short, the cycle includes a technology trigger (i.e. the inception of the technology), peak hype, the trough of disillusionment, and then a period where the technology rises towards some level of stability and, hopefully, success.
Arguably, Gen AI is at peak hype right now, if not already starting to roll down towards the trough of disillusionment. This raises the question: how do we survive and thrive to reach the plateau of productivity? That's the goal over the next couple of years. In this article, I will share some observations from the gemba of Gen AI - where it actually happens - and take you through a high-level framework with some examples to boot.
As engineers, it's very easy for us, very human of us, to have our worldview and biases. In our case, we see shiny things and sometimes pursue them aggressively, without necessarily seeing the broader context in which the shiny thing exists.
This can lead us into a situation we might not want to be in. To protect against that outcome, we need to be vigilant, cautious and nimble. It's important to acknowledge that we have blind spots, which is also the first step in addressing them.
The biggest blind spot in tech and data is simply business value. There are so many foundational layers required to reach AI-driven results that business value - the foundation of all the other foundations - is often overlooked by those who are intimately involved in developing the solution.
The risk is that we become cost centres instead of what we set out to be, which is centres of change. The bigger risk is that we see another AI winter in which organisations choose not to engage in the conversation because they felt burnt, which happened in my living memory.
Mitigating this blind spot, as it relates to AI-driven applications, starts with design and ends in deployment. One could claim this is true of all tech. However, it differs in two key ways:
With regard to my first point, it's essential to focus on the problem space while seeking the right use case.
In contrast, practitioners are too often guilty of thinking first of the use case - pivoting head first into a POC wall only to pop out the other side with a solution. That flow is far too simplistic, and yet it persists in our popular consciousness. In effect, this is solutioning without robustly understanding the problem space.
Moreover, it implies we want to arrive at a very specific destination while starting from a very specific point. That is difficult and increases risk. It also fails to consider that a “use case” is part of a broad problem space.
To minimise that risk, and increase the likelihood of success, we need to operate more fluidly within that problem space while categorising challenges. From that point, one can then prioritise opportunities, followed by solutions and use cases.
Without this approach, your culture of experimentation has nothing to lean on and nothing to iterate on. You remain narrow.
The most successful use case might not stem from the first opportunity but rather from its neighbour. And if you haven’t done the work to know your neighbours, you won’t be able to turn to them in an hour of need, let alone cultivate the culture of experimentation so desperately required when running a data science team.
I’ll touch on the structure of a data science team momentarily, but first, let me share some lessons from the gemba in identifying valuable opportunities.
As Gen AI has become mainstream, a trend has emerged similar to the adoption of machine learning. About 80% of the use cases that drive value are bound, which is in stark contrast to agentic workflows.
As background, ‘agentic’ refers to an advanced approach that combines generative capabilities with autonomous, goal-oriented behaviour. It enables AI to operate with greater independence and effectiveness in complex environments. In turn, it is more complex, less established and comes with higher risk. For example, Bumble’s CEO famously described an agentic workflow: your profile’s AI (i.e. an “AI data concierge”) will date other AIs to identify suitable matches. It’s worth pointing out that she very quickly acknowledged that this is an “out there” idea, and I will add that it is very much like a recommendation system, although how it reaches its output is very different from collaborative filtering or a Boltzmann machine. Importantly, it’s not really proven yet.
In contrast, bound use cases forgo agents as well as chat UIs. Instead, they favour a more “traditional” approach in which single chain outputs and embeddings are used the most.
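To make the idea of a bound, embedding-driven use case more concrete, here is a minimal sketch of matching an incoming request to a fixed set of categories by cosine similarity. The function names, category labels and hand-written three-dimensional vectors are all hypothetical; in practice the vectors would come from whichever embedding model you use, which is not specified here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query_vec: list[float], catalogue: dict[str, list[float]]) -> str:
    """Return the catalogue label whose vector is most similar to the query."""
    return max(catalogue, key=lambda label: cosine_similarity(query_vec, catalogue[label]))


# Assumption: toy vectors standing in for real embeddings of each category.
catalogue = {
    "refund request": [0.9, 0.1, 0.0],
    "delivery delay": [0.1, 0.9, 0.1],
    "account access": [0.0, 0.2, 0.9],
}

# Assumption: toy vector standing in for the embedding of a customer message.
query = [0.8, 0.2, 0.1]

print(best_match(query, catalogue))
```

The output is always one of a known, finite set of labels, with no agent and no open-ended chat. That constraint is what keeps the behaviour controllable and easy to measure.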
Bound is currently driving the most value, especially within enterprises, which strive for greater control, less complexity and lower risk. When considering the impending trough of disillusionment, it's crucial to ask two key questions to figure out if a bound or agentic approach is the right path.
Is the desired customer behaviour narrow? For example, do you want the customer to select something from a menu or do you want them to comprehend why a recipe is structured in some way? If the former, a bound use case makes sense.
And does that behaviour require them to perform a kind of internal exploration? For example, do you want the customer to convey symptoms, or to reflect on the internal dialogue that we all have? If the former, a bound use case is again the better fit.
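The two questions above can be sketched as a small decision helper. This is only an illustration: the class, field and function names are hypothetical, and the handling of mixed answers (flagging them rather than deciding) is my assumption, since the questions only settle the clear-cut cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCaseProfile:
    """Answers to the two questions for one candidate use case (hypothetical names)."""
    name: str
    behaviour_is_narrow: bool         # e.g. select something from a menu
    needs_internal_exploration: bool  # e.g. reflect on one's own internal dialogue


def suggest_approach(profile: UseCaseProfile) -> str:
    """Suggest 'bound' only when both answers point that way.

    Assumption: anything else is flagged as 'consider agentic' for further
    discussion rather than treated as a firm recommendation.
    """
    if profile.behaviour_is_narrow and not profile.needs_internal_exploration:
        return "bound"
    return "consider agentic"


menu = UseCaseProfile("menu selection", behaviour_is_narrow=True, needs_internal_exploration=False)
journal = UseCaseProfile("reflective journalling", behaviour_is_narrow=False, needs_internal_exploration=True)

print(menu.name, "->", suggest_approach(menu))
print(journal.name, "->", suggest_approach(journal))
```

A helper like this is mainly useful as a prompt for conversation in design workshops, not as a substitute for understanding the problem space.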
Let me give you a few examples where bound use cases are driving value:
Beyond being bound, the common factor in all these examples is that they drive some form of value. If you remove that, their validity is in question, and AI initiatives risk being labelled as a cost centre.
That risk can easily be navigated if solutions are immediately useful to the team developing them, if success can be articulated through an improvable metric, and if the solution is foundationally important - verging on the mundane (a point YC partners make in this commentary, which is valuable to non-startups as well). The examples above speak to these points respectively.
As someone who's been in the trenches of AI and ML implementation, I've seen firsthand the struggles organisations face when trying to scale AI & ML projects beyond the design phase. The key lies in building a cross-functional data science team that can embrace a robust MLOps worldview - a practice and culture empowered by technology and data.
In my experience, this team needs to bridge the cultural divide between data scientists and engineers, amongst many other roles. Seven crucial roles form the backbone of a successful data science team: data engineers, data analysts, ML engineers, data science leaders, statisticians, domain experts, and software engineers.
It’s essential for these roles to collaborate throughout the entire ML lifecycle. Too many organisations fall into traps like building oversized teams or allowing a "that's not my job" attitude to persist - surefire ways to hinder progress.
At the heart of this team structure should be a strong data science leader. This person acts as the glue, the traffic control, between core data capabilities, application capabilities, and the business. They're not just a figurehead, not just a product person, not just an engineer - they're the key decision-maker who embodies the best qualities of various roles and can effectively bridge the divergent worlds.
Successful MLOps isn't just about implementing technology; it's about driving transformational change through practice and culture, i.e. people and processes. Empowering these data science leaders is crucial to fostering a high-performing, data-driven culture. However, this empowerment needs to come from the top - C-suite support is essential for these leaders to truly make an impact.
Understanding and leveraging bound use cases can help navigate the trough of disillusionment and achieve long-term success with Gen AI. By focusing on driving business value and carefully considering user behaviour, we can create impactful AI solutions that stand the test of time.