In my role at Mesh-AI, one of the most frequent questions I get from prospective clients is: “How would you go about building a data science function from the ground up?”
Well, building a data science capability is a tricky proposition that can go wrong in many different ways.
The issue is that getting value from your data science team is like trying to satisfy Goldilocks’s taste for porridge. Everything’s gotta be juuuuuuust right, otherwise you won’t get anywhere near the business value out of it you should.
And you’ll be eaten by bears!
I’ve seen a lot of companies make critical mistakes when building out their data science programs. In this blog, I’m going to explore the five things you need to know to be successful in this endeavour.
Let’s jump in.
Don’t try to go from zero to INFINITY.
It’s simply not realistic that your business will be able to go from nothing to deploying real-time deep learning models overnight.
And if you aim too high, too soon, you will overextend, your foundation will be wobbly and the whole data Jenga tower will come crashing down.
Instead, you must start small and grow as organically as possible.
Rather than thinking in terms of grand future visions, consider what it is that you actually need today. And start from there: from right where you are!
Many businesses assume that if they build an amazing data science platform, business value will magically materialise.
But they haven’t considered what they will actually use the platform for!
If you build a platform without considering the use cases that it will drive, then there’s no way you can build it in a way that will meet your actual needs.
This would be like kitting out a fancy kitchen with high-end appliances without considering the kind of food you want to make. A spaghetti maker is of no use if you want to make a sandwich!
You’re much better off building your data science capability in parallel with the development of use cases. In this way, you know that what you’re building will have a purpose, you won’t overinvest in things you don’t need, and you can iterate over time.
The key point is this: technology should serve business decisions, not the other way around!
Rather than starting by thinking about what platform or team you want, a better way of bootstrapping your data capacity is to think about business opportunities that data science could help with.
Start with a compelling event: a known business problem or opportunity that you know you need to respond to.
It need not be a massive issue, only a well-defined and practical one. Like using real-time pricing to incentivise purchases and increase lifetime customer value, for example.
There are many advantages to this. Firstly, by choosing a ‘compelling’ event, you know it’s of value to the business. Secondly, having a single problem space gives clear boundaries to the work, so you can clearly measure success. Thirdly, an early win here can help build confidence and incentivise the business to invest further in growing the capability.
The trick is then to build out your data capability to support this specific compelling event. This gives you some real-life experience to work with, allowing you to tweak your approach and scale it across the business over time, without investing in things you don’t need.
If you hire a few data nerds, throw them in a room by themselves and say “optimise my business!”...very little is going to happen.
They might tweak a few models here and there, but whatever they do will be fundamentally disconnected from your business objectives and decision-making.
Your data team needs to participate in your business: to be involved in decision-making, work with engineering to implement tooling and technology, and speak to customers to understand their needs and how they are using your products.
Often business people don’t know what is possible, and data scientists don’t know what is needed! The business value comes from their interaction and mutual support.
So you need to understand what you want your data scientists to do and how you want them to work with your business. This should be explicitly defined in your operating model.
For example, rather than having an isolated research team, you could have a hub-and-spoke model, with data scientists seeded onto all the different product teams.
There’s a temptation to hire the baddest data gurus under the assumption that they are what your business needs.
But you probably don’t need people with PhDs, especially early on.
Instead, I would give some serious thought to what it would take to actually get things done in your business and the kinds of problems you want your prospective hires to be working on.
Use that as your starting point for writing job descriptions, rather than specifying a long list of techniques and technologies.
In general, I would err on the side of favouring people with practical experience over academic knowledge or pedigree. You can’t get a degree in experience!
I would also favour those with high potential over those who have a lot of skills with a niche technology. You can teach them the tools, but you can’t increase their potential!
These would be my key considerations if I were building a data science team from scratch in a large enterprise.
The key is to keep coming back to what the business actually needs now, to build capacity around clear business use cases, and to think carefully about how your data scientists can interact with your business in a way that increases the likelihood of success.