Big Data: The Cheetah Problem – From Data Sharing to Data Collaboration
There’s a sea change happening in the business world. From the old days of data silos and distinct but often untrackable asset ownership and management, we’re seeing a shift towards data sharing as a way of extracting more understanding and profit from the data any company has at its fingertips.
Matthias Nicola, Field CTO at Snowflake (an innovative cloud company) challenged the limitations of such a shift in his keynote speech to the Big Data London conference on September 21st.
Behold The Cheetah!
Nicola likened businesses to wild African animals – and in particular the relationships between their behavior patterns and the likelihoods of their offspring surviving to maturity.
The likelihood of cheetah cubs surviving to adulthood, he explained, was 1 in 20 – because female cheetahs live and hunt alone, and whenever they hunt, they have to leave their cubs (the cherished genetic data-bundle in which their direct future is invested) unprotected to both the elements and potential predators.
The likelihood of lion cubs surviving to maturity, he said, was roughly four times better – and a large part of that could be explained by the fact that lions live and hunt in prides (a life that is not just shared but collaborative – hunting together, with bodies left over to watch the cubs while the prime killers go about their work of providing meat).
It may well strike you as a clunky metaphor – but as soon as you understand it is a metaphor, its clunkiness entirely ceases to matter.
Sharing Is Caring – For The Bottom Line
The point is that, generally in nature, having partners in an ecosystem who are prepared to share the responsibility of protecting precious data-bundles improves the odds of survival – and that the same works in business, if only companies have the courage to adopt a leonine lifestyle, rather than splendid, cheetah-style isolation, when it comes to their data.
“Data sharing can bring you to a point of three times the success of data hoarding or siloing,” explained Nicola, and yet, he added, a survey run by Snowflake to find the scale of data sharing and collaboration had revealed that a bare 55% of business leaders were even confident they knew where all their data was. He made no reference to the percentage of those who were in fact incorrect in their assumption that they knew the location of all their data assets (despite the fact that practically every firm on the planet will have dark data in its system – data created, copied, or shared to clouds as a way of solving particular problems and then abandoned, without the company itself ever having been aware of its existence).
What’s more, only 45% said they were even able to share their data with their partners, even though the evidence shows that those who do share it broadly tend to achieve better results than those who don’t.
Nicola dropped a bomb of perspective in the hall, saying that to be as effective as possible, the business world needed to break its traditional mindset of data asset hoarding, and make what is clearly going to be quite the journey to a sharing-first model.
“Data sharing is a business necessity,” he said, encapsulating the idea.
If It Were Easy, Everybody’d Be Doing It
It wasn’t, however, going to be anything like easy to move to a sharing-first model, he acknowledged.
“There are a lot of common inhibitors to moving to a sharing-first model,” said Nicola. Internal silos are always possible, and given the traditional ways in which businesses have worked for at least the last 30 years, they’re even very likely. It’s arguable that the post-pandemic continuation of remote working patterns (and even some relatively new device management strategies like Bring Your Own Device) maintains and reinforces a silo-spawning mindset, with people saving copies of documents locally rather than to common shared drives.
The business culture where knowledge and data access have frequently been equated with rank or status is another significant barrier to abandoning a data-hoarding mindset – especially in businesses where no alternative marker of status is provided as a carrot to lead people into giving up their ‘data privileges.’
Determining ownership of data can be another obstacle, particularly in companies with a long legacy and/or relatively fast staff turnover, where ownership of particular data may either never have been established in the first place, or may have long since faded into obscurity.
There may, in addition, be a lack of trust in the data to be shared – if it’s old, or vague, or incomplete, those with whom you’re trying to share it may prefer an up-to-date estimate of their own over the scrappy inexactitude of your technically genuine data.
Your data may be subject to security or governance challenges, which will tend to slow down – or even stop dead – the sharing process.
And of course, crucially in business circles, there may be both an implied business cost and very real bottom line costs to sharing your data.
Any one or all of these factors might intervene to stop a company wanting to embrace the sharing-first model, Nicola acknowledged.
Outsurviving The Cheetah People
By way of a subtle sales pitch, he also mentioned that data sharing can be quite difficult to do, in terms of moving data from one environment to another, and maximising the usefulness of shared data – Snowflake have a product that can make it easier, and in fact deliver insights that would otherwise be unavailable due to PII (Personally Identifiable Information) constraints.
But the point is that by not only sharing but actively collaborating on data, businesses can get significantly ahead of both the game and the cheetah-model companies who try to do everything themselves, while hoarding their data-babies.
“Who do you actively collaborate with in terms of your data?” he asked. “Do you collaborate with your resellers? Your customers? Do you reach out and collaborate across the industry?”
If you’ve ever questioned the importance of big data to your business, you may be failing to appreciate what data sharing, and particularly data collaboration, can do for you. But importantly, Nicola finished on a note of digital transformation. “The value of a technological solution – like ours – is often limited by the inability of the people in a system to change their processes. So if you want to get the most out of data collaboration, there’s an educational process that needs to happen,” he said. “In fact, you need to teach people who’ve potentially lived their whole business lives as cheetahs, to behave like lions.”
Feel free to cringe at the cheetah-cancellation if you like, but it’s worth remembering that any shift towards data collaboration is a matter of both technology – which already exists, and can be relatively easily deployed – and the psychology and data culture of your company, which is a whole different, more complicated ball game.
21 September 2023