The dangers of valueless security risk
• Storing endless amounts of valueless data is an unnecessary business expense.
• It’s also potentially a security risk, because valueless data tends not to be well protected against hackers.
• How can you mitigate the security risk of such endlessly stored, never-used data?
Our modern business world, more than that of any previous era, is data-driven. In many cases, data is the real value proposition of a business – from vast social media platforms to competitive widget-making firms in the middle of nowhere. But where there is a vast amount of stored and historic data, there’s also a security risk. Just because the company that initially collated the data for its own purposes has finished wringing value from it, that doesn’t mean the data has lost its potential value to brokers and bad actors.
We spoke to Terry Ray, SVP at Imperva, about how much of a security risk was inherent in data thought of by companies as being “dead.” He defined it as being a “valueless security risk.” We wanted to know more.
Define a valueless security risk for us? What do we actually mean by that?
It’s a security risk that comes from data that isn’t necessarily valueless to everybody, but that no longer has value to the collecting organization.
For example, my name, address and phone number are going to be as valuable to an organization today as they were five years ago, and as they will be five years from now – they’re ways the company has of contacting me, a once and future customer.
But where I shopped, the things I bought, even my credit card number from five years ago, things that I bought for other people, my healthcare information that was valuable while I was being billed, that’s all data that probably isn’t really all that valuable, until maybe I need it for some other procedure down the road.
So the point is, valueless data is only valueless relative to the people who have the data in their systems. But there’s a lot of it. Organizations have vast volumes of data that was used at one point, but they never really figured out who owns that data, how the data was really used, and whether or not they can in fact delete it.
It’s very rare for an organization to say “I don’t know who owns this data, and I don’t know how important it is, so I’m just going to delete it and see if somebody screams.”
They don’t do that. Nobody does that, because of the risk that if it was valuable and you flushed it, you just cost the company potential value for the sake of your own convenience.
The drawer of miscellaneous data.
So to use a domestic example, companies run their data policy like a “miscellaneous drawer”? That drawer into which you put random buttons and pieces of string, rather than throwing them away, because “they might be useful one day,” and you’re scared that the moment you throw them away, you’ll suddenly have the perfect use for a random button and a piece of string?
Exactly that. Except companies rarely if ever open the drawer and do inventory of their growing button-and-string collection, to even see if there’s anything useful in there.
And the point is, there may be people out there who can make illegal money out of the contents of your drawer – and who are prepared to go to some lengths to steal it and monetize it. That’s your valueless security risk.
Got it. So why are companies still leaving themselves open to that risk? What’s the impulse in businesses to just keep hold of data? Is it just that impulse of saying “I don’t know who else has got copies of this, or when it might be useful again”? Or is there more to it than that?
Well, there are a few things, to be fair. Each individual region, each individual industry has its own data retention requirements, so in the US, a lot of the data retention requirements around healthcare, for example, are usually 7 years, and can be anything up to 21 years for paper-trail data. So at least for paper-based data, we have these huge libraries of data, despite so much of that data having now gone electronic too – in duplicate. So there is that factor of how long companies are legally required to keep information.
But that’s just the regulated data. And because it’s regulated, organizations tend to prioritize their controls when it comes to data security around it.
The ever-growing data issue.
The corollary to which being that organizations don’t prioritize unregulated data?
Right. If you’re not guaranteed to have an audit on it, it’s just not as important to the organization.
That means they end up with two separate batches of data – regulated data and unregulated data. The unregulated data is where that “Can I delete this? Better not, you never know” mindset comes in. Because, as with your string and buttons, I guarantee you, the second you delete it, it’ll turn out to have customer information in there, and now someone’s asking you about this customer, and you want to go back and help them understand the history behind that customer.
So you don’t delete it.
Any of it.
The point then is that unless someone owns that data (and the data owner within a company is a hard thing to find), and that person tells me we don’t need that data anymore, I’m not going to delete it. I’ll archive it in long-term storage, and the chances are it’ll never be needed or looked at again. It becomes, to the organization, valueless data, being retained potentially until the end of human civilization, unexamined.
In the Limbo Drawer of Useful Items.
Exactly. And the other challenge that these organizations run into is where they put their controls. Which translates as the ways they have to go look for this data. Usually, once it gets archived into long-term storage, it’s no longer indexed into corporate search functions.
Where’s the security risk?
So it starts to raise the question: if I’m going to dump it into long-term storage that’s really hard to index, and somebody really was looking for that data because, for example, they suddenly had need of a really interesting button and a perfect piece of string, to what lengths would they go to find it, versus just giving up and saying “You know what, I’m not going to spend the next three months hunting for this piece of information. It’d be faster for me just to go find another avenue to answer whatever question I need to”?
Same as with the items in the Limbo Drawer of Useful Items. If you forget which drawer it is, and it’s probably in the attic or the basement behind a whole bunch of heavy items you’ll have to move to get at it…
You’re gonna go and buy a brand spanking new button.
Right. So what you have is a valueless drawer of buttons. Or a valueless data security risk.
Whereas if we apply the right controls, it gives us more information and more ability to dig into that data. For example, when we look at data security – data monitoring, data classification, discovery, those kinds of things – if I’ve done a data classification project, and then I go and look for certain exact types of data, then whether I archived that data or not, I still know what server it was on, I still know what format it might have been in, and I might know something close to the exact file name, or the exact table or column in which it existed in the database. At the very least, I have a general idea of where it might be.
And more importantly, files are interesting. And they’re also nice enough to tell you the last time somebody actually accessed or modified them, so that’s helpful. So even if you wanted to do retention, you could say, just archive everything that hasn’t been touched in a decade.
Maybe that makes me feel safe – if it hasn’t been accessed in a decade, we can get rid of it, because clearly, no-one’s desperate to look at it.
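As a rough illustration of that “not touched in a decade” rule, here is a minimal Python sketch that walks a directory tree and lists files whose last access and last modification are both older than a cutoff. The function name `find_stale_files` and the ten-year default are assumptions made for this example, not anything Imperva describes. Note too that many filesystems are mounted with options such as `noatime` or `relatime`, so access times may be less precise than the interview suggests.

```python
import os
import time
from pathlib import Path

# Approximate year length; precise enough for a retention sweep.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def find_stale_files(root, years=10, now=None):
    """Return files under `root` not accessed or modified in `years` years.

    Hypothetical helper for illustration only. It reports candidates;
    it does not archive or delete anything.
    """
    now = time.time() if now is None else now
    cutoff = now - years * SECONDS_PER_YEAR
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                # Unreadable or vanished file: skip it rather than guess.
                continue
            # A file counts as "touched" at its most recent access or change.
            last_touched = max(info.st_atime, info.st_mtime)
            if last_touched < cutoff:
                stale.append(path)
    return sorted(stale)
```

Calling `find_stale_files("/path/to/share")` would give a list of candidates to review with whoever might own the data; treating the output as a review list rather than a delete list matches the caution described throughout the interview.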
Databases, sadly, don’t tell you the last time an individual record was touched, so you have to use some kind of technology that says “I’m watching what happens in your files, I’ll tell you, I can validate that their dates are the same as our dates.”
Touch my data and I’ll know.
My third-party product, sitting here watching all use of it, can tell me “I have no record of anybody touching that table in a year.” Or “I know that Terry, in fact, touched that table just last week.”
So then you maybe go ask Terry what was in it, and why he opened it. Is it important data? It gives you this inkling of information to go and start to find an avenue and say “We’re keeping the server but we want to clean it up. Can we do that safely? Can we start to delete some of this data?” Otherwise, it just grows and grows.
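To make the monitoring idea concrete, here is a small Python sketch of the kind of report such a tool might produce. It assumes access events have already been captured somewhere as (user, table, timestamp) records; the `AccessEvent` type, the `access_report` function, and the table names in the usage note are all hypothetical, and no particular product’s API or log format is implied.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AccessEvent:
    """One observed access to a table (hypothetical record shape)."""
    user: str
    table: str
    when: datetime


def latest_access(events):
    """Map each table name to its most recent AccessEvent."""
    latest = {}
    for event in events:
        current = latest.get(event.table)
        if current is None or event.when > current.when:
            latest[event.table] = event
    return latest


def access_report(tables, events, now, window=timedelta(days=365)):
    """Describe, per table, who touched it last or that nobody has recently."""
    latest = latest_access(events)
    lines = []
    for table in tables:
        event = latest.get(table)
        if event is None or now - event.when > window:
            lines.append(
                f"{table}: no record of access in the last {window.days} days"
            )
        else:
            lines.append(
                f"{table}: last touched by {event.user} on {event.when:%Y-%m-%d}"
            )
    return lines
```

Given an event for user `terry` on a table called `orders` last week, and no recent events for a table called `old_claims`, the report would name Terry as the person to ask about the first and flag the second as a cleanup candidate.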
And of course, the Oracles and Microsofts of the world are happy to sell you more file space and more storage and more file protection, and everything that goes with it. And before you know where you are, you have quite the bill to safely store data about which you have no idea whether it’s relevant, vital, or useless to you.
Paying for extra space in the Drawer of Useful Items that you’re never, ever going to open.
Yeah. That’s all it is.
“Fools! Bureaucratic fools! They don’t know what they’ve got there.” Raiders of the Lost Ark exemplifying the problem of data without management or security.
In Part 2 of this article, we’ll look at why storing your valueless data ad infinitum is not just an expense but also a potential data security risk.
22 February 2024