Data security starts with discovery and audit

First, define your jewels...
4 October 2022

At the Big Data London conference in September, we sat down to talk data security with Terry Ray, SVP of security solution experts, Imperva. We ranged across subjects like the difference between cybersecurity and data security, and got inspired by Terry’s “Complete Awareness” approach to data security. When it comes to the ‘crown jewels’ of corporate data, step #1 is knowing what they are, where they are, where else they might be, and who might have access to them. If you’ve ever blanched at the complexity of mapping data movement through a system, read on…

THQ: So, considering the fact that “cybersecurity” is the watchword of every company everywhere right now – what’s the difference between cybersecurity and data security? And why does the difference matter?

TR: The answer depends on who you ask. Cybersecurity is a kind of a catch all for everything. It can include network security, endpoint security, application security, whatever. So, it includes everything. Even data security, if you drill down into its silo, will give you different answers depending on who you talk to and what their priorities are as regards that data. Some people will also include the area of information security, because what’s the difference? But it often doesn’t really fit in that world.

THQ: OK. Imagine we asked Terry Ray?

TR: In my opinion, data security includes all of those things that go together to prevent exposure and breach of critical assets like data. What that means is, it could be something as simple as encrypting your data, although encryption honestly doesn’t go that far. Then further along the line, you have tokenization, masking or obfuscation of location, and transforming the data. All three of those sit at the same level: manipulating the data in a way that makes it unreadable.
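To make two of those techniques concrete, here is a minimal Python sketch that is not from the interview. The `mask` and `tokenize` helpers and the `SECRET_KEY` value are hypothetical, and the keyed hash is only a stand-in for tokenization: production tokenization usually relies on a secured token vault rather than a hash.

```python
import hashlib
import hmac

# Hypothetical key for this sketch; a real system would load it from a key vault.
SECRET_KEY = b"demo-key-not-for-production"


def mask(value: str, visible: int = 4) -> str:
    """Masking: hide everything except the last `visible` characters."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def tokenize(value: str) -> str:
    """Tokenization stand-in: a keyed hash maps the same input to the same token."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


card = "4111111111111111"
print(mask(card))      # only the last four digits stay readable
print(tokenize(card))  # an opaque token that can replace the value downstream
```

Either way, the point matches what Terry describes: the stored or shared value is no longer readable as the original data.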

But then you shift gears. What about access management, or Identity Access Management (IAM)? That’s authorization. Am I authorized to access this data? That often fits into data security as well. But the reality is, the overwhelming majority of data breaches and data exposures are from completely authorized users. Greater than 90% of the time, it’s completely authorized users that are responsible for breaches. That’s a fact that raises a question about IAM: is it part of data security at all? Because it’s not really securing anything.

The Big Five Questions

So ultimately, data security comes down to a handful of questions. Do I know where my data is? Yes? Great. Do I know which of my data is sensitive in some way, and which of my data is not sensitive in some way? If we know that, then we look at access to that data. Who is accessing it? When are they accessing it? What are they accessing? And where security becomes important is when you take the answers to all those questions and ask one more question. Should that person access that data?

I mean, that’s probably leaving out another ten questions in the interrogative process, but you get the idea. There are a lot of different ways that people can say “I have data security.” And so, when I talk to those people, I get them to answer their own questions. If you can answer questions like “Do you know what your private data is?” and “Do you know where your private data is?” you might be in a good position.

At least for a moment. Question 3 – do you know whether your private data could be anywhere other than where you expect it to be? That’s really knowing where your data is. Get those three questions answered and congratulations – you know where your data assets are.
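The first three questions amount to a discovery scan. As a rough illustration that is not from the interview, the sketch below checks a few made-up data stores for two kinds of sensitive values and reports any store holding them that was not on the expected list. The store names, the `PATTERNS` table, and the `scan` and `unexpected` helpers are all assumptions, and real classifiers cover far more than two regular expressions.

```python
import re

# Illustrative patterns only; real discovery tools recognize many more data types.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan(stores):
    """Return, for each store, the kinds of sensitive data found in its rows."""
    findings = {}
    for store, rows in stores.items():
        kinds = {
            kind
            for row in rows
            for kind, pattern in PATTERNS.items()
            if pattern.search(row)
        }
        if kinds:
            findings[store] = kinds
    return findings


def unexpected(findings, expected):
    """Stores that hold sensitive data but are not where we expected it to be."""
    return sorted(set(findings) - set(expected))


# Hypothetical stores and contents, made up for the example.
stores = {
    "crm_db": ["jane@example.com, 123-45-6789"],
    "marketing_export.csv": ["contact: bob@example.com"],
    "build_logs": ["compile ok", "tests passed"],
}

findings = scan(stores)
print(unexpected(findings, expected={"crm_db"}))  # prints ['marketing_export.csv']
```

The forgotten export is the answer to Question 3: private data sitting somewhere other than where you expected it.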

Now, answer question 4. Do you know what Terry accessed?

Man Hunt

You’re only given Terry, one user, somewhere in your company. What did he access?

If you can tell me every file Terry ever touched, every database he ever accessed, every data warehouse he got into, you have a really good chance of achieving data security. Because you know where the users are, you know what they’ve used, seen, touched, altered – at any point. And then you multiply that by Terry, and Sarah, and Gwyneth in the London office, and so on, and so on. And you should be able to do that in multiple directions – show me the entire history of the Johnson file, that was opened on February 2nd, 2016. Show me everything that was ever done on it. If you can do that, you have data security on the Johnson file. And so on.
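Being able to ask in “multiple directions” suggests an access log that can be queried both by user and by asset. The following is a minimal sketch under that assumption, not something described in the interview. The `AccessEvent` record, the `by_user` and `by_asset` helpers, and the sample events are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccessEvent:
    user: str
    asset: str
    action: str
    at: datetime


def by_user(log, user):
    """Everything one user touched, oldest first."""
    return sorted((e for e in log if e.user == user), key=lambda e: e.at)


def by_asset(log, asset):
    """Everything that was ever done to one asset, oldest first."""
    return sorted((e for e in log if e.asset == asset), key=lambda e: e.at)


# Made-up events for the example.
log = [
    AccessEvent("sarah", "johnson_file", "update", datetime(2016, 2, 3, 14, 30)),
    AccessEvent("terry", "sales_warehouse", "query", datetime(2016, 2, 4, 11, 15)),
    AccessEvent("terry", "johnson_file", "read", datetime(2016, 2, 2, 9, 0)),
]

for event in by_user(log, "terry"):
    print(event.at, event.asset, event.action)

for event in by_asset(log, "johnson_file"):
    print(event.at, event.user, event.action)
```

The same log answers both “what did Terry access?” and “what happened to the Johnson file?”, which is the property Terry is asking for. Its value depends entirely on every access actually being recorded.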

So once you have a high level of confidence that you know exactly what Terry did, there’s one final question. Should he have done that? Was that normal for Terry? Does Terry normally do what we see he did here? Was that data something that Terry normally accesses? Or is that something that maybe only an application or somebody else accesses, and all of a sudden Terry accessed it for the first time last Thursday for half an hour? That all points to analytics or machine learning, crunching through all of that other data we’ve gone through.
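The simplest version of “was that normal for Terry?” is a per-user baseline. The sketch below is an assumption-laden illustration, not the analytics or machine learning Terry refers to: it only flags the first time a user touches an asset they have never accessed before. The `build_baseline` and `first_time_accesses` helpers and the sample data are hypothetical.

```python
from collections import defaultdict


def build_baseline(history):
    """Map each user to the set of assets they have accessed in the past."""
    baseline = defaultdict(set)
    for user, asset in history:
        baseline[user].add(asset)
    return baseline


def first_time_accesses(baseline, new_events):
    """Return new (user, asset) pairs that do not appear in the user's baseline."""
    return [
        (user, asset)
        for user, asset in new_events
        if asset not in baseline.get(user, set())
    ]


# Made-up history and new events for the example.
history = [
    ("terry", "sales_warehouse"),
    ("terry", "johnson_file"),
    ("billing_app", "payroll_db"),
]
new_events = [
    ("terry", "sales_warehouse"),  # routine for Terry
    ("terry", "payroll_db"),       # normally only the application touches this
]

baseline = build_baseline(history)
print(first_time_accesses(baseline, new_events))  # prints [('terry', 'payroll_db')]
```

A flag like this does not say the access was wrong, only that it deserves the “should he have done that?” question.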

Knowing What You Have

So to me, that’s the major difference between data security and the rest of the elements of cybersecurity in general. Data security is a very siloed pillar of cybersecurity that the other pillars don’t touch. Endpoint security doesn’t do any of that. Network security doesn’t do any of that either. So there’s really a distinction between them, and you could argue that if you get data security right, to the extent of being able to answer all five questions, it’s the most fundamental part of the security puzzle.

Once you can answer those five questions – What is your data? Where is your data? Where might it be besides where you know it is? Who’s doing what? And should they be allowed to do it? – you’ve got a pretty good idea what’s happening to the crown jewels of your data.


In Part 2 of this article, we’ll tackle the practical methods by which companies can protect their data jewels, once they know all there is to know about them.