Everything you need to know about pseudonymization

It's the secret weapon of smart companies that rely on data to turbocharge their business.

5 September 2018

Soumik Roy

@soumikroy

soumik@hybrid.co

No matter what data you collect, if it’s personal, there will be strings attached. Source: AFP PHOTO / Daniel LEAL-OLIVAS

Data is the lifeblood of most organizations today. It’s what powers even the simplest of processes, be it sales, operations, or even administration.

Companies have become masters of data collection over time, picking up bits and bytes from every interaction with a customer, at every stage of the customer lifecycle.

What is pseudonymization?

The GDPR, like other data privacy regulations across the world, emphasizes the protection of personal data. In other words, if you’ve got information relating to an identified or identifiable natural person, whether directly or indirectly, you’re supposed to comply with the GDPR.

Data that cannot be tied back to a natural person, however, is known as anonymized data and is exempt from the GDPR.

Between identifiable and anonymized data, there’s another segment: data from which certain parts have been removed in order to hide the identity, but which can still be tied back to the person with a little effort, typically by using additional information that is stored separately. The process that produces such data is called pseudonymization.

What does pseudonymized data look like?

In order to understand how pseudonymized data can help businesses, it’s important to understand what pseudonymized data looks like, so here are a few examples:

Imagine a grocery store on a normal day: there are hundreds or even thousands of transactions, with customers swiping payment cards and loyalty cards at the checkout counter every few minutes.

The store collects data about each customer: their name, their food preferences, their purchasing power, and so on.

Now, if this data is to be pseudonymized, the grocer simply needs to refer to each customer by their customer ID instead of their name — and make sure that the table that matches customer IDs and names isn’t accessible to all.
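As a rough illustration of the grocery example, the sketch below swaps each customer name for a random customer ID and returns the ID-to-name table separately, so that it can be stored under tighter access control. The field names, the sample records, and the pseudonymize function are hypothetical and are not taken from any real system.

```python
import uuid


def pseudonymize(transactions):
    """Split raw records into pseudonymized rows and a separate ID-to-name table."""
    id_by_name = {}    # working index used only while processing
    lookup_table = {}  # customer ID -> name; keep this under restricted access
    pseudonymized = []
    for record in transactions:
        name = record["name"]
        if name not in id_by_name:
            customer_id = uuid.uuid4().hex  # random ID, not derived from the name
            id_by_name[name] = customer_id
            lookup_table[customer_id] = name
        # Copy every field except the name, then attach the customer ID.
        row = {key: value for key, value in record.items() if key != "name"}
        row["customer_id"] = id_by_name[name]
        pseudonymized.append(row)
    return pseudonymized, lookup_table


# Hypothetical sample data for illustration only.
transactions = [
    {"name": "Alice Example", "item": "olive oil", "amount": 12.50},
    {"name": "Bob Example", "item": "rice", "amount": 4.20},
    {"name": "Alice Example", "item": "pasta", "amount": 3.10},
]
rows, lookup = pseudonymize(transactions)
```

Analysts would work only with rows, where repeat purchases by the same person still share one customer ID, while lookup would be held somewhere far fewer people can reach.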

Another example would be a gas station. If you prefer to stick to one provider, you’ll buy gas from different points along your journey (or from the same few points around your home and office). Either way, the company will know its customers’ names, routes, and consumption patterns.

In order to pseudonymize this data, all that the gas company needs to do is make sure the customer names are hidden away in a different table and that each purchase is attributed to an encrypted user identifier instead.
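The article does not say how the gas company would produce such an identifier. One possible approach, shown here purely as an assumption, is a keyed hash (HMAC-SHA256) of the customer name. Strictly speaking this is hashing rather than encryption, and the key, the field names, and the pseudonym_for helper are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical placeholder; a real key would come from a secrets manager.
SECRET_KEY = b"replace-with-a-key-from-a-secrets-manager"


def pseudonym_for(customer_name: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable pseudonym for a customer, derived with a secret key."""
    # Normalize so that spacing and capitalization do not change the pseudonym.
    normalized = customer_name.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


# Hypothetical sample data for illustration only.
purchases = [
    {"name": "Alice Example", "station": "Station A", "litres": 40.0},
    {"name": "alice example ", "station": "Station B", "litres": 35.5},
    {"name": "Bob Example", "station": "Station A", "litres": 50.0},
]

# Each purchase keeps its route and consumption data but loses the name.
pseudonymized_purchases = [
    {
        "customer": pseudonym_for(p["name"]),
        "station": p["station"],
        "litres": p["litres"],
    }
    for p in purchases
]
```

Because the same name always maps to the same pseudonym, routes and consumption patterns can still be analyzed per customer. Anyone holding the key could recompute the mapping from a list of names, so the key needs the same protection as the table of names.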

How does pseudonymization help?

When you understand what pseudonymized data is, it doesn’t take long to understand how it helps businesses.

The fact is, most insights and intelligence algorithms don’t need personal data. All they need is current, real data about customers and their spending habits, about stakeholders and their choices, and about different demographics and their preferences.

The only thing that pseudonymized data doesn’t allow businesses to do is to provide deeply personalized experiences. However, there are a few smart workarounds for that as well.

For example, if you’re a grocer and you want to provide personalized experiences to your customers, use the data to form user personas and map product choices to certain personas. This way, if someone likes purchasing imported products from a certain country, you can personalize their experience without knowing much about them.
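The persona idea can be sketched with nothing more than customer IDs and product categories. The example below is a minimal illustration under assumed names: the categories, the persona labels, and the rule of picking each customer's most frequent category are all hypothetical simplifications.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from product category to persona label.
PERSONA_BY_CATEGORY = {
    "imported_italian": "Italian food enthusiast",
    "organic": "Organic shopper",
    "budget": "Value seeker",
}


def assign_personas(purchases):
    """Label each customer ID with the persona of its most frequent category."""
    counts = defaultdict(Counter)
    for purchase in purchases:
        counts[purchase["customer_id"]][purchase["category"]] += 1

    personas = {}
    for customer_id, category_counts in counts.items():
        top_category, _ = category_counts.most_common(1)[0]
        # Fall back to a generic persona for categories without a mapping.
        personas[customer_id] = PERSONA_BY_CATEGORY.get(top_category, "General shopper")
    return personas


# Hypothetical pseudonymized purchases: no names, only customer IDs.
purchases = [
    {"customer_id": "c-001", "category": "imported_italian"},
    {"customer_id": "c-001", "category": "imported_italian"},
    {"customer_id": "c-001", "category": "budget"},
    {"customer_id": "c-002", "category": "organic"},
]
personas = assign_personas(purchases)
```

Offers can then be targeted at a persona rather than at a named individual, which is the workaround the grocer example describes.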

Is pseudonymization legal?

Well, it depends on how you’re processing your data, but under normal circumstances, pseudonymization is legal. Of course, before proceeding, businesses must seek specific legal advice.

The fact that pseudonymization is defined under the GDPR actually makes it easier for companies to understand how to handle and manage data that has been pseudonymized.

Care needs to be taken, but so long as businesses are cautious about who has access to the master table that links the pseudonymized data back to individual people, data pseudonymization is a good option for those looking to continue using data to transform their business without breaching any data privacy laws.