GENERATIVE AI

Data management specialist warns against data dangers of generative AI

It's 10pm in generative America. Do you know where your data is?

26 May 2023

Sam Altman said generative AI will need regulation.

Tony Fyler

@more__hybrid

fyler@hybrid.co

All stories

Open AI boss Sam Altman, telling Congress that generative AI would need regulation. Source: ANDREW CABALLERO-REYNOLDS / AFP

• The rush to add generative AI to products brings data risks.
• Safeguards and regulations could be slow to rein the technology in.
• The potential for bias in the data intensifies over time.

Generative AI has been taking the world roughly by storm since Microsoft and OpenAI launched ChatGPT in November, 2022. There is absolutely no doubt that it represents a transformation in the way the world works. But there has also been significant backlash, particularly on training data and data usage. The backlash culminated in the Biden administration demanding the generative AI industry implement standards and responsibilities for those using the tech – or have them imposed by government.

Generative AI: everybody’s favorite new toy.

THQ:

That’s the thing, though, isn’t it? Almost as soon as it was released, almost everybody found ways of using it somehow.

KS:
And that’s the danger. Especially in America, where you’re rewarded for taking risks and for moving fast. But without any regulation, companies can move fast, companies can innovate, so there’s no real downside to adding ChatGPT to any business model from a corporate perspective. Yet. And I think that’s the challenge.

YOU MIGHT LIKE

GENERATIVE AI

White House underscores responsibilities of the generative AI industry

THQ:
We saw the meeting of the kings of generative AI at the White House, the takeaway from which seemed to be that if you’re going to make billions of dollars out of this technology, responsibilities need to come along with the rewards.

KS:
And of course, that was followed by Sam Altman of OpenAI talking to Congress and agreeing that regulation was needed, and then hearings in the Senate around AI.

The good news is that I think the open letter that a lot of executives sent to Congress saying that the government needs to step in has had an impact and people are looking into it. I think the problem might be that in general, regulations tend to follow after you’ve seen one or two bad things happen. And in the case of AI, it may be too late if it takes us that long to get regulations.

Generative AI: a rapidly evolving genie?

THQ:
There’s no real way of putting this particular genie back in any kind of bottle. It’s just how we contain it somehow so that we know what it’s about to do, right?

KS:
Right. But we have contained other things before. We contained nuclear weapons. We knew how to contain atomic bombs. We know how to contain drug research that could be potentially dangerous if done the wrong way.

So the idea of regulating something that can have a great impact is not a new concept. It’s just that the regulation has to happen very quickly. What’s new is the pace at which this is moving.

THQ:
Absolutely – after all, just weeks ago, the tech giants were the news, with their generative AI models. Then open-source coders got their hands on LLaMA, and now suddenly they’re the news, with their smaller, more agile, more function-specific generative AI models.

Six months from now, you can place your bets what’ll be changing the world in this area. So how is a regulatory regime that takes a year to create supposed to still be relevant, when technology has outpaced it two or three times in the time it’s taken to draw up?

“I’m afraid I can’t do that, Dave…”

KS:
Exactly. Part of the point, part of how generative AI can give us great benefits, is that it does things much faster than humans can do. So it’s very powerful, but also there’s a lot of risk because people don’t understand what it does and how it does it.

Especially, there’s a lot of risk around the data that generative AI is based on, because at the end of the day, it’s still a machine that’s learning from patterns and it’s gleaning those patterns from the data that you give it. People forget that. Because it sounds so human, we attribute it with human qualities, but it’s really machine learning at the end of the day.

THQ:
That’s the whole issue with China’s condemnation of the technology, isn’t it? They don’t want to have any version of it in the country that isn’t taught on solidly socialistic principles. So what you’ll get when you ask it things is solidly socialistic answers, that may not necessarily reflect “the truth,” as we see it in the West.

KS:
Exactly. That notion of bias in the data is very prevalent. And because it’s generative, because it’s generating new content, you tend to think it is also innovating and changing mindsets. But it’s not. It’s actually perpetuating the bias that already exists in the data that you fed it. It’s generating more and more of the same kind of data.

THQ:

Leading to a funnel effect over time, where you continue to narrow the focus of the results you get?

Generative AI: a powerful toddler?

THQ:
Like a toddler…

KS:
And it’s doing that on its own, through a neural network of some kind. And because it is doing that, we’re applying it to general domains like natural language or image management and things like that.

That is where we really need regulation, because when you apply something in the general domain, it has tremendous potential to deliver both positive and negative impacts.

And when you have something that’s not deterministic, you don’t know what it’s going to do.

THQ:
And in that short sentence, you just described about half of the science fiction dystopian short stories that have been written in the last six months or so.

You don’t know what it’s going to do – and yet it’s almost everywhere, being put to a whole variety of uses, underpinning businesses across the world, using enormous quantities of company data in ways that are potentially obscure.

KS:
That’s why data management is going to be integral to the future of generative AI, and why we need some data safeguards in place.

In Part 2 of this article, we’ll explore the potential data issues that using generative AI opens you up to – and how, in practical terms, you can guard against them, while regulations are mulled by the powers-that-be.